Papers with the keyword Bioinformatics python (Page 2)

#21

JOURNAL ARTICLE

Infrared: a declarative tree decomposition-powered framework for bioinformatics.

Hua-Ting Yao, Bertrand Marchand, Sarah J Berkemer, Yann Ponty, Sebastian Will

MOTIVATION: Many bioinformatics problems can be approached as optimization or controlled sampling tasks, and solved exactly and efficiently using Dynamic Programming (DP). However, such exact methods are typically tailored towards specific settings, complex to develop, and hard to implement and adapt to problem variations. METHODS: We introduce the Infrared framework to overcome such hindrances for a large class of problems. Its underlying paradigm is tailored toward problems that can be declaratively formalized as sparse feature networks, a generalization of constraint networks...

38493130

March 16, 2024: Algorithms for Molecular Biology: AMB

#22

JOURNAL ARTICLE

PyF2F: a robust and simplified fluorophore-to-fluorophore distance measurement tool for Protein interactions from Imaging Complexes after Translocation experiments.

Altair C Hernandez, Sebastian Ortiz, Laura I Betancur, Radovan Dojčilović, Andrea Picco, Marko Kaksonen, Baldo Oliva, Oriol Gallego

Structural knowledge of protein assemblies in their physiological environment is paramount to understand cellular functions at the molecular level. Protein interactions from Imaging Complexes after Translocation (PICT) is a live-cell imaging technique for the structural characterization of macromolecular assemblies in living cells. PICT relies on the measurement of the separation between labelled molecules using fluorescence microscopy and cell engineering. Unfortunately, the required computational tools to extract molecular distances involve a variety of sophisticated software programs that challenge reproducibility and limit their implementation to highly specialized researchers...

38486885

March 2024: NAR genomics and bioinformatics

#23

REVIEW

A comprehensive performance evaluation, comparison, and integration of computational methods for detecting and estimating cross-contamination of human samples in cancer next-generation sequencing analysis.

Huijuan Chen, Bing Wang, Lili Cai, Xiaotian Yang, Yali Hu, Yiran Zhang, Xue Leng, Wen Liu, Dongjie Fan, Beifang Niu, Qiming Zhou

Cross-sample contamination is one of the major issues in next-generation sequencing (NGS)-based molecular assays. This type of contamination, even at very low levels, can significantly impact the results of an analysis, especially in the detection of somatic alterations in tumor samples. Several contamination identification tools have been developed and implemented as a crucial quality-control step in the routine NGS bioinformatic pipeline. However, no study has been published to comprehensively and systematically investigate, evaluate, and compare these computational methods in the cancer NGS analysis...

38479675

March 12, 2024: Journal of Biomedical Informatics

#24

JOURNAL ARTICLE

scPathoQuant: A tool for efficient alignment and quantification of pathogen sequence reads from 10x single cell sequencing data sets.

Leanne S Whitmore, Jennifer Tisoncik-Go, Michael Gale

MOTIVATION: Currently there is a lack of efficient computational pipelines/tools for conducting simultaneous genome mapping of pathogen-derived and host reads from single cell RNA sequencing (scRNAseq) output from pathogen-infected cells. Contemporary options include processes involving multiple steps and/or running multiple computational tools, increasing user operations time. RESULTS: To address the need for new tools to directly map and quantify pathogen and host sequence reads from within an infected cell from scRNAseq data sets in a single operation, we have built a python package, called scPathoQuant...

38478395

March 13, 2024: Bioinformatics

#25

JOURNAL ARTICLE

Digger: Directed annotation of immunoglobulin and T cell receptor V, D and J gene sequences and assemblies.

William D Lees, Swati Saha, Gur Yaari, Corey T Watson

SUMMARY: Knowledge of immunoglobulin and T cell receptor encoding genes is derived from high-quality genomic sequencing. High throughput sequencing is delivering large volumes of data, and precise, high-throughput approaches to annotation are needed. Digger is an automated tool that identifies coding and regulatory regions of these genes, with results comparable to those obtained by current expert curational methods. AVAILABILITY AND IMPLEMENTATION: Digger is published under open source licence at https://github...

38478393

March 13, 2024: Bioinformatics

#26

JOURNAL ARTICLE

PyComplexHeatmap: a Python package to visualize multimodal genomics data.

Wubin Ding, David Goldberg, Wanding Zhou

Python has emerged as a robust programming language increasingly employed in genomics data analysis, largely due to its comprehensive deep learning libraries and proficiency in handling large-scale data, such as single-cell multi-omics datasets. Although Python has become a prominent data science ecosystem for bioinformatics, there remains a growing demand for advanced heatmap visualization and assembly tools, which are not sufficiently addressed by existing Python-based data visualization libraries. We present PyComplexHeatmap, an all-inclusive Python library for heatmap visualization, inspired by the ComplexHeatmap package currently available in R...

38454967

August 2023: Imeta

#27

JOURNAL ARTICLE

Shu: Visualization of high dimensional biological pathways.

Jorge Carrasco Muriel, Nicholas Cowie, Shannara Taylor Parkins, Marjan Mansouvar, Teddy Groves, Lars Keld Nielsen

SUMMARY: Shu is a visualization tool that integrates diverse data types into a metabolic map, with a focus on supporting multiple conditions and visualizing distributions. The goal is to provide a unified platform for handling the growing volume of multi-omics data, leveraging the metabolic maps developed by the metabolic modeling community. Additionally, shu offers a streamlined python API, based on the Grammar of Graphics, for easy integration with data pipelines. AVAILABILITY AND IMPLEMENTATION: Freely available at https://github...

38452346

March 7, 2024: Bioinformatics

#28

JOURNAL ARTICLE

The tsRNAs (tRFdb-3013a/b) serve as novel biomarkers for colon adenocarcinomas.

Lihong Tan, Xiaoling Wu, Zhurong Tang, Huan Chen, Weiguo Cao, Chunjie Wen, Guojun Zou, Hecun Zou

The tsRNAs (tRNA-derived small RNAs) are a novel class of small non-coding RNAs derived from transfer-RNAs. Colon adenocarcinoma (COAD) is the most malignant intestinal tumor. This study focused on the identification and characterization of tsRNA biomarkers in colon adenocarcinomas. Data processing and bioinformatic analyses were performed with the packages of R and Python software. The cell proliferation, migration and invasion abilities were determined by CCK-8 and transwell assays. Luciferase reporter assay was used to test the binding of tsRNA with its target genes...

38451187

March 6, 2024: Aging

#29

JOURNAL ARTICLE

pyM2aia: Python interface for mass spectrometry imaging with focus on deep learning.

Jonas Cordes, Thomas Enzlein, Carsten Hopf, Ivo Wolf

SUMMARY: Python is the most commonly used language for deep learning (DL). Existing Python packages for mass spectrometry imaging (MSI) data are not optimized for DL tasks. We, therefore, introduce pyM2aia, a Python package for MSI data analysis with a focus on memory-efficient handling, processing and convenient data-access for DL applications. pyM2aia provides interfaces to its parent application M2aia, which offers interactive capabilities for exploring and annotating MSI data in imzML format...

38445753

March 4, 2024: Bioinformatics

#30

JOURNAL ARTICLE

NPSV-deep: a deep learning method for genotyping structural variants in short read genome sequencing data.

Michael D Linderman, Jacob Wallace, Alderik van der Heyde, Eliza Wieman, Daniel Brey, Yiran Shi, Peter Hansen, Zahra Shamsi, Jeremiah Liu, Bruce D Gelb, Ali Bashir

MOTIVATION: Structural variants (SV) play a causal role in numerous diseases but can be difficult to detect and accurately genotype (determine zygosity) with short-read genome sequencing data (SRS). Improving SV genotyping accuracy in SRS data, particularly for the many SVs first detected with long-read sequencing, will improve our understanding of genetic variation. RESULTS: NPSV-deep is a deep learning-based approach for genotyping previously reported insertion and deletion SVs that recasts this task as an image similarity problem...

38444093

March 5, 2024: Bioinformatics

#31

JOURNAL ARTICLE

BERMAD: batch effect removal for single-cell RNA-seq data using a multi-layer adaptation autoencoder with dual-channel framework.

Xiangxin Zhan, Yanbin Yin, Han Zhang

MOTIVATION: Removal of batch effects between multiple datasets from different experimental platforms has become an urgent problem as single-cell RNA sequencing (scRNA-seq) techniques have developed rapidly. Although some methods address this problem, most of them still face the challenge of under-correction or over-correction. Specifically, handling batch effects in highly nonlinear scRNA-seq data requires a more powerful model to address under-correction. Meanwhile, some previous methods focus too much on removing differences between batches, which may disturb the biological signal heterogeneity of datasets generated from different experiments, thereby leading to over-correction...

38439545

March 4, 2024: Bioinformatics

#32

JOURNAL ARTICLE

MetaCerberus: distributed highly parallelized HMM-based processing for robust functional annotation across the tree of life.

Jose L Figueroa, Eliza Dhungel, Madeline Bellanger, Cory Brouwer, Richard Allen White

MOTIVATION: MetaCerberus is a massively parallel, fast, low memory, scalable annotation tool for inferring gene function across genomes to metacommunities. MetaCerberus provides an elusive HMM/HMMER-based tool at a rapid scale with low memory. It offers scalable gene elucidation to major public databases, including KEGG (KO), COGs, CAZy, FOAM, and specific databases for viruses, including VOGs and PHROGs, from single genomes to metacommunities. RESULTS: MetaCerberus is 1...

38426351

February 29, 2024: Bioinformatics

#33

Common data models to streamline metabolomics processing and annotation, and implementation in a Python pipeline.

Joshua M Mitchell, Yuanye Chi, Maheshwor Thapa, Zhiqiang Pang, Jianguo Xia, Shuzhao Li

UNLABELLED: To standardize metabolomics data analysis and facilitate future computational developments, it is essential to have a set of well-defined templates for common data structures. Here we describe a collection of data structures involved in metabolomics data processing and illustrate how they are utilized in a full-featured Python-centric pipeline. We demonstrate the performance of the pipeline, and the details in annotation and quality control using large-scale LC-MS metabolomics and lipidomics data and LC-MS/MS data...

38405981

February 14, 2024: bioRxiv

#34

JOURNAL ARTICLE

Bioframe: Operations on genomic intervals in pandas dataframes.

Nezar Abdennur, Geoffrey Fudenberg, Ilya M Flyamer, Aleksandra A Galitsyna, Anton Goloborodko, Maxim Imakaev, Sergey Venev

MOTIVATION: Genomic intervals are one of the most prevalent data structures in computational genome biology, and used to represent features ranging from genes, to DNA binding sites, to disease variants. Operations on genomic intervals provide a language for asking questions about relationships between features. While there are excellent interval arithmetic tools for the command line, they are not smoothly integrated into Python, one of the most popular general-purpose computational and visualization environments...

38402507

February 24, 2024: Bioinformatics
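
The kind of interval operation the Bioframe abstract refers to can be illustrated with a minimal sketch in plain Python. This is not Bioframe's API: the names `Interval`, `overlaps`, and `overlap_pairs`, the example coordinates, and the half-open `[start, end)` convention are all assumptions made for illustration only.

```python
from typing import List, Tuple

# Hypothetical representation: (chromosome, start, end), half-open [start, end).
Interval = Tuple[str, int, int]


def overlaps(a: Interval, b: Interval) -> bool:
    """Return True if two intervals share a chromosome and at least one base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def overlap_pairs(left: List[Interval], right: List[Interval]) -> List[Tuple[int, int]]:
    """Return index pairs (i, j) where left[i] overlaps right[j].

    Naive O(n * m) scan, chosen for clarity rather than speed.
    """
    return [
        (i, j)
        for i, a in enumerate(left)
        for j, b in enumerate(right)
        if overlaps(a, b)
    ]


# Made-up example data: "genes" and "binding sites".
genes = [("chr1", 100, 200), ("chr1", 500, 800), ("chr2", 100, 200)]
sites = [("chr1", 150, 160), ("chr1", 200, 250), ("chr2", 50, 101)]

pairs = overlap_pairs(genes, sites)
print(pairs)
```

Under the half-open convention assumed here, an interval that starts exactly where another ends does not count as overlapping, so the site at `chr1:200-250` is not paired with the gene at `chr1:100-200`.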

#35

JOURNAL ARTICLE

Preon: Fast and accurate entity normalization for drug names and cancer types in precision oncology.

Arik Ermshaus, Michael Piechotta, Gina Rüter, Ulrich Keilholz, Ulf Leser, Manuela Benary

MOTIVATION: In precision oncology, clinicians aim to find the best treatment for any patient based on their molecular characterization. A major bottleneck is the manual annotation and evaluation of individual variants, for which usually a range of knowledge bases are screened. To incorporate and integrate the vast information of different databases, fast and accurate methods for harmonizing databases with different types of information are necessary. An essential step for harmonization in precision oncology includes the normalization of tumor entities as well as therapy options for patients...

38383060

February 21, 2024: Bioinformatics

#36

JOURNAL ARTICLE

Fast and scalable querying of eukaryotic linear motifs with gget elm.

Laura Luebbert, Chi Hoang, Manjeet Kumar, Lior Pachter

MOTIVATION: Eukaryotic linear motifs (ELMs), or Short Linear Motifs (SLiMs), are protein interaction modules that play an essential role in cellular processes and signaling networks and are often involved in diseases like cancer. The ELM database is a collection of manually curated motif knowledge from scientific papers. It has become a crucial resource for investigating motif biology and recognizing candidate ELMs in novel amino acid sequences. Users can search amino acid sequences or UniProt Accessions on the ELM resource web interface...

38377393

February 20, 2024: Bioinformatics

#37

JOURNAL ARTICLE

scEVOLVE: cell-type incremental annotation without forgetting for single-cell RNA-seq data.

Yuyao Zhai, Liang Chen, Minghua Deng

The evolution in single-cell RNA sequencing (scRNA-seq) technology has opened a new avenue for researchers to inspect cellular heterogeneity with single-cell precision. One crucial aspect of this technology is cell-type annotation, which is fundamental for any subsequent analysis in single-cell data mining. Recently, the scientific community has seen a surge in the development of automatic annotation methods aimed at this task. However, these methods generally operate at a steady-state total cell-type capacity, significantly restricting the cell annotation systems' capacity for continuous knowledge acquisition...

38366803

January 22, 2024: Briefings in Bioinformatics

#38

JOURNAL ARTICLE

Charge cluster occurrence in land plants' mitochondrial proteomes with functional and structural insights.

Imen Ayadi, Syrine Nebli, Riadh Ben Marzoug, Ahmed Rebai

The Charge Clusters (CCs) are involved in key functions and are distributed according to the organism, the protein's type, and the charge of amino acids. In the present study, we have explored the occurrence, position, and annotation of CCs in land plants' mitochondrial proteomes, as a first large-scale study. A new python script was used for data curation. The Finding Clusters Charge in Protein Sequences Program was run after adjusting the reading window size. A total of 44316 protein sequences belonging to 52 species of land plants were analysed...

38345014

February 12, 2024: Journal of Biomolecular Structure & Dynamics

#39

JOURNAL ARTICLE

PB-LKS: a python package for predicting phage-bacteria interaction through local K-mer strategy.

Jingxuan Qiu, Wanchun Nie, Hao Ding, Jia Dai, Yiwen Wei, Dezhi Li, Yuxi Zhang, Junting Xie, Xinxin Tian, Nannan Wu, Tianyi Qiu

Bacteriophages can help treat bacterial infections yet require in-silico models to deal with the great genetic diversity between phages and bacteria. Despite their tolerable prediction performance, the application scope of current approaches is limited to prediction at the species level, so they cannot accurately predict the relationship of phages across strain mutants. This has hindered the development of phage therapeutics based on the prediction of phage-bacteria relationships. In this paper, we present PB-LKS to predict the phage-bacteria interaction based on local K-mer strategy with higher performance and wider applicability...

38344864

January 22, 2024: Briefings in Bioinformatics
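
The PB-LKS abstract does not spell out its local K-mer strategy, so the following is only a generic sketch of counting k-mers within sliding windows of a sequence. It is not the PB-LKS method or its API: the function names, the window and step parameters, and the toy sequence are all hypothetical.

```python
from collections import Counter
from typing import List


def kmer_counts(seq: str, k: int) -> Counter:
    """Count every overlapping k-mer in a sequence (case-insensitive)."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def windowed_kmer_counts(seq: str, k: int, window: int, step: int) -> List[Counter]:
    """Return one k-mer count profile per full-length window.

    Windows start every `step` bases; a trailing partial window is skipped.
    This windowing scheme is an assumption, not taken from the paper.
    """
    return [
        kmer_counts(seq[start:start + window], k)
        for start in range(0, len(seq) - window + 1, step)
    ]


# Toy sequence, invented for illustration.
profiles = windowed_kmer_counts("ACGTACGTAC", k=2, window=6, step=4)
for profile in profiles:
    print(dict(profile))
```

Per-window profiles like these could then be compared between two genomes, but how PB-LKS actually does that is not stated in the truncated abstract.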

#40

JOURNAL ARTICLE

Rapid and cost-effective epitope mapping using PURE ribosome display coupled with next-generation sequencing and bioinformatics.

Beixi Jia, Teruyo Ojima-Kato, Takaaki Kojima, Hideo Nakano

A novel, efficient and cost-effective approach for epitope identification of an antibody has been developed using a ribosome display platform. This platform, known as PURE ribosome display, utilizes an Escherichia coli-based reconstituted cell-free protein synthesis system (PURE system). It stabilizes the mRNA-ribosome-peptide complex via a ribosome-arrest peptide sequence. This system was complemented by next-generation sequencing (NGS) and an algorithm for analyzing binding epitopes. To showcase the effectiveness of this method, selection conditions were refined using the anti-PA tag monoclonal antibody with the PA tag peptide as a model...

38342664

February 10, 2024: Journal of Bioscience and Bioengineering
