Most recent papers in the journal Bioinformatics

#1

JOURNAL ARTICLE

AmplificationTimeR: An R Package for Timing Sequential Amplification Events.

G Maria Jakobsdottir, Stefan C Dentro, Robert G Bristow, David C Wedge

MOTIVATION: Few methods exist for timing individual amplification events in regions of focal amplification. Current methods are also limited in the copy number states that they are able to time. Here we introduce AmplificationTimeR, a method for timing higher level copy number gains and inferring the most parsimonious order of events for regions that have undergone both single gains and whole genome duplication. Our method is an extension of established approaches for timing genomic gains...

38656989

April 24, 2024: Bioinformatics

#2

JOURNAL ARTICLE

MammalMethylClock R package: Software for DNA Methylation-Based epigenetic clocks in mammals.

Joseph Zoller, Steve Horvath

MOTIVATION: Epigenetic clocks are prediction methods based on DNA methylation levels in a given species or set of species. Defined as multivariate regression models, these DNA methylation-based biomarkers of age or mortality risk are useful in species conservation efforts and in preclinical studies. RESULTS: We present an R package called MammalMethylClock for the construction, assessment, and application of epigenetic clocks in different mammalian species. The R package includes the utility for implementing pre-existing mammalian clocks from the Mammalian Methylation Consortium...

38656974

April 24, 2024: Bioinformatics

#3

JOURNAL ARTICLE

DIMet: An open-source tool for Differential analysis of targeted Isotope-labeled Metabolomics data.

Johanna Galvis, Joris Guyon, Benjamin Dartigues, Helge Hecht, Björn Grüning, Florian Specque, Hayssam Soueidan, Slim Karkar, Thomas Daubon, Macha Nikolski

MOTIVATION: Many diseases, such as cancer, are characterized by an alteration of cellular metabolism allowing cells to adapt to changes in the microenvironment. Stable isotope-resolved metabolomics and downstream data analyses are widely used techniques for unraveling cells' metabolic activity to understand the altered functioning of metabolic pathways in the diseased state. While a number of bioinformatic solutions exist for the differential analysis of Stable Isotope-Resolved Metabolomics data, there is currently no available resource providing a comprehensive toolbox...

38656970

April 24, 2024: Bioinformatics

#4

JOURNAL ARTICLE

For antibody sequence generative modeling, mixture models may be all you need.

Jonathan Parkinson, Wei Wang

MOTIVATION: Antibody therapeutic candidates must exhibit not only tight binding to their target but also good developability properties, especially low risk of immunogenicity. RESULTS: In this work, we fit a simple generative model, SAM, to sixty million human heavy and seventy million human light chains. We show that the probability of a sequence calculated by the model distinguishes human sequences from other species with the same or better accuracy on a variety of benchmark datasets containing >400 million sequences than any other model in the literature, outperforming large language models (LLMs) by large margins...

38652603

April 23, 2024: Bioinformatics

#5

JOURNAL ARTICLE

Large-scale Structure-Informed multiple sequence alignment of proteins with SIMSApiper.

Charlotte Crauwels, Sophie-Luise Heidig, Adrián Díaz, Wim F Vranken

SUMMARY: SIMSApiper is a Nextflow pipeline that creates reliable, structure-informed MSAs of thousands of protein sequences in time-frames faster than standard structure-based alignment methods. Structural information can be provided by the user or collected by the pipeline from online resources. Parallelization with sequence identity based subsets can be activated to significantly speed up the alignment process. Finally, the number of gaps in the final alignment can be reduced by leveraging the position of conserved secondary structure elements...

38648741

April 22, 2024: Bioinformatics

#6

JOURNAL ARTICLE

Revisiting Drug-Protein Interaction Prediction: A Novel Global-Local Perspective.

Zhecheng Zhou, Qingquan Liao, Jinhang Wei, Linlin Zhuo, Xiaonan Wu, Xiangzheng Fu, Quan Zou

MOTIVATION: Accurate inference of potential drug-protein interactions (DPIs) aids in understanding drug mechanisms and developing novel treatments. Existing deep learning models, however, struggle with accurate node representation in DPI prediction, limiting their performance. RESULTS: We propose a new computational framework that integrates global and local features of nodes in the drug-protein bipartite graph for efficient DPI inference. Initially, we employ pre-trained models to acquire fundamental knowledge of drugs and proteins and to determine their initial features...

38648052

April 22, 2024: Bioinformatics

#7

JOURNAL ARTICLE

GradHC: Highly Reliable Gradual Hash-based Clustering for DNA Storage Systems.

Dvir Ben Shabat, Adar Hadad, Avital Boruchovsky, Eitan Yaakobi

MOTIVATION: As data storage challenges grow and existing technologies approach their limits, synthetic DNA emerges as a promising storage solution due to its remarkable density and durability advantages. While cost remains a concern, emerging sequencing and synthetic technologies aim to mitigate it, yet introduce challenges such as errors in the storage and retrieval process. One crucial task in a DNA storage system is clustering numerous DNA reads into groups that represent the original input strands...

38648049

April 22, 2024: Bioinformatics

#8

JOURNAL ARTICLE

Integrative annotation scores of variants for impact on RNA binding protein activities.

Jingqi Duan, Audrey P Gasch, Sündüz Keleş

MOTIVATION: The ENCODE project generated a large collection of eCLIP-seq RNA binding protein (RBP) profiling data with accompanying RNA-seq transcriptomes of shRNA knockdown of RBPs. These data could have utility in understanding the functional impact of genetic variants; however, their potential has not been fully exploited. We implement INCA (Integrative annotation scores of variants for impact on RBP activities) as a multi-step genetic variant scoring approach that leverages the ENCODE RBP data together with ClinVar and integrates multiple computational approaches to aggregate evidence...

38640488

April 18, 2024: Bioinformatics

#9

JOURNAL ARTICLE

ITree: a user-driven tool for interactive decision-making with classification trees.

Hubert Sokołowski, Marcin Czajkowski, Anna Czajkowska, Krzysztof Jurczuk, Marek Kretowski

MOTIVATION: ITree is an intuitive web tool for the manual, semi-automatic, and automatic induction of decision trees. It enables interactive modifications of tree structures and incorporates Relative Expression Analysis for detecting complex patterns in high-throughput molecular data. This makes ITree a versatile tool for both research and education in biomedical data analysis. RESULTS: The tool allows users to instantly see the effects of modifications on decision trees, with updates to predictions and statistics displayed in real time, facilitating a deeper understanding of data classification processes...

38640482

April 18, 2024: Bioinformatics

#10

JOURNAL ARTICLE

MEG-PPIS: a fast protein-protein interaction site prediction method based on multi-scale graph information and equivariant graph neural network.

Hongzhen Ding, Xue Li, Peifu Han, Xu Tian, Fengrui Jing, Shuang Wang, Tao Song, Hanjiao Fu, Na Kang

MOTIVATION: Protein-protein interaction sites (PPIS) are crucial for deciphering protein action mechanisms and related medical research, which is the key issue in protein action research. Recent studies have shown that graph neural networks have achieved outstanding performance in predicting PPIS. However, these studies often neglect the modeling of information at different scales in the graph and the symmetry of protein molecules within three-dimensional space. RESULTS: In response to this gap, this paper proposes the MEG-PPIS approach, a PPIS prediction method based on multi-scale graph information and E(n) equivariant graph neural network (EGNN)...

38640481

April 18, 2024: Bioinformatics

#11

JOURNAL ARTICLE

wgd v2: a suite of tools to uncover and date ancient polyploidy and whole-genome duplication.

Hengchi Chen, Arthur Zwaenepoel, Yves Van de Peer

MOTIVATION: Major improvements in sequencing technologies and genome sequence assembly have led to a huge increase in the number of available genome sequences. In turn, these genome sequences form an invaluable source for evolutionary, ecological, and comparative studies. One kind of analysis that has become routine is the search for traces of ancient polyploidy, particularly for plant genomes, where whole-genome duplication (WGD) is rampant. RESULTS: Here, we present a major update of a previously developed tool wgd, namely wgd v2, to look for remnants of ancient polyploidy, or WGD...

38632086

April 17, 2024: Bioinformatics

#12

JOURNAL ARTICLE

TransGEM: a molecule generation model based on transformer with gene expression data.

Yanguang Liu, Hailong Yu, Xinya Duan, Xiaomin Zhang, Ting Cheng, Feng Jiang, Hao Tang, Yao Ruan, Miao Zhang, Hongyu Zhang, Qingye Zhang

MOTIVATION: It is difficult to generate new molecules with desirable bioactivity through ligand-based de novo drug design, and receptor-based de novo drug design is constrained by disease target information availability. The combination of artificial intelligence and phenotype-based de novo drug design can generate new bioactive molecules, independent from disease target information. Gene expression profiles can be used to characterize biological phenotypes. The Transformer model can be utilized to capture the associations between gene expression profiles and molecular structures due to its remarkable ability in processing contextual information...

38632084

April 17, 2024: Bioinformatics

#13

JOURNAL ARTICLE

Peptide Set Test: a Peptide-Centric Strategy to Infer Differentially Expressed Proteins.

Junmin Wang, Steven Novick

MOTIVATION: The clinical translation of mass spectrometry-based proteomics has been challenging due to limited statistical power caused by large technical variability and inter-patient heterogeneity. Bottom-up proteomics provides an indirect measurement of proteins through digested peptides. This raises the question of whether peptide measurements can be used directly to better distinguish differentially expressed proteins. RESULTS: We present a novel method called the peptide set test, which detects coordinated changes in the expression of peptides originating from the same protein and compares them to the rest of the peptidome...

38632081

April 17, 2024: Bioinformatics

#14

JOURNAL ARTICLE

Efficient cytometry analysis with FlowSOM in python boosts interoperability with other single-cell tools.

Artuur Couckuyt, Benjamin Rombaut, Yvan Saeys, Sofie Van Gassen

MOTIVATION: We describe a new Python implementation of FlowSOM, a clustering method for cytometry data. RESULTS: This implementation is faster than the original version in R, is better adapted to work with single-cell omics data, including integration with current single-cell data structures, and includes all the original visualizations, such as the star and pie plot. AVAILABILITY: The FlowSOM Python implementation is freely available on GitHub: https://github...

38632080

April 17, 2024: Bioinformatics

#15

JOURNAL ARTICLE

GAUSS: a summary-statistics-based R package for accurate estimation of linkage disequilibrium for variants, gaussian imputation and TWAS analysis of cosmopolitan cohorts.

Donghyung Lee, Silviu-Alin Bacanu

MOTIVATION: As the availability of larger and more ethnically diverse reference panels grows, there is an increase in demand for ancestry-informed imputation of genome-wide association studies (GWAS), and other downstream analyses, e.g., fine-mapping. Performing such analyses at the genotype level is computationally challenging and necessitates, at best, a laborious process to access individual-level genotype and phenotype data. Summary-statistics-based tools, not requiring individual-level data, provide an efficient alternative that streamlines computational requirements and promotes open science by simplifying the re-analysis and downstream analysis of existing GWAS summary data...

38632050

April 17, 2024: Bioinformatics

#16

JOURNAL ARTICLE

Topological benchmarking of algorithms to infer Gene Regulatory Networks from Single-Cell RNA-seq Data.

Marco Stock, Niclas Popp, Jonathan Fiorentino, Antonio Scialdone

MOTIVATION: In recent years, many algorithms for inferring gene regulatory networks from single-cell transcriptomic data have been published. Several studies have evaluated their accuracy in estimating the presence of an interaction between pairs of genes. However, these benchmarking analyses do not quantify the algorithms' ability to capture structural properties of networks, which are fundamental, for example, for studying the robustness of a gene network to external perturbations. Here, we devise a three-step benchmarking pipeline called STREAMLINE that quantifies the ability of algorithms to capture topological properties of networks and identify hubs...

38627250

April 16, 2024: Bioinformatics

#17

JOURNAL ARTICLE

AbLEF: Antibody Language Ensemble Fusion for thermodynamically empowered property predictions.

Zachary A Rollins, Talal Widatalla, Andrew Waight, Alan C Cheng, Essam Metwally

MOTIVATION: Pre-trained protein language and/or structural models are often fine-tuned on drug development properties (i.e., developability properties) to accelerate drug discovery initiatives. However, these models generally rely on a single structural conformation and/or a single sequence as a molecular representation. We present a physics-based model whereby 3D conformational ensemble representations are fused by a transformer-based architecture and concatenated to a language representation to predict antibody protein properties...

38627249

April 16, 2024: Bioinformatics

#18

JOURNAL ARTICLE

scPRAM accurately predicts single-cell gene expression perturbation response based on attention mechanism.

Qun Jiang, Shengquan Chen, Xiaoyang Chen, Rui Jiang

MOTIVATION: With the rapid advancement of single-cell sequencing technology, it becomes gradually possible to delve into the cellular responses to various external perturbations at the gene expression level. However, obtaining perturbed samples in certain scenarios may be considerably challenging, and the substantial costs associated with sequencing also curtail the feasibility of large-scale experimentation. A repertoire of methodologies has been employed for forecasting perturbative responses in single-cell gene expression...

38625746

April 15, 2024: Bioinformatics

#19

JOURNAL ARTICLE

NeoAgDT: Optimization of personal neoantigen vaccine composition by digital twin simulation of a cancer cell population.

Anja Mösch, Filippo Grazioli, Pierre Machart, Brandon Malone

MOTIVATION: Neoantigen vaccines make use of tumor-specific mutations to enable the patient's immune system to recognize and eliminate cancer. Selecting vaccine elements, however, is a complex task which needs to take into account not only the underlying antigen presentation pathway but also tumor heterogeneity. RESULTS: Here, we present NeoAgDT, a two-step approach consisting of: (1) simulating individual cancer cells to create a digital twin of the patient's tumor cell population and (2) optimizing the vaccine composition by integer linear programming based on this digital twin...

38614133

April 13, 2024: Bioinformatics

#20

JOURNAL ARTICLE

Hi-GeoMVP: a hierarchical geometry-enhanced deep learning model for drug response prediction.

Yurui Chen, Louxin Zhang

MOTIVATION: Personalized cancer treatments require accurate drug response predictions. Existing deep learning methods show promise but higher accuracy is needed to serve the purpose of precision medicine. The prediction accuracy can be improved with not only topology but geometrical information of drugs. RESULTS: A novel deep learning methodology for drug response prediction is presented, named Hi-GeoMVP. It synthesizes hierarchical drug representation with multi-omics data, leveraging graph neural networks and variational autoencoders for detailed drug and cell line representations...

38614131

April 13, 2024: Bioinformatics
