Read by QxMD


Jayendra Shinde, Quentin Bayard, Sandrine Imbeaud, Théo Z Hirsch, Feng Liu, Victor Renault, Jessica Zucman-Rossi, Eric Letouzé
Summary: Cancer genomes are altered by various mutational processes and, like palimpsests, bear the signatures of these different processes. The Palimpsest R package provides a complete workflow for the characterization and visualization of mutational signatures and their evolution along tumor development. The package covers a wide range of functions for extracting both base substitution and structural variant signatures, inferring the clonality of each alteration and analyzing the evolution of mutational processes between early clonal and late subclonal events...
May 16, 2018: Bioinformatics
Hao Jiang, Lydia Sohn, Haiyan Huang, Luonan Chen
Motivation: The rapid advancement of single cell technologies has shed new light on the complex mechanisms of cellular heterogeneity. Identification of intercellular transcriptomic heterogeneity is one of the most critical tasks in single-cell RNA-sequencing studies. Results: We propose a new cell similarity measure based on cell-pair differentiability correlation, which is derived from gene differential pattern among all cell pairs. By plugging this new measure into the framework of hierarchical clustering, we further develop a variance-analysis-based clustering algorithm 'Corr' that can determine cluster number automatically and identify cell types accurately...
May 16, 2018: Bioinformatics
Huijun Mai, Yifan Zhang, Dinghua Li, Henry Chi-Ming Leung, Ruibang Luo, Chi-Kwong Wong, Hing-Fung Ting, Tak-Wah Lam
Summary: AC-DIAMOND (v1) is a DNA-protein alignment tool designed to tackle the efficiency challenge of aligning large numbers of reads or contigs to protein databases. When compared with the previously most efficient method DIAMOND, AC-DIAMOND gains a 6- to 7-fold speed-up, while retaining a similar degree of sensitivity. The improvement is rooted in two aspects: first, using a compressed index of seeds with adaptive length to speed up the matching between query and reference sequences; second, adopting a compact form of dynamic programming to fully utilize the parallelism of the SIMD capability...
May 16, 2018: Bioinformatics
Lixin Cheng, Kwong-Sak Leung
Motivation: Moonlighting proteins are a class of proteins having multiple distinct functions, which play essential roles in a variety of cellular and enzymatic functioning systems. Although there have long been calls for computational algorithms for the identification of moonlighting proteins, research on approaches to identify moonlighting long non-coding RNAs (lncRNAs) has never been undertaken. Here, we introduce a novel methodology, MoonFinder, for the identification of moonlighting lncRNAs...
May 16, 2018: Bioinformatics
Louis-Marie Bobay, Brian Shin-Hua Ellis, Howard Ochman
Summary: Classification of prokaryotic species is usually based on sequence similarity thresholds, which are easy to apply but lack a biologically-relevant foundation. Here, we present ConSpeciFix, a program that classifies prokaryotes into species using criteria set forth by the Biological Species Concept, thereby unifying species definition in all domains of life. Availability and implementation: ConSpeciFix's webserver is freely available at
May 16, 2018: Bioinformatics
Emmanuel Klinger, Dennis Rickert, Jan Hasenauer
Summary: Likelihood-free methods are often required for inference in systems biology. While Approximate Bayesian Computation (ABC) provides a theoretical solution, its practical application has often been challenging due to its high computational demands. To scale likelihood-free inference to computationally demanding stochastic models we developed pyABC: a distributed and scalable ABC-Sequential Monte Carlo (ABC-SMC) framework. It implements a scalable, runtime-minimizing parallelization strategy for multi-core and distributed environments scaling to thousands of cores...
May 14, 2018: Bioinformatics
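The likelihood-free idea behind the pyABC entry above can be illustrated with plain rejection ABC, the simplest member of the ABC family. This is a minimal sketch under stated assumptions, not pyABC's API and not the ABC-SMC scheme the abstract describes: the Gaussian toy model, the uniform prior, the tolerance `epsilon` and all function names are hypothetical choices made for illustration.

```python
import random
import statistics


def simulate(mu, n, rng):
    """Toy stochastic model (assumption): n draws from Normal(mu, 1)."""
    return [rng.gauss(mu, 1.0) for _ in range(n)]


def abc_rejection(observed_mean, n_obs, n_draws, epsilon, rng):
    """Keep prior draws whose simulated summary lands near the observed one."""
    accepted = []
    for _ in range(n_draws):
        mu = rng.uniform(-5.0, 5.0)  # draw a candidate from a uniform prior
        sim_mean = statistics.fmean(simulate(mu, n_obs, rng))
        # Accept without ever evaluating a likelihood: compare summaries only.
        if abs(sim_mean - observed_mean) <= epsilon:
            accepted.append(mu)
    return accepted


rng = random.Random(0)  # fixed seed so the sketch is reproducible
posterior = abc_rejection(observed_mean=2.0, n_obs=20, n_draws=5000,
                          epsilon=0.2, rng=rng)
print(len(posterior), statistics.fmean(posterior))
```

Each candidate is simulated independently of the others, which is why schemes of this kind parallelize well; sequential variants such as ABC-SMC additionally shrink the tolerance over successive generations.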
Wandrille Duchemin, Guillaume Gence, Anne-Muriel Arigon Chifolleau, Lars Arvestad, Mukul S Bansal, Vincent Berry, Bastien Boussau, François Chevenet, Nicolas Comte, Adrián A Davín, Christophe Dessimoz, David Dylus, Damir Hasic, Diego Mallo, Rémi Planel, David Posada, Celine Scornavacca, Gergely Szöllosi, Louxin Zhang, Éric Tannier, Vincent Daubin
Motivation: A reconciliation is an annotation of the nodes of a gene tree with evolutionary events (for example, speciation, gene duplication, transfer, loss, etc.) along with a mapping onto a species tree. Many algorithms and software tools produce or use reconciliations, but often with different reconciliation formats, regarding the type of events considered or whether the species tree is dated or not. This complicates the comparison and communication between different programs. Results: Here, we gather a consortium of software developers in gene tree species tree reconciliation to propose and endorse a format that aims to promote an integrative, albeit flexible, specification of phylogenetic reconciliations...
May 14, 2018: Bioinformatics
Jonathan D Tyzack, Antonio J M Ribeiro, Neera Borkakoti, Janet M Thornton
Motivation: One goal of synthetic biology is to make new enzymes to generate new products, but identifying the starting enzymes for further investigation is often elusive and relies on expert knowledge, intensive literature searching and trial and error. Results: We present Transform-MinER, an online computational tool that transforms query substrate molecules into products using enzyme reactions. The most similar native enzyme reactions for each transformation are found, highlighting those that may be of most interest for enzyme design and directed evolution approaches...
May 14, 2018: Bioinformatics
Valérie Marot-Lassauzaie, Michael Bernhofer, Burkhard Rost
Motivation: Many applications monitor predictions of a whole range of features for biological datasets, e.g. the fraction of secreted human proteins in the human proteome. Results and error estimates are typically derived from publications. Results: Here, we present a simple, alternative approximation that uses performance estimates of methods to error-correct the predicted distributions. This approximation uses the confusion matrix (TP true positives, TN true negatives, FP false positives, and FN false negatives) describing the performance of the prediction tool for correction...
May 14, 2018: Bioinformatics
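The confusion-matrix correction mentioned in the entry above can be illustrated with a standard algebraic adjustment. This is a sketch under assumptions, not necessarily the formula the authors use: if a predictor has true positive rate TPR and false positive rate FPR, the predicted positive fraction is p_obs = TPR * p + FPR * (1 - p), which can be solved for the true fraction p. The function name and the example counts are hypothetical.

```python
def corrected_fraction(observed_fraction, tp, tn, fp, fn):
    """Estimate the true positive fraction from a predicted one.

    Inverts p_obs = TPR * p + FPR * (1 - p) using rates taken from the
    predictor's confusion matrix (an assumed, textbook-style correction).
    """
    tpr = tp / (tp + fn)  # sensitivity of the prediction method
    fpr = fp / (fp + tn)  # false positive rate of the prediction method
    if tpr <= fpr:
        raise ValueError("predictor must perform better than chance")
    estimate = (observed_fraction - fpr) / (tpr - fpr)
    # Clamp, since sampling noise can push the estimate outside [0, 1].
    return min(1.0, max(0.0, estimate))


# Hypothetical predictor: 90% sensitivity, 5% false positive rate.
# It reports 22% positives in a dataset; the corrected estimate is lower.
estimate = corrected_fraction(0.22, tp=90, tn=95, fp=5, fn=10)
print(round(estimate, 3))
```

The correction assumes that the error rates measured on the benchmark set carry over to the dataset being analyzed.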
Kathrin M Seibt, Thomas Schmidt, Tony Heitkam
Summary: FlexiDot is a cross-platform dotplot suite generating high quality self, pairwise and all-against-all visualizations. To improve dotplot suitability for comparison of consensus and error-prone sequences, FlexiDot harbors routines for strict and relaxed handling of ambiguities and substitutions. Our shading modules facilitate dotplot interpretation and motif identification by adding information on sequence annotations and sequence similarities. Combined with collage-like outputs, FlexiDot supports simultaneous visual screening of large sequence sets, enabling dotplot use for routine analyses...
May 14, 2018: Bioinformatics
Will P M Rowe, Martyn D Winn
Motivation: Antimicrobial resistance remains a major threat to global health. Profiling the collective antimicrobial resistance genes within a metagenome (the "resistome") facilitates greater understanding of antimicrobial resistance gene diversity and dynamics. In turn, this can allow for gene surveillance, individualised treatment of bacterial infections and more sustainable use of antimicrobials. However, resistome profiling can be complicated by high similarity between reference genes, as well as the sheer volume of sequencing data and the complexity of analysis workflows...
May 14, 2018: Bioinformatics
Pallab Bhowmick, Yassene Mohammed, Christoph H Borchers
Motivation: Multiple Reaction Monitoring (MRM)-based targeted proteomics is increasingly being used to study the molecular basis of disease. When combined with an internal standard, MRM allows absolute quantification of proteins in virtually any type of sample, but the development and validation of an MRM assay for a specific protein is laborious. Therefore, several public repositories now host targeted proteomics MRM assays, including NCI's Clinical Proteomic Tumor Analysis Consortium assay portals, PeptideAtlas SRM Experiment Library, SRMAtlas, PanoramaWeb, and PeptideTracker, all of which contain different levels of information...
May 14, 2018: Bioinformatics
Phit Ling Tan, Yosvany López, Kenta Nakai, Ashwini Patil
Summary: Condition-specific time-course omics profiles are frequently used to study cellular response to stimuli and identify associated signaling pathways. However, few online tools allow users to analyze multiple types of high-throughput time-course data. TimeXNet Web is a web server that extracts a time-dependent gene/protein response network from time-course transcriptomic, proteomic or phospho-proteomic data, and an input interaction network. It classifies the given genes/proteins into time-dependent groups based on the time of their highest activity and identifies the most probable paths connecting genes/proteins in consecutive groups...
May 14, 2018: Bioinformatics
Le Zhang, Ming Xiao, Jingsong Zhou, Jun Yu
Motivation: The present study addresses several important questions related to naturally underrepresented sequences: (1) Are there permutations of real genomic DNA sequences of a defined length (k-mer) and a given lineage that do not actually exist or are underrepresented? (2) If there are such sequences, what are their characteristics in terms of k-mer length and base composition? Are they related to the CpG or TpA underrepresentation known for human sequences? We propose that the answers to these questions are of great significance for the study of sequence-associated regulatory mechanisms, such as cytosine methylation and chromosomal structures in physiological or pathological conditions such as cancer...
May 14, 2018: Bioinformatics
Stefano Nembrini, Inke R König, Marvin N Wright
Motivation: Random forests are fast, flexible and represent a robust approach to analyze high dimensional data. A key advantage over alternative machine learning algorithms is their variable importance measures, which can be used to identify relevant features or perform variable selection. Measures based on the impurity reduction of splits, such as the Gini importance, are popular because they are simple and fast to compute. However, they are biased in favor of variables with many possible split points and high minor allele frequency...
May 10, 2018: Bioinformatics
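The impurity reduction that underlies the Gini importance in the entry above can be computed for a single split as follows. This is a minimal sketch of the textbook definition with hypothetical function names; it is not the correction proposed in the paper and not the implementation of any particular random forest library.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def impurity_decrease(parent, left, right):
    """Parent impurity minus the size-weighted impurity of the two children."""
    n = len(parent)
    weighted_children = (len(left) / n) * gini_impurity(left) \
        + (len(right) / n) * gini_impurity(right)
    return gini_impurity(parent) - weighted_children


parent = [0, 0, 1, 1]
perfect_split = impurity_decrease(parent, [0, 0], [1, 1])  # classes separated
useless_split = impurity_decrease(parent, [0, 1], [0, 1])  # nothing gained
print(perfect_split, useless_split)
```

A variable's Gini importance accumulates such decreases over every split that uses it. A variable offering many candidate split points has more chances to find a split with a large decrease purely by chance, which is one way to see the bias the abstract refers to.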
Marta M Stepniewska-Dziubinska, Piotr Zielenkiewicz, Pawel Siedlecki
Motivation: Structure based ligand discovery is one of the most successful approaches for augmenting the drug discovery process. Currently, there is a notable shift towards machine learning (ML) methodologies to aid such procedures. Deep learning has recently gained considerable attention as it allows the model to "learn" to extract features that are relevant for the task at hand. Results: We have developed a novel deep neural network estimating the binding affinity of ligand-receptor complexes...
May 10, 2018: Bioinformatics
Sirajul Salekin, Jianqiu Michelle Zhang, Yufei Huang
Motivation: Transcription factors (TFs) bind to the promoter region of a gene to control gene expression. Identifying precise transcription factor binding sites (TFBS) is essential for understanding the detailed mechanisms of TF-mediated gene regulation. However, there is a shortage of computational approaches that can deliver single base pair (bp) resolution prediction of TFBS. Results: In this paper, we propose DeepSNR, a Deep Learning algorithm for predicting transcription factor binding location at Single Nucleotide Resolution de novo from DNA sequence...
May 10, 2018: Bioinformatics
Heng Li
Motivation: Recent advances in sequencing technologies promise ultra-long reads of ∼100 kilobases (kb) on average, full-length mRNA or cDNA reads in high throughput and genomic contigs over 100 megabases (Mb) in length. Existing alignment programs are unable to process such data at scale, or do so inefficiently, which presses for the development of new alignment algorithms. Results: Minimap2 is a general-purpose alignment program to map DNA or long mRNA sequences against a large reference database...
May 10, 2018: Bioinformatics
Laraib Malik, Fatemeh Almodaresi, Rob Patro
Motivation: De novo transcriptome analysis using RNA-seq offers a promising means to study gene expression in non-model organisms. Yet, the difficulty of transcriptome assembly means that the contigs provided by the assembler often represent a fractured and incomplete view of the transcriptome, complicating downstream analysis. We introduce Grouper, a new method for clustering contigs from de novo assemblies that are likely to belong to the same transcripts and genes; these groups can subsequently be analyzed more robustly...
May 8, 2018: Bioinformatics
R Biczok, P Bozsoky, P Eisenmann, J Ernst, T Ribizel, F Scholz, A Trefzer, F Weber, M Hamann, A Stamatakis
Motivation: The presence of terraces in phylogenetic tree space, that is, a potentially large number of distinct tree topologies that have exactly the same analytical likelihood score, was first described by Sanderson et al. (2011). However, popular software tools for maximum likelihood and Bayesian phylogenetic inference do not yet routinely report whether inferred phylogenies reside on a terrace. We believe this is due to the lack of an efficient library to (i) determine if a tree resides on a terrace, (ii) calculate how many trees reside on a terrace, and (iii) enumerate all trees on a terrace...
May 8, 2018: Bioinformatics