Niki L Dimou, Konstantinos D Tsirigos, Arne Elofsson, Pantelis G Bagos
MOTIVATION: In the context of genome-wide association studies (GWAS), there is a variety of statistical techniques in order to conduct the analysis, but, in most cases, the underlying genetic model is usually unknown. Under these circumstances, the classical Cochran-Armitage trend test is suboptimal. Robust procedures that maximize the power and preserve the nominal type I error rate are preferable. Moreover, performing a meta-analysis using robust procedures is of great interest and has never been addressed in the past...
January 20, 2017: Bioinformatics
Jing Guo, Feng Lin, Xiaomeng Zhang, Vivek Tanavde, Jie Zheng
Waddington's epigenetic landscape is a powerful metaphor for cellular dynamics driven by gene regulatory networks. Its quantitative modeling and visualization, however, remain a challenge, especially when there are more than two genes in the network. A software tool for Waddington's landscape has not been available in the literature. We present NetLand, an open-source software tool for modeling and simulating the kinetic dynamics of gene regulatory networks (GRNs), and visualizing the corresponding Waddington's epigenetic landscape in three dimensions without restriction on the number of genes in a GRN...
January 19, 2017: Bioinformatics
Ruby Peters, Marta Benthem Muñiz, Juliette Griffié, David J Williamson, George W Ashdown, Christian D Lorenz, Dylan M Owen
MOTIVATION: Unlike conventional microscopy, which produces pixelated images, single-molecule localization microscopy (SMLM) produces data in the form of a list of localization coordinates - a spatial point pattern (SPP). Often, such SPPs are analyzed using cluster analysis algorithms to quantify molecular clustering within, for example, the plasma membrane. While SMLM cluster analysis is now well developed, techniques for analyzing fibrous structures remain poorly explored. RESULTS: Here, we demonstrate statistical methodology, based on Ripley's K-function, to quantitatively assess fibrous structures in 2D SMLM data sets...
January 19, 2017: Bioinformatics
Arnald Alonso, Brittany N Lasseigne, Kelly Williams, Josh Nielsen, Ryne C Ramaker, Andrew A Hardigan, Bobbi Johnston, Brian S Roberts, Sara J Cooper, Sara Marsal, Richard M Myers
The wide range of RNA-seq applications and their high computational needs require the development of pipelines orchestrating the entire workflow and optimizing usage of available computational resources. We present aRNApipe, a project-oriented pipeline for processing of RNA-seq data in high performance cluster environments. aRNApipe is highly modular and can be easily migrated to any high performance computing (HPC) environment. The current applications included in aRNApipe combine the essential RNA-seq primary analyses, including quality control metrics, transcript alignment, count generation, transcript fusion identification, alternative splicing, and sequence variant calling...
January 19, 2017: Bioinformatics
Bharat Panwar, Gilbert S Omenn, Yuanfang Guan
MOTIVATION: MicroRNAs (miRNAs) are small non-coding RNAs that are involved in post-transcriptional regulation of gene expression. In this high-throughput sequencing era, a tremendous amount of RNA-seq data is accumulating, and full utilization of publicly available miRNA data is an important challenge. These data are useful to determine expression values for each miRNA, but quantification pipelines are in a primitive stage and still evolving; there are many factors that affect expression values significantly...
January 19, 2017: Bioinformatics
Hernando G Suarez, Bjoern E Langer, Pradnya Ladde, Michael Hiller
MOTIVATION: Accurate alignments between entire genomes are crucial for comparative genomics. However, computing sensitive and accurate genome alignments is a challenging problem, complicated by genomic rearrangements. RESULTS: Here we present a fast approach, called chainCleaner, that improves the specificity in genome alignments by accurately detecting and removing local alignments that obscure the evolutionary history of genomic rearrangements. Systematic tests on alignments between the human and other vertebrate genomes show that chainCleaner (i) improves the alignment of numerous orthologous genes, (ii) exposes alignments between exons of orthologous genes that were masked before by alignments to pseudogenes, and (iii) recovers hundreds of kilobases in local alignments that otherwise would fall below a minimum score threshold...
January 19, 2017: Bioinformatics
P Kapli, S Lutteropp, J Zhang, K Kobert, P Pavlidis, A Stamatakis, T Flouri
MOTIVATION: In recent years, molecular species delimitation has become a routine approach for quantifying and classifying biodiversity. Barcoding methods are of particular importance in large-scale surveys as they promote fast species discovery and biodiversity estimates. Among those, distance-based methods are the most common choice as they scale well with large datasets; however, they are sensitive to similarity threshold parameters and they ignore evolutionary relationships. The recently introduced "Poisson Tree Processes" (PTP) method is a phylogeny-aware approach that does not rely on such thresholds...
January 19, 2017: Bioinformatics
Ian H Holmes
MOTIVATION: Reconstruction of ancestral sequence histories, and estimation of parameters like indel rates, are improved by using explicit evolutionary models and summing over uncertain alignments. The previous best tool for this purpose (according to simulation benchmarks) was ProtPal, but this tool was too slow for practical use. RESULTS: Historian combines an efficient reimplementation of the ProtPal algorithm with performance-improving heuristics from other alignment tools...
January 18, 2017: Bioinformatics
Marco Necci, Damiano Piovesan, Zsuzsanna Dosztányi, Silvio C E Tosatto
MOTIVATION: Intrinsic disorder (ID) is established as an important feature of protein sequences. Its use in proteome annotation is however hampered by the availability of many methods with similar performance at the single residue level, which have mostly not been optimized to predict long ID regions of size comparable to domains. Here, we have focused on providing a single consensus-based prediction, MobiDB-lite, optimized for highly specific (i.e. few false positive) predictions of long disorder...
January 18, 2017: Bioinformatics
Surya Gupta, Veronic De Puysseleyr, José Van der Heyden, Davy Maddelein, Irma Lemmens, Sam Lievens, Sven Degroeve, Jan Tavernier, Lennart Martens
Protein-protein interaction (PPI) studies have dramatically expanded our knowledge about cellular behaviour and development in different conditions. A multitude of high-throughput PPI techniques have been developed to achieve proteome-scale coverage for PPI studies, including the microarray-based Mammalian Protein-Protein Interaction Trap (MAPPIT) system. Because such high-throughput techniques typically report thousands of interactions, managing and analysing the large amounts of acquired data is a challenge...
January 18, 2017: Bioinformatics
Mateusz Kaduk, Erik Sonnhammer
MOTIVATION: The initial step in many orthology inference methods is the computationally demanding establishment of all pairwise protein similarities across all analysed proteomes. The quadratic scaling with proteomes has become a major bottleneck. A remedy is offered by the Hieranoid algorithm which reduces the complexity to linear by hierarchically aggregating ortholog groups from InParanoid along a species tree. RESULTS: We have further developed the Hieranoid algorithm in many ways...
January 17, 2017: Bioinformatics
Inuk Jung, Kyuri Jo, Hyejin Kang, Hongryul Ahn, Youngjae Yu, Sun Kim
MOTIVATION: Identifying biologically meaningful gene expression patterns from time series gene expression data is important to understand the underlying biological mechanisms. To identify significantly perturbed gene sets between different phenotypes, analysis of time series transcriptome data requires consideration of time and sample dimensions. Thus, the analysis of such time series data seeks to search gene sets that exhibit similar or different expression patterns between two or more sample conditions, constituting the three-dimensional data, i...
January 17, 2017: Bioinformatics
Xujun Liang, Pengfei Zhang, Lu Yan, Ying Fu, Fang Peng, Lingzhi Qu, Meiying Shao, Yongheng Chen, Zhuchu Chen
MOTIVATION: Exploring the potential curative effects of drugs is crucial for effective drug development. Previous studies have indicated that integration of multiple types of information could be conducive to discovering novel indications of drugs. However, how to efficiently identify the mechanism behind drug-disease associations while integrating data from different sources remains a challenging problem. RESULTS: In this research, we present a novel method for indication prediction of both new drugs and approved drugs...
January 17, 2017: Bioinformatics
Umberto Ferraro Petrillo, Gianluca Roscigno, Giuseppe Cattaneo, Raffaele Giancarlo
MapReduce Hadoop bioinformatics applications require the availability of special-purpose routines to manage the input of sequence files. Unfortunately, the Hadoop framework does not provide any built-in support for the most popular sequence file formats like FASTA or BAM. Moreover, the development of these routines is not easy, both because of the diversity of these formats and the need to efficiently manage sequence datasets that may count up to billions of characters. We present FASTdoop, a generic Hadoop library for the management of FASTA and FASTQ files...
January 16, 2017: Bioinformatics
Bradley C Naylor, Michael T Porter, Elise Wilson, Adam Herring, Spencer Lofthouse, Austin Hannemann, Stephen R Piccolo, Alan L Rockwood, John C Price
MOTIVATION: Using mass spectrometry to measure the concentration and turnover of the individual proteins in a proteome enables the calculation of individual synthesis and degradation rates for each protein. Software to analyze concentration is readily available, but software to analyze turnover is lacking. Data analysis workflows typically don't access the full breadth of information about instrument precision and accuracy that is present in each peptide isotopic envelope measurement...
January 16, 2017: Bioinformatics
Daniel E Cook, Erik C Andersen
The variant call format (VCF) is a popular standard for storing genetic variation data. As a result, a large collection of tools has been developed that perform diverse analyses using VCF files. However, tools for some tasks common to statistical and population geneticists have not yet been created. To streamline these types of analyses, we created novel tools that analyze or annotate VCF files and organized these tools into a command-line based utility named VCF-kit. VCF-kit adds essential utilities to process and analyze VCF files, including primer generation for variant validation, dendrogram production, genotype imputation from sequence data in linkage studies, and additional tools...
January 16, 2017: Bioinformatics
Quan Le, Fabian Sievers, Desmond G Higgins
MOTIVATION: Multiple sequence alignment (MSA) is commonly used to analyse sets of homologous protein or DNA sequences. This has led to the development of many methods and packages for MSA over the past 30 years. Being able to compare different methods has been problematic and has relied on gold standard benchmark datasets of 'true' alignments or on MSA simulations. A number of protein benchmark datasets have been produced which rely on a combination of manual alignment and/or automated superposition of protein structures...
January 16, 2017: Bioinformatics
Claire Marks, Jaroslaw Nowak, Stefan Klostermann, Guy Georges, James Dunbar, Jiye Shi, Sebastian Kelm, Charlotte M Deane
MOTIVATION: Loops are often vital for protein function; however, their irregular structures make them difficult to model accurately. Current loop modelling algorithms can mostly be divided into two categories: knowledge-based, where databases of fragments are searched to find suitable conformations; and ab initio, where conformations are generated computationally. Existing knowledge-based methods only use fragments that are the same length as the target, even though loops of slightly different lengths may adopt similar conformations...
January 16, 2017: Bioinformatics
Laurent Heirendt, Ines Thiele, Ronan M T Fleming
MOTIVATION: Flux balance analysis, and its variants, are widely used methods for predicting steady-state reaction rates in biochemical reaction networks. The exploration of high dimensional networks with such methods is currently hampered by software performance limitations. RESULTS: DistributedFBA.jl is a high-level, high-performance, open-source implementation of flux balance analysis in Julia. It is tailored to solve multiple flux balance analyses on a subset or all the reactions of large and huge-scale networks, on any number of threads or nodes...
January 16, 2017: Bioinformatics
Masae Hosoda, Yukie Akune, Kiyoko F Aoki-Kinoshita
MOTIVATION: A glycan consists of monosaccharides linked by glycosidic bonds, has branches and forms complex molecular structures. Databases have been developed to store large amounts of glycan-binding experiments, including glycan arrays with glycan binding proteins (GBPs). However, there are few bioinformatics techniques to analyze large amounts of data for glycans because there are few tools that can handle the complexity of glycan structures. Thus, we have developed the MCAW (Multiple Carbohydrate Alignment with Weights) tool that can align multiple glycan structures, to aid in the understanding of their function as binding recognition molecules...
January 16, 2017: Bioinformatics
