BMC Bioinformatics

Juan Miguel Cejuela, Shrikant Vinchurkar, Tatyana Goldberg, Madhukar Sollepura Prabhu Shankar, Ashish Baghudana, Aleksandar Bojchevski, Carsten Uhlig, André Ofner, Pandu Raharja-Liu, Lars Juhl Jensen, Burkhard Rost
BACKGROUND: The subcellular localization of a protein is an important aspect of its function. However, the experimental annotation of locations is incomplete even for well-studied model organisms. Text mining might help database curators add experimental annotations from the scientific literature. Existing extraction methods have difficulty distinguishing relationships between proteins and cellular locations co-mentioned in the same sentence. RESULTS: LocText was created as a new method to extract protein locations from abstracts and full texts...
January 17, 2018: BMC Bioinformatics
Marek Palkowski, Wlodzimierz Bielecki
BACKGROUND: RNA folding is an ongoing compute-intensive task of bioinformatics. Parallelizing this kind of algorithm and improving its code locality are among the most relevant areas in computational biology. Fortunately, RNA secondary structure approaches, such as Nussinov's recurrence, involve mathematical operations over affine control loops whose iteration space can be represented by the polyhedral model. This allows us to apply powerful polyhedral compilation techniques based on the transitive closure of dependence graphs to generate parallel tiled code implementing Nussinov's RNA folding...
January 15, 2018: BMC Bioinformatics
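For orientation, the Nussinov recurrence named in the abstract above can be sketched as a plain dynamic program. This is a minimal sequential Python sketch, not the parallel tiled code the authors generate; the function name `nussinov_max_pairs`, the `min_loop` parameter, and the choice of allowed pairs (Watson-Crick plus G-U wobble) are assumptions made for illustration.

```python
def nussinov_max_pairs(seq, min_loop=0):
    """Return the maximum number of nested base pairs in an RNA sequence.

    Assumption: allowed pairs are A-U, G-C and the G-U wobble pair.
    min_loop is the minimum number of unpaired bases between paired positions.
    """
    pairs = {("A", "U"), ("U", "A"), ("G", "C"),
             ("C", "G"), ("G", "U"), ("U", "G")}
    n = len(seq)
    if n == 0:
        return 0
    # table[i][j] holds the best score for the subsequence seq[i..j]
    table = [[0] * n for _ in range(n)]
    for span in range(1, n):
        for i in range(n - span):
            j = i + span
            best = table[i][j - 1]  # case 1: position j stays unpaired
            # case 2: position j pairs with some k in [i, j - min_loop)
            for k in range(i, j - min_loop):
                if (seq[k], seq[j]) in pairs:
                    left = table[i][k - 1] if k > i else 0
                    inner = table[k + 1][j - 1] if k + 1 <= j - 1 else 0
                    best = max(best, left + inner + 1)
            table[i][j] = best
    return table[0][n - 1]


if __name__ == "__main__":
    print(nussinov_max_pairs("GGGAAAUCC"))
```

The triple loop over `span`, `i`, and `k` is the affine loop nest the abstract refers to; each cell depends only on cells for shorter spans.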
Yanhua Qiao, Yi Xiong, Hongyun Gao, Xiaolei Zhu, Peng Chen
BACKGROUND: Hot spots are interface residues that contribute most of the binding affinity to protein-protein interactions. A compact and relevant feature subset is important for building machine learning methods to predict hot spots on protein-protein interfaces. Although different methods have been used to detect the relevant feature subset from a variety of features related to interface residues, it is still a challenge to detect the optimal feature subset for building the final model. RESULTS: In this study, three different feature selection methods were compared to propose a new hybrid feature selection strategy...
January 15, 2018: BMC Bioinformatics
Derek S Chiu, Aline Talhouk
BACKGROUND: Given a set of features, researchers are often interested in partitioning objects into homogeneous clusters. In health research, cancer research in particular, high-throughput data is collected with the aim of segmenting patients into sub-populations to aid in disease diagnosis, prognosis or response to therapy. Cluster analysis, a class of unsupervised learning techniques, is often used for class discovery. Cluster analysis suffers from some limitations, including the need to select up-front the algorithm to be used as well as the number of clusters to generate. In addition, there may exist several groupings consistent with the data, making it very difficult to validate a final solution...
January 15, 2018: BMC Bioinformatics
Audrey Legendre, Eric Angel, Fariza Tahi
BACKGROUND: RNA structure prediction is an important field in bioinformatics, and numerous methods and tools have been proposed. Pseudoknots are specific motifs of RNA secondary structures that are difficult to predict. Almost all existing methods are based on a single model and return one solution, often missing the real structure. An alternative approach would be to combine different models and return a (small) set of solutions, maximizing its quality and diversity in order to increase the probability that it contains the real structure...
January 15, 2018: BMC Bioinformatics
Min Wang, Zachary B Abrams, Steven M Kornblau, Kevin R Coombes
BACKGROUND: Cluster analysis is the most common unsupervised method for finding hidden groups in data. Clustering presents two main challenges: (1) finding the optimal number of clusters, and (2) removing "outliers" among the objects being clustered. Few clustering algorithms currently deal directly with the outlier problem. Furthermore, existing methods for identifying the number of clusters still have some drawbacks. Thus, there is a need for a better algorithm to tackle both challenges...
January 8, 2018: BMC Bioinformatics
Yichen Zheng, Axel Janke
BACKGROUND: We evaluated the sensitivity of the D-statistic, a parsimony-like method widely used to detect gene flow between closely related species. This method has been applied to a variety of taxa with a wide range of divergence times. However, its parameter space and thus its applicability to a wide taxonomic range has not been systematically studied. Divergence time, population size, time of gene flow, distance of outgroup and number of loci were examined in a sensitivity analysis...
January 8, 2018: BMC Bioinformatics
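As a reminder of what the D-statistic in the abstract above measures, the following minimal Python sketch counts ABBA and BABA site patterns in four aligned sequences and computes D = (ABBA - BABA) / (ABBA + BABA). It is not the authors' simulation pipeline. The function name `d_statistic`, the one-sequence-per-taxon input, and the treatment of the outgroup allele as ancestral are assumptions made for illustration.

```python
def d_statistic(p1, p2, p3, outgroup):
    """Count ABBA/BABA patterns and return (abba, baba, D).

    Assumptions: the four sequences are aligned and of equal length,
    the outgroup carries the ancestral allele, and only biallelic
    sites with unambiguous bases (A, C, G, T) are informative.
    D is None when no informative site is found.
    """
    if not (len(p1) == len(p2) == len(p3) == len(outgroup)):
        raise ValueError("sequences must be aligned to the same length")
    abba = baba = 0
    for a, b, c, o in zip(p1, p2, p3, outgroup):
        alleles = {a, b, c, o}
        if len(alleles) != 2 or not alleles <= set("ACGT"):
            continue  # skip invariant, multiallelic, or ambiguous sites
        if c == o:
            continue  # P3 must carry the derived allele
        if a == o and b == c:
            abba += 1  # P2 shares the derived allele with P3
        elif b == o and a == c:
            baba += 1  # P1 shares the derived allele with P3
    total = abba + baba
    d = (abba - baba) / total if total else None
    return abba, baba, d


if __name__ == "__main__":
    print(d_statistic("AGAA", "GAGA", "GGGA", "AAAA"))
```

A D value near zero is what incomplete lineage sorting alone would produce; a clear excess of one pattern over the other is usually read as a signal of gene flow.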
Qi-Lei Zou, Xiao-Ping You, Jian-Long Li, Wing Kam Fung, Ji-Yuan Zhou
BACKGROUND: Genomic imprinting is one of the well-known epigenetic factors causing the association between traits and genes, and has generally been examined by detecting parent-of-origin effects of alleles. Many methods have been proposed to test for parent-of-origin effects on autosomes based on nuclear families and general pedigrees. Although these tests for parent-of-origin effects on autosomes have been available for more than 15 years, no statistical test had been developed for parent-of-origin effects on the X chromosome until the parental-asymmetry test on the X chromosome (XPAT) and its extensions were recently proposed...
January 5, 2018: BMC Bioinformatics
Miguel Ángel Rodríguez-García, Robert Hoehndorf
BACKGROUND: Ontologies are representations of a conceptualization of a domain. Traditionally, ontologies in biology were represented as directed acyclic graphs (DAG) which represent the backbone taxonomy and additional relations between classes. These graphs are widely exploited for data analysis in the form of ontology enrichment or computation of semantic similarity. More recently, ontologies are developed in a formal language such as the Web Ontology Language (OWL) and consist of a set of axioms through which classes are defined or constrained...
January 5, 2018: BMC Bioinformatics
Qin Zhu, Stephen A Fisher, Hannah Dueck, Sarah Middleton, Mugdha Khaladkar, Junhyong Kim
BACKGROUND: Many R packages have been developed for transcriptome analysis but their use often requires familiarity with R and integrating results of different packages requires scripts to wrangle the datatypes. Furthermore, exploratory data analyses often generate multiple derived datasets such as data subsets or data transformations, which can be difficult to track. RESULTS: Here we present PIVOT, an R-based platform that wraps open source transcriptome analysis packages with a uniform user interface and graphical data management that allows non-programmers to interactively explore transcriptomics data...
January 5, 2018: BMC Bioinformatics
Aditya Deshpande, Wenhua Lang, Tina McDowell, Smruthy Sivakumar, Jiexin Zhang, Jing Wang, F Anthony San Lucas, Jerry Fowler, Humam Kadara, Paul Scheet
BACKGROUND: 'Next-generation' sequencing (NGS) has wide application in medical genetics, including the detection of somatic variation in cancer. The Ion Torrent-based (IONT) platform is among the NGS technologies employed in clinical, research and diagnostic settings. However, identifying mutations from IONT deep sequencing with high confidence has remained a challenge. We compared various computational variant-calling methods to derive a variant identification pipeline that may improve the molecular diagnostic and research utility of IONT...
January 4, 2018: BMC Bioinformatics
Jader M Caldonazzo Garbelini, André Y Kashiwabara, Danilo S Sanches
BACKGROUND: De novo prediction of Transcription Factor Binding Sites (TFBS) using computational methods is a difficult task and it is an important problem in Bioinformatics. The correct recognition of TFBS plays an important role in understanding the mechanisms of gene regulation and helps to develop new drugs. RESULTS: We here present Memetic Framework for Motif Discovery (MFMD), an algorithm that uses semi-greedy constructive heuristics as a local optimizer. In addition, we used a hybridization of the classic genetic algorithm as a global optimizer to refine the solutions initially found...
January 3, 2018: BMC Bioinformatics
Peng Guo, Bo Zhu, Hong Niu, Zezhao Wang, Yonghu Liang, Yan Chen, Lupei Zhang, Hemin Ni, Yong Guo, El Hamidi A Hay, Xue Gao, Huijiang Gao, Xiaolin Wu, Lingyang Xu, Junya Li
BACKGROUND: Running multiple-chain Markov Chain Monte Carlo (MCMC) provides an efficient parallel computing method for complex Bayesian models, although the efficiency of the approach critically depends on the length of the non-parallelizable burn-in period, for which all simulated data are discarded. In practice, this burn-in period is set arbitrarily and often leads to the performance of far more iterations than required. In addition, the accuracy of genomic predictions does not improve after the MCMC reaches equilibrium...
January 3, 2018: BMC Bioinformatics
Fanchi Meng, Chen Wang, Lukasz Kurgan
BACKGROUND: Development of predictors of propensity of protein sequences for successful crystallization has been actively pursued for over a decade. A few novel methods that expanded the scope of these predictions to address additional steps of protein production and structure determination pipelines were released in recent years. The predictive performance of the current methods is modest. This is because the only input that they use is the protein sequence and because the experimental annotations of these data might be inconsistent, given that they were collected across many laboratories and centers...
January 3, 2018: BMC Bioinformatics
Anna Marco-Ramell, Magali Palau-Rodriguez, Ania Alay, Sara Tulipani, Mireia Urpi-Sarda, Alex Sanchez-Pla, Cristina Andres-Lacueva
BACKGROUND: Bioinformatic tools for the enrichment of 'omics' datasets facilitate interpretation and understanding of data. To date few are suitable for metabolomics datasets. The main objective of this work is to give a critical overview, for the first time, of the performance of these tools. To that aim, datasets from metabolomic repositories were selected and enriched data were created. Both types of data were analysed with these tools and outputs were thoroughly examined. RESULTS: An exploratory multivariate analysis of the most used tools for the enrichment of metabolite sets, based on a non-metric multidimensional scaling (NMDS) of Jaccard's distances, was performed and mirrored their diversity...
January 2, 2018: BMC Bioinformatics
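The Jaccard distances mentioned in the abstract above can be illustrated with a minimal Python sketch that compares the sets of terms reported by different enrichment tools. It covers only the pairwise distance step, not the NMDS ordination. The function names, the example tool names, and the pathway labels are hypothetical and are not taken from the paper.

```python
def jaccard_distance(set_a, set_b):
    """Return 1 - |A ∩ B| / |A ∪ B| for two collections of labels.

    Assumption: two empty sets are treated as identical (distance 0.0).
    """
    a, b = set(set_a), set(set_b)
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def pairwise_jaccard(results):
    """Build a symmetric distance table from {tool_name: iterable_of_terms}."""
    names = sorted(results)
    return {
        (x, y): jaccard_distance(results[x], results[y])
        for x in names
        for y in names
    }


if __name__ == "__main__":
    # Hypothetical tool outputs, for illustration only
    outputs = {
        "tool_a": {"glycolysis", "TCA cycle"},
        "tool_b": {"TCA cycle", "urea cycle"},
        "tool_c": {"glycolysis", "TCA cycle"},
    }
    for pair, dist in pairwise_jaccard(outputs).items():
        print(pair, round(dist, 3))
```

A table like this is the kind of input an ordination method such as NMDS would take to place similar tools close together.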
M Shi, D M Umbach, A S Wise, C R Weinberg
BACKGROUND: To evaluate statistical methods for genome-wide genetic analyses, one needs to be able to simulate realistic genotypes. We here describe a method, applicable to a broad range of association study designs, that can simulate autosome-wide single-nucleotide polymorphism data with realistic linkage disequilibrium and with spiked in, user-specified, single or multi-SNP causal effects. RESULTS: Our construction uses existing genome-wide association data from unrelated case-parent triads, augmented by including a hypothetical complement triad for each triad (same parents but with a hypothetical offspring who carries the non-transmitted parental alleles)...
January 2, 2018: BMC Bioinformatics
Zhen Tian, Maozu Guo, Chunyu Wang, Xiaoyan Liu, Shiming Wang
BACKGROUND: In recent years, biological interaction networks have become the basis of some essential studies and have achieved success in many applications. Some typical networks such as protein-protein interaction networks have already been investigated systematically. However, little work has been done so far on the construction of gene functional similarity networks. In this research, we will try to build a highly reliable gene functional similarity network to promote its further application...
December 28, 2017: BMC Bioinformatics
Ethan C Rath, Stephanie Pitman, Kyu Hong Cho, Yongsheng Bai
BACKGROUND: Small noncoding regulatory RNAs (sRNAs) are post-transcriptional regulators, regulating mRNAs, proteins, and DNA in bacteria. One class of sRNAs, trans-acting sRNAs, are the most abundant sRNAs transcribed from the intergenic regions (IGRs) of the bacterial genome. In Streptococcus pyogenes, a common and potentially deadly pathogen, many sRNAs have been identified, but only a few have been studied. The goal of this study is to identify trans-acting sRNAs that can be substrates of RNase III...
December 28, 2017: BMC Bioinformatics
Haiyong Zheng, Ruchen Wang, Zhibin Yu, Nan Wang, Zhaorui Gu, Bing Zheng
BACKGROUND: Plankton, including phytoplankton and zooplankton, are the main source of food for organisms in the ocean and form the base of the marine food chain. As fundamental components of marine ecosystems, plankton are very sensitive to environmental changes, and studying plankton abundance and distribution is crucial for understanding environmental changes and protecting marine ecosystems. This study was carried out to develop a widely applicable plankton classification system with high accuracy for the increasing number of various imaging devices...
December 28, 2017: BMC Bioinformatics
Clarence White, Hamid D Ismail, Hiroto Saigo, Dukka B Kc
BACKGROUND: The β-Lactamase (BL) enzyme family is an important class of enzymes that plays a key role in bacterial resistance to antibiotics. As the number of newly identified BL enzymes is increasing daily, it is imperative to develop a computational tool to classify the newly identified BL enzymes into one of the family's classes. There are two types of classification of BL enzymes: Molecular Classification and Functional Classification. Existing computational methods only address Molecular Classification and the performance of these existing methods is unsatisfactory...
December 28, 2017: BMC Bioinformatics