IEEE/ACM Transactions on Computational Biology and Bioinformatics

Abazar Arabameri, Davud Asemani, Pegah Teymourpour
To obtain a screening tool for colorectal cancer (CRC) based on gut microbiota, we seek here to identify an optimal classifier for CRC detection as well as a novel nonlinear feature selection method for determining the most discriminative microbial species. In this study, the intestinal microflora in feces of 141 patients were modeled using general regression neural networks (GRNNs) combined with the proposed feature selection method. The proposed model led to slightly higher accuracy (AUC=0.911) than previous studies (AUC<0...
September 13, 2018: IEEE/ACM Transactions on Computational Biology and Bioinformatics
Alexey Markin, Oliver Eulenstein
Median tree inference under path-difference metrics has shown great promise for large-scale phylogeny estimation. Similar to these metrics is the family of cophenetic metrics that originates from a classic dendrogram comparison method introduced more than 50 years ago. Despite the appeal of this family of metrics, the problem of computing median trees under cophenetic metrics has not been analyzed. Like other standard median tree problems relevant in practice, as we show here, this problem is also NP-hard. NP-hard median tree problems have been successfully addressed by local search heuristics that are solving thousands of instances of a corresponding (local neighborhood) search problem...
September 13, 2018: IEEE/ACM Transactions on Computational Biology and Bioinformatics
Rajan Kapoor, Aniruddha Datta, Chao Sima, Jianping Hua, Rosana Lopes, Michael L Bittner
In this work, we develop a systematic approach for applying pathway knowledge to a multivariate Gaussian mixture model for dissecting a heterogeneous cancer tissue. The downstream transcription factors are selected as observables from available partial pathway knowledge in such a way that the subpopulations produce some differential behavior in response to the drugs selected in the upstream. For each subpopulation, each unique (drug, observable) pair is considered as a unique dimension of a multivariate Gaussian distribution...
September 12, 2018: IEEE/ACM Transactions on Computational Biology and Bioinformatics
Hossein Yazdani, Aazam Yazdani, Len Cheng, David Christiani
Learning methods, such as conventional clustering and classification, have been applied in diagnosing diseases to categorize samples based on their features. Going beyond clustering samples, membership degrees represent the degree to which each sample belongs to a cluster. Variation of membership degrees in each cluster provides information about the cluster as a whole and about each sample individually, which yields insights toward precision medicine. Membership degrees are measured more accurately through removing restrictions from clustering samples...
September 12, 2018: IEEE/ACM Transactions on Computational Biology and Bioinformatics
Fatima Zare, Sardar Ansari, Kayvan Najarian, Sheida Nabavi
Copy number variation (CNV) is a type of genomic/genetic variation that plays an important role in phenotypic diversity, evolution, and disease susceptibility. Next generation sequencing (NGS) technologies have created an opportunity for more accurate detection of CNVs with higher resolution. However, efficient and precise detection of CNVs remains challenging due to high levels of noise and biases, data heterogeneity and the "big data" nature of NGS data. Sequence coverage (readcount) data are mostly used for detecting CNVs, especially for whole exome sequencing data...
September 12, 2018: IEEE/ACM Transactions on Computational Biology and Bioinformatics
Haifen Chen, Devamuni A K Maduranga, Piyushkumar Mundra, Jie Zheng
Accurately reconstructing gene regulatory networks (GRNs) from high-throughput gene expression data has been a major challenge in systems biology for decades. Many approaches have been proposed to solve this problem. However, there is still much room for the improvement of GRN inference. Integrating data from different sources is a promising strategy. Epigenetic modifications have a close relationship with gene regulation. Hence, epigenetic data such as histone modification profiles can provide useful information for uncovering regulatory interactions between genes...
September 10, 2018: IEEE/ACM Transactions on Computational Biology and Bioinformatics
Paul Fergus, Aday Montanez, Basma Abdulaimma, Paulo Lisboa, Carl Chalmers, Beth Pineles
Genome-Wide Association Studies (GWAS) are used to identify statistically significant genetic variants in case-control studies. The main objective is to find single nucleotide polymorphisms (SNPs) that influence a particular phenotype. GWAS use a p-value threshold of $5 \times 10^{-8}$ to identify highly ranked SNPs. While this approach has proven useful for detecting disease-susceptible SNPs, evidence has shown that many of these are, in fact, false positives. Consequently, there is some ambiguity about the most suitable threshold for claiming genome-wide significance...
September 3, 2018: IEEE/ACM Transactions on Computational Biology and Bioinformatics
Saurav Mallik, Sanghamitra Bandyopadhyay
The identification of modules (groups of several tightly interconnected genes) in a gene interaction network is an essential task for a better understanding of the architecture of the whole network. In this article, we develop a novel weighted connectivity measure integrating co-methylation, co-expression and protein-protein interactions (called WeCoMXP) to detect gene modules for a multi-omics dataset. The proposed measure goes beyond the fundamental degree centrality measure through considering some formulation of higher-order connections...
September 3, 2018: IEEE/ACM Transactions on Computational Biology and Bioinformatics
Lishuang Li, Yuxin Jiang
Biomedical named entity recognition (Bio-NER) is an important preliminary step for many biomedical text mining tasks. The current mainstream methods for NER are based on neural networks to avoid the complex hand-designed features derived from various linguistic analyses. However, these methods ignore some potential sentence-level semantic information and general semantic and syntactic features. Therefore, we propose a novel Long Short Term Memory (LSTM) Networks model integrating a language model and a sentence-level reading control gate (LS-BLSTM-CRF) for Bio-NER...
September 3, 2018: IEEE/ACM Transactions on Computational Biology and Bioinformatics
Peng Wu, Dong Wang
Applications that classify DNA microarray expression data are helpful for diagnosing cancer. Many attempts have been made to analyze these data; however, new methods are needed to obtain better results. In this study, a Complex Network (CN) classifier was exploited to implement the classification task. An algorithm was used to initialize the structure, which allowed input variables to be selected over layered connections and different activation functions for different nodes. Then, a hybrid method integrating the Genetic Programming and Particle Swarm Optimization algorithms was used to identify an optimal structure with the parameters encoded in the classifier...
September 3, 2018: IEEE/ACM Transactions on Computational Biology and Bioinformatics
Jianguo Chen, Kenli Li, Kashif Bilal, Ahmed A Metwally, Keqin Li, Philip Yu
Protein interactions constitute the fundamental building block of almost every life activity. Identifying protein communities from Protein-Protein Interaction (PPI) networks is essential to understand the principles of cellular organization and explore the causes of various diseases. It is critical to integrate multiple data resources to identify reliable protein communities that have biological significance and improve the performance of community detection methods for large-scale PPI networks. In this paper, we propose a Multi-source Learning based Protein Community Detection (MLPCD) algorithm by integrating Gene Expression Data (GED) and a parallel solution of MLPCD using cloud computing technology...
August 31, 2018: IEEE/ACM Transactions on Computational Biology and Bioinformatics
Lishuang Li, Yang Liu, Meiyue Qin
Biomedical event extraction is important for medical research and disease prevention, which has attracted much attention in recent years. Traditionally, most of the state-of-the-art systems have been based on shallow machine learning methods, which require many complex, hand-designed features. In addition, the words encoded by one-hot are unable to represent semantic information. Therefore, we utilize dependency-based embeddings to represent words semantically and syntactically. Then, we propose a parallel multi-pooling convolutional neural network (PMCNN) model to capture the compositional semantic features of sentences...
August 31, 2018: IEEE/ACM Transactions on Computational Biology and Bioinformatics
Zhen Cao, Shihua Zhang
Gapped k-mer frequency vectors (gkm-fvs) have been presented for extracting sequence features. Coupled with a support vector machine (gkm-SVM), gkm-fvs have been used to achieve effective sequence-based predictions. However, the heavy computation of a large kernel matrix prevents the method from using large amounts of data. It is also unclear how to combine gkm-fvs with other data sources in the context of string kernels. On the other hand, the high dimensionality, collinearity and sparsity of gkm-fvs hinder the use of many traditional machine learning methods without a kernel trick...
August 31, 2018: IEEE/ACM Transactions on Computational Biology and Bioinformatics
Efrain Pinzon, Paola Rondon-Villarreal, William A Alvarez, Hernan Guillermo Hernandez
Polymerase chain reaction (PCR)-based techniques for DNA methylation assessment include the MS-HRM technique. Methylation Sensitive High-Resolution Melting (MS-HRM) primer design requires a set of recommendations for such DNA methylation assessment. However, no software has been available for the automatic design of this kind of primer. We present Softepigen, the first complete MS-HRM primer design software. Softepigen searches for primers in a genomic region following Wojdacz's recommendations and targets primer binding regions with high linguistic complexity sequences that increase the specificity of the converted sequence of the human genome...
August 29, 2018: IEEE/ACM Transactions on Computational Biology and Bioinformatics
Shengping Yang, Jianrong Wu
Many current RNA-sequencing data analysis methods compare expressions one gene at a time, taking little consideration of the correlations among genes. In this study, we propose a method to convert such one-dimensional comparison approaches into a two-dimensional evaluation of the ratio of standard deviations of two constructed random variables. This method allows the identification of differentially expressed genes while controlling a preset significance level conditional on the read count mean-variance relationship...
August 27, 2018: IEEE/ACM Transactions on Computational Biology and Bioinformatics
Lin Yuan, Le-Hang Guo, Chang-An Yuan, You-Hua Zhang, Kyungsook Han, Asoke Nandi, Barry Honig, De-Shuang Huang
Underlying a cancer phenotype is a specific gene regulatory network that represents the complex regulatory relationships between genes. However, it remains a challenge to find cancer-related gene regulatory networks because of insufficient sample sizes and complex regulatory mechanisms in which a gene is influenced not only by other genes but also by other biological factors. The development of high-throughput technologies and the unprecedented wealth of multi-omics data give us a new opportunity to design machine learning methods to investigate the underlying gene regulatory network...
August 23, 2018: IEEE/ACM Transactions on Computational Biology and Bioinformatics
Xiujuan Lei, Xiaoqin Yang, Fangxiang Wu
It is well known that essential proteins play an extremely important role in controlling cellular activities in living organisms. Identifying essential proteins from protein-protein interaction (PPI) networks is conducive to the understanding of cellular functions and molecular mechanisms. Hitherto, many essential protein detection methods have been proposed. Nevertheless, those existing identification methods are not satisfactory because of low efficiency and low sensitivity to noisy data. This paper presents a novel computational approach based on artificial fish swarm optimization for essential protein prediction in PPI networks (called AFSO_EP)...
August 15, 2018: IEEE/ACM Transactions on Computational Biology and Bioinformatics
Rongrong Zhang, Ming Hu, Yu Michael Zhu, Zhaohui Steve Qin, Ke Deng, Jun Liu
The recently developed Hi-C technology enables a genome-wide view of chromosome spatial organization, and has provided deep insights into genome structure and genome function. However, multiple sources of uncertainty make downstream data analysis and interpretation challenging. Specifically, statistical models for inferring three-dimensional (3D) chromosomal structure from Hi-C data are far from mature. Most existing methods are highly over-parameterized, lack clear interpretations, and are sensitive to outliers...
August 15, 2018: IEEE/ACM Transactions on Computational Biology and Bioinformatics
Ali Karimnezhad, David R Bickel
In a genome-wide association study (GWAS), the probability that a single nucleotide polymorphism (SNP) is not associated with a disease is its local false discovery rate (LFDR). The LFDR for each SNP is relative to a reference class of SNPs. For example, the LFDR of an exonic SNP can vary widely depending on whether it is considered relative to the separate reference class of other exonic SNPs or relative to the combined reference class of all SNPs in the data set. As a result, the analysis of the data based on the combined reference class might indicate that a specific exonic SNP is associated with the disease, while using the separate reference class indicates that it is not associated, or vice versa...
August 14, 2018: IEEE/ACM Transactions on Computational Biology and Bioinformatics
Kumar Saurabh Singh, Katherine Beadle, Bartek J Troczka, Linda Field, Emyr Davies, Martin Williamson, Ralf Nauen, Chris Bass
Many non-model organisms lack reference genomes and the sequencing and de novo assembly of an organism's transcriptome is an affordable means by which to characterize the coding component of its genome. Despite the advances that have made this possible, assembling a transcriptome without a known reference usually results in a collection of full-length and partial gene transcripts. The downstream analysis of genes represented as partial transcripts then often requires further experimental work in the laboratory in order to obtain full-length sequences...
August 13, 2018: IEEE/ACM Transactions on Computational Biology and Bioinformatics