bioinformatics using machine learning

https://www.readbyqxmd.com/read/28499419/across-proteome-modeling-of-dimer-structures-for-the-bottom-up-assembly-of-protein-protein-interaction-networks
#1
Surabhi Maheshwari, Michal Brylinski
BACKGROUND: Deciphering complete networks of interactions between proteins is the key to comprehend cellular regulatory mechanisms. A significant effort has been devoted to expanding the coverage of the proteome-wide interaction space at molecular level. Although a growing body of research shows that protein docking can, in principle, be used to predict biologically relevant interactions, the accuracy of the across-proteome identification of interacting partners and the selection of near-native complex structures still need to be improved...
May 12, 2017: BMC Bioinformatics
https://www.readbyqxmd.com/read/28499008/rippminer-a-bioinformatics-resource-for-deciphering-chemical-structures-of-ripps-based-on-prediction-of-cleavage-and-cross-links
#2
Priyesh Agrawal, Shradha Khater, Money Gupta, Neetu Sain, Debasisa Mohanty
Ribosomally synthesized and post-translationally modified peptides (RiPPs) constitute a rapidly growing class of natural products with diverse structures and bioactivities. We have developed RiPPMiner, a novel bioinformatics resource for deciphering chemical structures of RiPPs by genome mining. RiPPMiner derives its predictive power from machine learning based classifiers, trained using a well curated database of more than 500 experimentally characterized RiPPs. RiPPMiner uses Support Vector Machine to distinguish RiPP precursors from other small proteins and classify the precursors into 12 sub-classes of RiPPs...
May 12, 2017: Nucleic Acids Research
https://www.readbyqxmd.com/read/28476106/geminivirus-data-warehouse-a-database-enriched-with-machine-learning-approaches
#3
Jose Cleydson F Silva, Thales F M Carvalho, Marcos F Basso, Michihito Deguchi, Welison A Pereira, Roberto R Sobrinho, Pedro M P Vidigal, Otávio J B Brustolini, Fabyano F Silva, Maximiller Dal-Bianco, Renildes L F Fontes, Anésia A Santos, Francisco Murilo Zerbini, Fabio R Cerqueira, Elizabeth P B Fontes
BACKGROUND: The Geminiviridae family encompasses a group of single-stranded DNA viruses with twinned and quasi-isometric virions, which infect a wide range of dicotyledonous and monocotyledonous plants and are responsible for significant economic losses worldwide. Geminiviruses are divided into nine genera, according to their insect vector, host range, genome organization, and phylogeny reconstruction. Using rolling-circle amplification approaches along with high-throughput sequencing technologies, thousands of full-length geminivirus and satellite genome sequences were amplified and have become available in public databases...
May 5, 2017: BMC Bioinformatics
https://www.readbyqxmd.com/read/28472232/privacy-preserving-evaporative-cooling-feature-selection-and-classification-with-relief-f-and-random-forests
#4
Trang T Le, W Kyle Simmons, Masaya Misaki, Jerzy Bodurka, Bill C White, Jonathan Savitz, Brett A McKinney
Motivation: Classification of individuals into disease or clinical categories from high-dimensional biological data with low prediction error is an important challenge of statistical learning in bioinformatics. Feature selection can improve classification accuracy but must be incorporated carefully into cross-validation to avoid overfitting. Recently, feature selection methods based on differential privacy, such as differentially private random forests and reusable holdout sets, have been proposed...
May 4, 2017: Bioinformatics
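The point in the abstract above about folding feature selection into cross-validation can be illustrated with a minimal sketch. It is not the authors' method: the mean-difference feature score, the nearest-centroid classifier, the synthetic data, and all function names are assumptions chosen only to keep the example self-contained. What it shows is that features are chosen from the training split of each fold, so the held-out fold never influences the selection.

```python
import random
from statistics import mean


def select_top_features(X, y, k):
    """Rank features by absolute difference of class means; keep the top k.

    Hypothetical scoring rule, used here only for illustration.
    """
    scores = []
    for j in range(len(X[0])):
        pos = [row[j] for row, label in zip(X, y) if label == 1]
        neg = [row[j] for row, label in zip(X, y) if label == 0]
        scores.append(abs(mean(pos) - mean(neg)))
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return ranked[:k]


def centroid(rows, features):
    """Mean of each selected feature over the given rows."""
    return [mean(row[j] for row in rows) for j in features]


def cross_validate(X, y, n_folds=5, k=2, seed=0):
    """Return per-fold accuracy, selecting features inside each fold."""
    indices = list(range(len(X)))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::n_folds] for i in range(n_folds)]
    accuracies = []
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in indices if i not in held_out]
        X_train = [X[i] for i in train_idx]
        y_train = [y[i] for i in train_idx]
        # Feature selection sees the training split only, never the test fold.
        features = select_top_features(X_train, y_train, k)
        centres = {
            label: centroid(
                [r for r, l in zip(X_train, y_train) if l == label], features
            )
            for label in (0, 1)
        }
        correct = 0
        for i in test_idx:
            point = [X[i][j] for j in features]
            dist = {
                label: sum((a - b) ** 2 for a, b in zip(point, c))
                for label, c in centres.items()
            }
            correct += min(dist, key=dist.get) == y[i]
        accuracies.append(correct / len(test_idx))
    return accuracies


# Synthetic data (an assumption): feature 0 carries signal, the rest are noise.
rng = random.Random(0)
y = [i % 2 for i in range(40)]
X = [[2.0 * label + rng.gauss(0, 1)] + [rng.gauss(0, 1) for _ in range(5)]
     for label in y]
fold_accuracies = cross_validate(X, y)
```

Selecting features once on the full data set and then cross-validating would let the test folds leak into the selection step, which is the overfitting risk the abstract refers to.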
https://www.readbyqxmd.com/read/28469415/current-developments-in-machine-learning-techniques-in-biological-data-mining
#5
EDITORIAL
Gerard G Dumancas, Indra Adrianto, Ghalib Bello, Mikhail Dozmorov
This supplement is intended to focus on the use of machine learning techniques to generate meaningful information on biological data. This supplement under Bioinformatics and Biology Insights aims to provide scientists and researchers working in this rapid and evolving field with online, open-access articles authored by leading international experts in this field. Advances in the field of biology have generated massive opportunities to allow the implementation of modern computational and statistical techniques...
2017: Bioinformatics and Biology Insights
https://www.readbyqxmd.com/read/28449114/neuro-symbolic-representation-learning-on-biological-knowledge-graphs
#6
Mona Alshahrani, Mohammed Asif Khan, Omar Maddouri, Akira R Kinjo, Núria Queralt-Rosinach, Robert Hoehndorf
Motivation: Biological data and knowledge bases increasingly rely on Semantic Web technologies and the use of knowledge graphs for data integration, retrieval and federated queries. In the past years, feature learning methods that are applicable to graph-structured data are becoming available, but have not yet widely been applied and evaluated on structured biological knowledge. Results: We develop a novel method for feature learning on biological knowledge graphs...
April 25, 2017: Bioinformatics
https://www.readbyqxmd.com/read/28444127/hla-class-i-binding-prediction-via-convolutional-neural-networks
#7
Yeeleng S Vang, Xiaohui Xie
Motivation: Many biological processes are governed by protein-ligand interactions. One such example is the recognition of self and nonself cells by the immune system. This immune response process is regulated by the major histocompatibility complex (MHC) protein which is encoded by the human leukocyte antigen (HLA) complex. Understanding the binding potential between MHC and peptides can lead to the design of more potent, peptide-based vaccines and immunotherapies for infectious autoimmune diseases...
April 21, 2017: Bioinformatics
https://www.readbyqxmd.com/read/28430949/capturing-non-local-interactions-by-long-short-term-memory-bidirectional-recurrent-neural-networks-for-improving-prediction-of-protein-secondary-structure-backbone-angles-contact-numbers-and-solvent-accessibility
#8
Rhys Heffernan, Yuedong Yang, Kuldip Paliwal, Yaoqi Zhou
Motivation: The accuracy of predicting protein local and global structural properties such as secondary structure and solvent accessible surface area has been stagnant for many years because of the challenge of accounting for non-local interactions between amino acid residues that are close in three-dimensional structural space but far from each other in their sequence positions. All existing machine-learning techniques relied on a sliding window of 10-20 amino acid residues to capture some "short to intermediate" non-local interactions...
April 18, 2017: Bioinformatics
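The sliding-window input described in the abstract above can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the function name, the padding symbol "X", and the example sequence are hypothetical. It shows why a fixed window limits what a predictor can see, since each residue is represented only by its neighbours within the window.

```python
from typing import List

PAD = "X"  # hypothetical padding symbol for positions past the sequence ends


def sliding_windows(sequence: str, window: int = 15) -> List[str]:
    """Return one fixed-width window per residue, centred on that residue."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    half = window // 2
    padded = PAD * half + sequence + PAD * half
    return [padded[i:i + window] for i in range(len(sequence))]


# Example with a short made-up sequence and a width of 5 for readability.
windows = sliding_windows("MKTAYIAKQR", window=5)
```

Residues that are close in three-dimensional space but more than half a window apart in sequence never appear in the same window, which is the limitation the abstract attributes to window-based methods.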
https://www.readbyqxmd.com/read/28398465/machine-learning-in-computational-biology-to-accelerate-high-throughput-protein-expression
#9
Anand Sastry, Jonathan Monk, Hanna Tegel, Mathias Uhlén, Bernhard O Palsson, Johan Rockberg, Elizabeth Brunk
Motivation: The Human Protein Atlas (HPA) enables the simultaneous characterization of thousands of proteins across various tissues to pinpoint their spatial location in the human body. This has been achieved through transcriptomics and high-throughput immunohistochemistry-based approaches, where over 40,000 unique human protein fragments have been expressed in E. coli. These datasets enable quantitative tracking of entire cellular proteomes and present new avenues for understanding molecular-level properties influencing expression and solubility...
April 7, 2017: Bioinformatics
https://www.readbyqxmd.com/read/28391206/multiple-swarm-ensembles-improving-the-predictive-power-and-robustness-of-predictive-models-and-its-use-in-computational-biology
#10
Pedro Alves, Shuang Liu, Daifeng Wang, Mark Gerstein
Machine learning is an integral part of computational biology, and has already shown its use in various applications, such as prognostic tests. In the last few years in the non-biological machine learning community, ensembling techniques have shown their power in data mining competitions such as the Netflix challenge; however, such methods have not found wide use in computational biology. In this work we endeavor to show how ensembling techniques can be applied to practical problems, including problems in the field of bioinformatics, and how they often outperform other machine learning techniques in both predictive power and robustness...
April 5, 2017: IEEE/ACM Transactions on Computational Biology and Bioinformatics
https://www.readbyqxmd.com/read/28361684/nearender-an-r-package-for-functional-interpretation-of-omics-data-via-network-enrichment-analysis
#11
Ashwini Jeggari, Andrey Alexeyenko
BACKGROUND: The statistical evaluation of pathway enrichment, i.e. of gene profiles' confluence to the pathway level, allows exploring molecular landscapes using functionally annotated gene sets. However, pathway scores can also be used as predictive features in machine learning. That requires, firstly, increasing statistical power and biological relevance via a network enrichment analysis (NEA) and, secondly, a fast and convenient procedure for rendering the original data into a space of pathway scores...
March 23, 2017: BMC Bioinformatics
https://www.readbyqxmd.com/read/28351701/extracting-features-from-protein-sequences-to-improve-deep-extreme-learning-machine-for-protein-fold-recognition
#12
Wisam Ibrahim, Mohammad Saniee Abadeh
Protein fold recognition is an important problem in bioinformatics to predict three-dimensional structure of a protein. One of the most challenging tasks in protein fold recognition problem is the extraction of efficient features from the amino-acid sequences to obtain better classifiers. In this paper, we have proposed six descriptors to extract features from protein sequences. These descriptors are applied in the first stage of a three-stage framework PCA-DELM-LDA to extract feature vectors from the amino-acid sequences...
March 27, 2017: Journal of Theoretical Biology
https://www.readbyqxmd.com/read/28341746/leveraging-sequence-based-faecal-microbial-community-survey-data-to-identify-a-composite-biomarker-for-colorectal-cancer
#13
Manasi S Shah, Todd Z DeSantis, Thomas Weinmaier, Paul J McMurdie, Julia L Cope, Adam Altrichter, Jose-Miguel Yamal, Emily B Hollister
OBJECTIVE: Colorectal cancer (CRC) is the second leading cause of cancer-associated mortality in the USA. The faecal microbiome may provide non-invasive biomarkers of CRC and indicate transition in the adenoma-carcinoma sequence. Re-analysing raw sequence and metadata from several studies uniformly, we sought to identify a composite and generalisable microbial marker for CRC. DESIGN: Raw 16S rRNA gene sequence data sets from nine studies were processed with two pipelines, (1) QIIME closed reference (QIIME-CR) or (2) a strain-specific method herein termed SS-UP (Strain Select, UPARSE bioinformatics pipeline)...
March 24, 2017: Gut
https://www.readbyqxmd.com/read/28335739/identification-of-long-non-coding-transcripts-with-feature-selection-a-comparative-study
#14
Giovanna M M Ventola, Teresa M R Noviello, Salvatore D'Aniello, Antonietta Spagnuolo, Michele Ceccarelli, Luigi Cerulo
BACKGROUND: The unveiling of long non-coding RNAs as important gene regulators in many biological contexts has increased the demand for efficient and robust computational methods to identify novel long non-coding RNAs from transcripts assembled with high throughput RNA-seq data. Several classes of sequence-based features have been proposed to distinguish between coding and non-coding transcripts. Among them, open reading frame, conservation scores, nucleotide arrangements, and RNA secondary structure have been used with success in literature to recognize intergenic long non-coding RNAs, a particular subclass of non-coding RNAs...
March 23, 2017: BMC Bioinformatics
https://www.readbyqxmd.com/read/28315224/an-overview-of-bioinformatics-tools-and-resources-in-allergy
#15
Zhiyan Fu, Jing Lin
The rapidly increasing number of characterized allergens has created huge demands for advanced information storage, retrieval, and analysis. Bioinformatics and machine learning approaches provide useful tools for the study of allergens and epitopes prediction, which greatly complement traditional laboratory techniques. The specific applications mainly include identification of B- and T-cell epitopes, and assessment of allergenicity and cross-reactivity. In order to facilitate the work of clinical and basic researchers who are not familiar with bioinformatics, we review in this chapter the most important databases, bioinformatic tools, and methods with relevance to the study of allergens...
2017: Methods in Molecular Biology
https://www.readbyqxmd.com/read/28296577/genome-wide-identification-and-characterization-of-small-rnas-in-rhodobacter-capsulatus-and-identification-of-small-rnas-affected-by-loss-of-the-response-regulator-ctra
#16
Marc P Grüll, Lourdes Peña-Castillo, Martin E Mulligan, Andrew S Lang
Small non-coding RNAs (sRNAs) are involved in the control of numerous cellular processes through various regulatory mechanisms, and in the past decade many studies have identified sRNAs in a multitude of bacterial species using RNA sequencing (RNA-seq). Here, we present the first genome-wide analysis of sRNA sequencing data in Rhodobacter capsulatus, a purple nonsulfur photosynthetic alphaproteobacterium. Using a recently developed bioinformatics approach, sRNA-Detect, we detected 422 putative sRNAs from R...
March 15, 2017: RNA Biology
https://www.readbyqxmd.com/read/28198674/sequence-specific-bias-correction-for-rna-seq-data-using-recurrent-neural-networks
#17
Yao-Zhong Zhang, Rui Yamaguchi, Seiya Imoto, Satoru Miyano
BACKGROUND: The recent success of deep learning techniques in machine learning and artificial intelligence has stimulated a great deal of interest among bioinformaticians, who now wish to bring the power of deep learning to bear on a host of bioinformatical problems. Deep learning is ideally suited for biological problems that require automatic or hierarchical feature representation for biological data when prior knowledge is limited. In this work, we address the sequence-specific bias correction problem for RNA-seq data using Recurrent Neural Networks (RNNs) to model nucleotide sequences without pre-determining sequence structures...
January 25, 2017: BMC Genomics
https://www.readbyqxmd.com/read/28179914/cell-cycle-and-cell-size-dependent-gene-expression-reveals-distinct-subpopulations-at-single-cell-level
#18
Soheila Dolatabadi, Julián Candia, Nina Akrap, Christoffer Vannas, Tajana Tesan Tomic, Wolfgang Losert, Göran Landberg, Pierre Åman, Anders Ståhlberg
Cell proliferation includes a series of events that is tightly regulated by several checkpoints and layers of control mechanisms. Most studies have been performed on large cell populations, but detailed understanding of cell dynamics and heterogeneity requires single-cell analysis. Here, we used quantitative real-time PCR, profiling the expression of 93 genes in single-cells from three different cell lines. Individual unsynchronized cells from three different cell lines were collected in different cell cycle phases (G0/G1 - S - G2/M) with variable cell sizes...
2017: Frontiers in Genetics
https://www.readbyqxmd.com/read/28157153/enhancing-the-biological-relevance-of-machine-learning-classifiers-for-reverse-vaccinology
#19
Ashley I Heinson, Yawwani Gunawardana, Bastiaan Moesker, Carmen C Denman Hume, Elena Vataga, Yper Hall, Elena Stylianou, Helen McShane, Ann Williams, Mahesan Niranjan, Christopher H Woelk
Reverse vaccinology (RV) is a bioinformatics approach that can predict antigens with protective potential from the protein coding genomes of bacterial pathogens for subunit vaccine design. RV has become firmly established following the development of the BEXSERO® vaccine against Neisseria meningitidis serogroup B. RV studies have begun to incorporate machine learning (ML) techniques to distinguish bacterial protective antigens (BPAs) from non-BPAs. This research contributes significantly to the RV field by using permutation analysis to demonstrate that a signal for protective antigens can be curated from published data...
February 1, 2017: International Journal of Molecular Sciences
https://www.readbyqxmd.com/read/28137713/hipred-an-integrative-approach-to-predicting-haploinsufficient-genes
#20
Hashem A Shihab, Mark F Rogers, Colin Campbell, Tom R Gaunt
MOTIVATION: A major cause of autosomal dominant disease is haploinsufficiency, whereby a single copy of a gene is not sufficient to maintain the normal function of the gene. A large proportion of existing methods for predicting haploinsufficiency incorporate biological networks, e.g. protein-protein interaction networks, that have recently been shown to introduce study bias. As a result, these methods tend to perform best on well studied genes, but underperform on less studied genes. The advent of large genome sequencing consortia, such as the 1,000 genomes project, NHLBI Exome Sequencing Project (ESP) and the Exome Aggregation Consortium (ExAC) creates an urgent need for unbiased haploinsufficiency prediction methods...
January 30, 2017: Bioinformatics

Search Tips

Use Boolean operators: AND/OR

diabetic AND foot
diabetes OR diabetic

Exclude a word using the 'minus' sign

Virchow -triad

Use Parentheses

water AND (cup OR glass)

Add an asterisk (*) at end of a word to include word stems

Neuro* will search for Neurology, Neuroscientist, Neurological, and so on

Use quotes to search for an exact phrase

"primary prevention of cancer"
(heart or cardiac or cardio*) AND arrest -"American Heart Association"