Read by QxMD

machine learning and snp-snp

Suneetha Uppu, Aneesh Krishna, Raj Gopalan
In this era of genome-wide association studies (GWAS), the quest to understand the genetic architecture of complex diseases is intensifying faster than ever before. The development of high-throughput genotyping and next-generation sequencing technologies enables genetic epidemiological analysis of large-scale data. These advances have led to the identification of a number of single nucleotide polymorphisms (SNPs) responsible for disease susceptibility. The interactions between SNPs associated with complex diseases are increasingly being explored in the current literature...
December 2, 2016: IEEE/ACM Transactions on Computational Biology and Bioinformatics
J Oh, S Kerns, H Ostrer, B Rosenstein, J Deasy
PURPOSE: We investigated whether integration of machine learning and bioinformatics techniques on genome-wide association study (GWAS) data can improve the performance of predictive models in predicting the risk of developing radiation-induced late rectal bleeding and erectile dysfunction in prostate cancer patients. METHODS: We analyzed a GWAS dataset generated from 385 prostate cancer patients treated with radiotherapy. Using genotype information from these patients, we designed a machine learning-based predictive model of late radiation-induced toxicities: rectal bleeding and erectile dysfunction...
June 2016: Medical Physics
Samantha Cheng, Angeline S Andrew, Peter C Andrews, Jason H Moore
BACKGROUND: Bladder cancer is a common disease with a complex etiology that is likely due to many different genetic and environmental factors. The goal of this study was to embrace this complexity using a bioinformatics analysis pipeline designed to use machine learning to measure synergistic interactions between single nucleotide polymorphisms (SNPs) in two genome-wide association studies (GWAS) and then to assess their enrichment within functional groups defined by Gene Ontology. The significance of the results was evaluated using permutation testing, and those results that replicated between the two GWAS data sets were reported...
2016: BioData Mining
Takamitsu Watanabe, Takeshi Otowa, Osamu Abe, Hitoshi Kuwabara, Yuta Aoki, Tatsunobu Natsubori, Hidemasa Takao, Chihiro Kakiuchi, Kenji Kondo, Masashi Ikeda, Nakao Iwata, Kiyoto Kasai, Tsukasa Sasaki, Hidenori Yamasue
Oxytocin appears beneficial for autism spectrum disorder (ASD), and more than 20 single-nucleotide polymorphisms (SNPs) in the oxytocin receptor (OXTR) are relevant to ASD. However, neither the biological functions of OXTR SNPs in ASD nor the critical OXTR SNPs that determine oxytocin's effects on ASD are known. Here, using a machine-learning algorithm that was designed to evaluate the collective effects of multiple SNPs and automatically identify the most informative SNPs, we examined relationships between 27 representative OXTR SNPs and six types of behavioral/neural response to oxytocin in ASD individuals...
October 19, 2016: Social Cognitive and Affective Neuroscience
Robert J MacInnis, Daniel F Schmidt, Enes Makalic, Gianluca Severi, Liesel M FitzGerald, Matthias Reumann, Miroslaw K Kapuscinski, Adam Kowalczyk, Zeyu Zhou, Benjamin Goudey, Guoqi Qian, Quang M Bui, Daniel J Park, Adam Freeman, Melissa C Southey, Ali Amin Al Olama, Zsofia Kote-Jarai, Rosalind A Eeles, John L Hopper, Graham G Giles
BACKGROUND: We have developed a genome-wide association study analysis method called DEPTH (DEPendency of association on the number of Top Hits) to identify genomic regions potentially associated with disease by considering overlapping groups of contiguous markers (e.g., SNPs) across the genome. DEPTH is a machine learning algorithm for feature ranking of ultra-high dimensional datasets, built from well-established statistical tools such as bootstrapping, penalized regression, and decision trees...
December 2016: Cancer Epidemiology, Biomarkers & Prevention
Qianchuan He, Tianxi Cai, Yang Liu, Ni Zhao, Quaker E Harmon, Lynn M Almli, Elisabeth B Binder, Stephanie M Engel, Kerry J Ressler, Karen N Conneely, Xihong Lin, Michael C Wu
Kernel machine learning methods, such as the SNP-set kernel association test (SKAT), have been widely used to test associations between traits and genetic polymorphisms. In contrast to traditional single-SNP analysis methods, these methods are designed to examine the joint effect of a set of related SNPs (such as a group of SNPs within a gene or a pathway) and are able to identify sets of SNPs that are associated with the trait of interest. However, as with many multi-SNP testing approaches, kernel machine testing can draw conclusions only at the SNP-set level and does not directly indicate which SNP(s) in the identified set are actually driving the associations...
December 2016: Genetic Epidemiology
Robersy Sanchez, Sally A Mackenzie
Cytosine DNA methylation (CDM) is a highly abundant, heritable but reversible chemical modification to the genome. Herein, a machine learning approach was applied to analyze the accumulation of epigenetic marks in methylomes of 152 ecotypes and 85 silencing mutants of Arabidopsis thaliana. In an information-thermodynamics framework, two measurements were used: (1) the amount of information gained/lost with the CDM changes, I_R, and (2) the uncertainty of not observing a SNP, L_CR. We hypothesize that epigenetic marks are chromosomal footprints accounting for different ontogenetic and phylogenetic histories of individual populations...
June 17, 2016: International Journal of Molecular Sciences
A Dix, S Vlaic, R Guthke, J Linde
In systems biology, researchers aim to understand complex biological systems as a whole, which is often achieved by mathematical modelling and the analyses of high-throughput data. In this review, we give an overview of medical applications of systems biology approaches with special focus on host-pathogen interactions. After introducing general ideas of systems biology, we focus on (1) the detection of putative biomarkers for improved diagnosis and support of therapeutic decisions, (2) network modelling for the identification of regulatory interactions between cellular molecules to reveal putative drug targets and (3) module discovery for the detection of phenotype-specific modules in molecular interaction networks...
July 2016: Clinical Microbiology and Infection
Isaiah Tolo, Jonathan C Thomas, Rebecca S B Fischer, Eric L Brown, Barry M Gray, D Ashley Robinson
Staphylococcus epidermidis is a ubiquitous colonizer of human skin and a common cause of medical device-associated infections. The extent to which the population genetic structure of S. epidermidis distinguishes commensal from pathogenic isolates is unclear. Previously, Bayesian clustering of 437 multilocus sequence types (STs) in the international database revealed a population structure of six genetic clusters (GCs) that may reflect the species' ecology. Here, we first verified the presence of six GCs, including two (GC3 and GC5) with significant admixture, in an updated database of 578 STs...
July 2016: Journal of Clinical Microbiology
M M Judge, J F Kearney, M C McClure, R D Sleator, D P Berry
The objective of this study was to develop, using alternative algorithms, low-density SNP genotyping panels (384 to 12,000 SNP), which can be accurately imputed to higher-density panels across independent cattle populations. Single nucleotide polymorphisms were selected based on genomic characteristics (i.e., linkage disequilibrium [LD], minor allele frequency [MAF], and genomic distance) in a population of 1,267 Holstein-Friesian animals genotyped on the Illumina Bovine50 Beadchip (54,001 SNP). Single nucleotide polymorphism selection methods included 1) random; 2) equidistant location; 3) combination of SNP MAF and LD structure while maintaining relatively equal genomic distance between adjacent SNP; 4) a combination of high MAF, genomic distance between selected and candidate SNP, and correlation between genotypes of selected and candidate SNP; and 5) a machine learning algorithm...
March 2016: Journal of Animal Science
Jing Li, James D Malley, Angeline S Andrew, Margaret R Karagas, Jason H Moore
BACKGROUND: Identifying gene-gene interactions is essential to understand disease susceptibility and to detect genetic architectures underlying complex diseases. Here, we aimed to develop a permutation-based methodology relying on a machine learning method, random forest (RF), to detect gene-gene interactions. Our approach, called permuted random forest (pRF), identified the top interacting single nucleotide polymorphism (SNP) pairs by estimating how much the power of a random forest classification model is influenced by removing pairwise interactions...
2016: BioData Mining
Andrew Dahl, Valentina Iotchkova, Amelie Baud, Åsa Johansson, Ulf Gyllensten, Nicole Soranzo, Richard Mott, Andreas Kranis, Jonathan Marchini
Genetic association studies have yielded a wealth of biological discoveries. However, these studies have mostly analyzed one trait and one SNP at a time, thus failing to capture the underlying complexity of the data sets. Joint genotype-phenotype analyses of complex, high-dimensional data sets represent an important way to move beyond simple genome-wide association studies (GWAS) with great potential. The move to high-dimensional phenotypes will raise many new statistical problems. Here we address the central issue of missing phenotypes in studies with any level of relatedness between samples...
April 2016: Nature Genetics
Joanna Peloquin, Gautam Goel, Hailiang Huang, Talin Haritunians, Ryan Sartor, Mark Daly, Rodney Newberry, Dermot McGovern, Sergio Lira, Ramnik Xavier
BACKGROUND: Genome-wide association studies have linked single nucleotide polymorphisms (SNPs) to risk of inflammatory bowel disease (IBD). Yet, the majority of IBD-associated risk SNPs tag non-coding regions of the genome, with more than 1000 genes encoded within the risk loci. In addition to ongoing fine mapping of risk loci and exome sequencing studies, the study of gene expression and characterization of expression quantitative trait loci (eQTL) help to refine candidate genes in risk loci...
March 2016: Inflammatory Bowel Diseases
Silke Szymczak, Emily Holzinger, Abhijit Dasgupta, James D Malley, Anne M Molloy, James L Mills, Lawrence C Brody, Dwight Stambolian, Joan E Bailey-Wilson
BACKGROUND: Machine learning methods and in particular random forests (RFs) are a promising alternative to standard single SNP analyses in genome-wide association studies (GWAS). RFs provide variable importance measures (VIMs) to rank SNPs according to their predictive power. However, in contrast to the established genome-wide significance threshold, no clear criteria exist to determine how many SNPs should be selected for downstream analyses. RESULTS: We propose a new variable selection approach, recurrent relative variable importance measure (r2VIM)...
2016: BioData Mining
Xiaoyong Pan, Kai Xiong
Recently, circular RNA (circularRNA) has been discovered as an increasingly important type of long non-coding RNA (lncRNA), playing an important role in gene regulation, such as functioning as miRNA sponges. It is therefore very promising to identify circularRNA transcripts from de novo assembled transcripts obtained by high-throughput sequencing, such as RNA-seq data. In this study, we presented a machine learning approach, named PredcircRNA, focused on distinguishing circularRNA from other lncRNAs using multiple kernel learning...
August 2015: Molecular BioSystems
Fei Lu, Maria C Romay, Jeffrey C Glaubitz, Peter J Bradbury, Robert J Elshire, Tianyu Wang, Yu Li, Yongxiang Li, Kassa Semagn, Xuecai Zhang, Alvaro G Hernandez, Mark A Mikel, Ilya Soifer, Omer Barad, Edward S Buckler
In addition to single-nucleotide polymorphisms, structural variation is abundant in many plant genomes. The structural variation across a species can be represented by a 'pan-genome', which is essential to fully understand the genetic control of phenotypes. However, the pan-genome's complexity hinders its accurate assembly via sequence alignment. Here we demonstrate an approach to facilitate pan-genome construction in maize. By performing 18 trillion association tests we map 26 million tags generated by reduced representation sequencing of 14,129 maize inbred lines...
2015: Nature Communications
Li Li, Yi Xiong, Zhuo-Yu Zhang, Quan Guo, Qin Xu, Hien-Haw Liow, Yong-Hong Zhang, Dong-Qing Wei
Single nucleotide polymorphisms (SNPs) are the most common form of mutation in the human cytochrome P450 enzyme family, and have the potential to cause different drug responses or specific diseases in individual patients. Here, based on machine learning technology, we aim to explore an effective set of sequence-based features for improving prediction of SNPs by using support vector machine algorithms. The features are derived from the target residues and flanking protein sequences, such as amino acid types, sequence composition, physicochemical properties, position-specific scoring matrix, phylogenetic entropy and the number of possible codons of target residues...
March 2015: Interdisciplinary Sciences, Computational Life Sciences
Alexander Kautzky, Pia Baldinger, Daniel Souery, Stuart Montgomery, Julien Mendlewicz, Joseph Zohar, Alessandro Serretti, Rupert Lanzenberger, Siegfried Kasper
For over a decade, the European Group for the Study of Resistant Depression (GSRD) has examined single nucleotide polymorphisms (SNPs) and clinical parameters with regard to treatment outcome. However, an interaction-based model combining these factors has not yet been established. Given the small effects of individual SNPs, a model investigating the interactive role of SNPs and clinical variables in treatment-resistant depression (TRD) seems promising. Thus, 225 patients featured in previous work of the GSRD were enrolled in this investigation...
April 2015: European Neuropsychopharmacology: the Journal of the European College of Neuropsychopharmacology
Thanh-Tung Nguyen, Joshua Huang, Qingyao Wu, Thuy Nguyen, Mark Li
BACKGROUND: Single-nucleotide polymorphism (SNP) selection and identification are the most important tasks in genome-wide association data analysis. The problem is difficult because genome-wide association data are very high dimensional and a large portion of the SNPs in the data are irrelevant to the disease. Advanced machine learning methods have been successfully used in genome-wide association studies (GWAS) for identification of genetic variants that have relatively large effects in some common, complex diseases...
2015: BMC Genomics
