Statistical Applications in Genetics and Molecular Biology

Nimisha Chaturvedi, Renée X de Menezes, Jelle J Goeman, Wessel van Wieringen
Integrative analysis of copy number and gene expression data can help in understanding the cis and trans effect of copy number aberrations on transcription levels of genes involved in a pathway. To analyse how these copy number mediated gene-gene interactions differ between groups of samples we propose a new method, named dNET. Our method uses ridge regression to model the network topology involving one gene's expression level, its gene dosage and the expression levels of other genes in the network. The interaction parameters are estimated by fitting the model per gene for all samples together...
July 31, 2018: Statistical Applications in Genetics and Molecular Biology
Royi Jacobovic
The prediction of cancer prognosis and metastatic potential immediately after the initial diagnosis is a major challenge in current clinical research. The relevance of such a signature is clear, as it will free many patients from the agony and toxic side-effects associated with the adjuvant chemotherapy automatically and sometimes carelessly prescribed to them. Motivated by this issue, several previous works presented a Bayesian model which led to the following conclusion: thousands of samples are needed to generate a robust gene list for predicting outcome...
July 14, 2018: Statistical Applications in Genetics and Molecular Biology
Naveen K Bansal, Mehdi Maadooliat, Steven J Schrodi
We consider a multiple hypotheses problem with directional alternatives in a decision theoretic framework. We obtain an empirical Bayes rule subject to a constraint on mixed directional false discovery rate (mdFDR≤α) under the semiparametric setting where the distribution of the test statistic is parametric, but the prior distribution is nonparametric. We propose separate priors for the left tail and right tail alternatives, as may be required for many applications. The proposed Bayes rule is compared through simulation against rules proposed by Benjamini and Yekutieli and Efron...
July 5, 2018: Statistical Applications in Genetics and Molecular Biology
Hsin-Hsiung Huang, Shuai Hao, Saul Alarcon, Jie Yang
In this paper, we propose a statistical classification method based on discriminant analysis using the first and second moments of positions of each nucleotide of the genome sequences as features, and compare its performance with other classification methods as well as natural vector for comparative genomic analysis. We examine the normality of the proposed features. The statistical classification models used include linear discriminant analysis, quadratic discriminant analysis, diagonal linear discriminant analysis, k-nearest-neighbor classifier, logistic regression, support vector machines, and classification trees...
June 30, 2018: Statistical Applications in Genetics and Molecular Biology
Gustavo H Esteves, Luiz F L Reis
MOTIVATION: Gene expression data analysis is of great importance for modern molecular biology, given our ability to measure the expression profiles of thousands of genes, which enables studies rooted in systems biology. In this work, we propose a simple statistical model for measuring the activation of gene regulatory networks, instead of the traditional gene co-expression networks. RESULTS: We present the mathematical construction of a statistical procedure for testing hypotheses regarding gene regulatory network activation...
June 13, 2018: Statistical Applications in Genetics and Molecular Biology
Jere Koskela
We introduce a low dimensional function of the site frequency spectrum that is tailor-made for distinguishing coalescent models with multiple mergers from Kingman coalescent models with population growth, and use this function to construct a hypothesis test between these model classes. The null and alternative sampling distributions of the statistic are intractable, but its low dimensionality renders them amenable to Monte Carlo estimation. We construct kernel density estimates of the sampling distributions based on simulated data, and show that the resulting hypothesis test dramatically improves on the statistical power of a current state-of-the-art method...
June 13, 2018: Statistical Applications in Genetics and Molecular Biology
Berit Lindum Waltoft, Asger Hobolth
Changes in population size are useful for understanding the evolutionary history of a species. Genetic variation within a species can be summarized by the site frequency spectrum (SFS). For a sample of size n, the SFS is a vector of length n - 1 where entry i is the number of sites where the mutant base appears i times and the ancestral base appears n - i times. We present a new method, CubSFS, for estimating the changes in population size of a panmictic population from an observed SFS. First, we provide a straightforward proof for the expression of the expected site frequency spectrum depending only on the population size...
June 11, 2018: Statistical Applications in Genetics and Molecular Biology
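The SFS definition quoted in the abstract above can be illustrated with a minimal sketch. It assumes the data are given as a list of sites, each holding one 0 (ancestral) or 1 (mutant) call per sampled sequence. The function name `site_frequency_spectrum` and the toy data are hypothetical and are not taken from CubSFS.

```python
from collections import Counter


def site_frequency_spectrum(sites):
    """Return the SFS as a list of length n - 1.

    `sites` is a list of sites; each site is a sequence of n calls,
    0 for the ancestral base and 1 for the mutant base (assumed encoding).
    Entry i - 1 of the result counts the sites where the mutant base
    appears exactly i times, for i = 1, ..., n - 1.
    """
    if not sites:
        return []
    n = len(sites[0])
    if any(len(site) != n for site in sites):
        raise ValueError("every site must have one call per sample")
    # Tally how many sites carry each mutant count.
    counts = Counter(sum(site) for site in sites)
    # Mutant counts 0 and n (monomorphic sites) are excluded by the definition.
    return [counts.get(i, 0) for i in range(1, n)]


# Toy example with n = 4 samples and 5 sites (made-up data).
toy_sites = [
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 0, 0],  # monomorphic site, not counted
]
print(site_frequency_spectrum(toy_sites))  # [2, 1, 1]
```

Here two sites carry the mutant base once, one site carries it twice and one carries it three times, giving a vector of length n - 1 = 3.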
Jeffrey J Gory, Radu Herbei, Laura S Kubatko
The increasing availability of population-level allele frequency data across one or more related populations necessitates the development of methods that can efficiently estimate population genetics parameters, such as the strength of selection acting on the population(s), from such data. Existing methods for this problem in the setting of the Wright-Fisher diffusion model are primarily likelihood-based, and rely on numerical approximation for likelihood computation and on bootstrapping for assessment of variability in the resulting estimates, requiring extensive computation...
June 6, 2018: Statistical Applications in Genetics and Molecular Biology
Nele Cosemans, Peter Claes, Nathalie Brison, Joris Robert Vermeesch, Hilde Peeters
Arrays based on single nucleotide polymorphisms (SNPs) have been successful for the large scale discovery of copy number variants (CNVs). However, current CNV calling algorithms still have limitations in detecting CNVs with high specificity and sensitivity, especially in case of small (<100 kb) CNVs. Therefore, this study presents a simple statistical analysis to evaluate CNV calls from SNP arrays in order to improve the noise-robustness of existing CNV calling algorithms. The proposed approach estimates local noise of log R ratios and returns the probability that a certain observation is different from this log R ratio noise level...
April 28, 2018: Statistical Applications in Genetics and Molecular Biology
Jialin Zhang, Chen Chen
Zhang, Z. and Zheng, L. (2015): "A mutual information estimator with exponentially decaying bias," Stat. Appl. Genet. Mol. Biol., 14, 243-252, proposed a nonparametric estimator of mutual information developed in entropic perspective, and demonstrated that it has much smaller bias than the plugin estimator yet with the same asymptotic normality under certain conditions. However it is incorrectly suggested in their article that the asymptotic normality could be used for testing independence between two random elements on a joint alphabet...
March 30, 2018: Statistical Applications in Genetics and Molecular Biology
Jean-Eudes Dazard, Hemant Ishwaran, Rajeev Mehlotra, Aaron Weinberg, Peter Zimmerman
Unraveling interactions among variables such as genetic, clinical, demographic and environmental factors is essential to understand the development of common and complex diseases. To increase the power to detect such variable interactions associated with clinical time-to-event outcomes, we borrowed established concepts from random survival forest (RSF) models. We introduce a novel RSF-based pairwise interaction estimator and derive a randomization method with bootstrap confidence intervals for inferring interaction significance...
February 17, 2018: Statistical Applications in Genetics and Molecular Biology
Cen Wu, Ping-Shou Zhong, Yuehua Cui
Gene-environment (G×E) interaction plays a pivotal role in understanding the genetic basis of complex disease. When environmental factors are measured continuously, one can assess the genetic sensitivity over different environmental conditions on a disease trait. Motivated by the increasing awareness of gene set based association analysis over single variant based approaches, we propose an additive varying-coefficient model to jointly model variants in a genetic system. The model allows us to examine how variants in a gene set are moderated by an environmental factor to affect a disease phenotype...
February 8, 2018: Statistical Applications in Genetics and Molecular Biology
Jiehuan Sun, Jose D Herazo-Maya, Xiu Huang, Naftali Kaminski, Hongyu Zhao
Longitudinal gene expression profiles of subjects are collected in some clinical studies to monitor disease progression and understand disease etiology. The identification of gene sets that have coordinated changes with relevant clinical outcomes over time from these data could provide significant insights into the molecular basis of disease progression and lead to better treatments. In this article, we propose a Distance-Correlation based Gene Set Analysis (dcGSA) method for longitudinal gene expression data...
February 5, 2018: Statistical Applications in Genetics and Molecular Biology
Marco Marozzi
In biomedical research, multiple endpoints are commonly analyzed in "omics" fields like genomics, proteomics and metabolomics. Traditional methods designed for low-dimensional data either perform poorly or are not applicable when analyzing high-dimensional data whose dimension is generally similar to, or even much larger than, the number of subjects. The complex biochemical interplay between hundreds (or thousands) of endpoints is reflected by complex dependence relations. The aim of the paper is to propose tests that are very suitable for analyzing omics data because they do not require the normality assumption, are powerful also for small sample sizes, in the presence of complex dependence relations among endpoints, and when the number of endpoints is much larger than the number of subjects...
January 30, 2018: Statistical Applications in Genetics and Molecular Biology
Colleen Nooney, Stuart Barber, Arief Gusnanto, Walter R Gilks
We introduce a new method to test efficiently for cospeciation in tritrophic systems. Our method utilises an analogy with electrical circuit theory to reduce higher order systems into bitrophic data sets that retain the information of the original system. We use a sophisticated permutation scheme that weights interactions between two trophic layers based on their connection to the third layer in the system. Our method has several advantages compared to the method of Mramba et al. [Mramba, L. K., S. Barber, K...
November 27, 2017: Statistical Applications in Genetics and Molecular Biology
Christopher McMahan, James Baurley, William Bridges, Chase Joyner, Muhamad Fitra Kacamarga, Robert Lund, Carissa Pardamean, Bens Pardamean
Genomic studies of plants often seek to identify genetic factors associated with desirable traits. The process of evaluating genetic markers one by one (i.e. a marginal analysis) may not identify important polygenic and environmental effects. Further, confounding due to growing conditions/factors and genetic similarities among plant varieties may influence conclusions. When developing new plant varieties to optimize yield or thrive in future adverse conditions (e.g. flood, drought), scientists seek a complete understanding of how the factors influence desirable traits...
November 27, 2017: Statistical Applications in Genetics and Molecular Biology
Johanna Bertl, Gregory Ewing, Carolin Kosiol, Andreas Futschik
In many population genetic problems, parameter estimation is obstructed by an intractable likelihood function. Therefore, approximate estimation methods have been developed, and with growing computational power, sampling-based methods have become popular. However, these methods, such as Approximate Bayesian Computation (ABC), can be inefficient in high-dimensional problems. This led to the development of more sophisticated iterative estimation methods like particle filters. Here, we propose an alternative approach that is based on stochastic approximation...
November 27, 2017: Statistical Applications in Genetics and Molecular Biology
Panagiotis Papastamoulis, Magnus Rattray
Next generation sequencing allows the identification of genes consisting of differentially expressed transcripts, a term which usually refers to changes in the overall expression level. A specific type of differential expression is differential transcript usage (DTU), which targets changes in the relative within-gene expression of a transcript. The contribution of this paper is to: (a) extend cjBitSeq, a previously introduced Bayesian model originally designed for identifying changes in overall expression levels, to the DTU context and (b) propose a Bayesian version of DRIMSeq, a frequentist model for inferring DTU...
November 27, 2017: Statistical Applications in Genetics and Molecular Biology
Elena Szefer, Donghuan Lu, Farouk Nathoo, Mirza Faisal Beg, Jinko Graham
Using publicly-available data from the Alzheimer's Disease Neuroimaging Initiative, we investigate the joint association between single-nucleotide polymorphisms (SNPs) in previously established linkage regions for Alzheimer's disease (AD) and rates of decline in brain structure. In an initial, discovery stage of analysis, we applied a weighted RV test to assess the association between 75,845 SNPs in the Alzgene linkage regions and rates of change in structural MRI measurements for 56 brain regions affected by AD, in 632 subjects...
November 27, 2017: Statistical Applications in Genetics and Molecular Biology
Ekua Kotoka, Megan Orr
RNA-Seq is a developing technology for generating gene expression data by directly sequencing mRNA molecules in a sample. RNA-Seq data consist of counts of reads mapped to a particular gene, which are often used to identify differentially expressed (DE) genes. A common statistical method used to analyze RNA-Seq data is Significance Analysis of Microarrays with emphasis on RNA-Seq data (SAMseq). SAMseq is a nonparametric method that uses a resampling technique to account for differences in sequencing depths when identifying DE genes...
November 27, 2017: Statistical Applications in Genetics and Molecular Biology