Annals of Applied Statistics

Gwenaël G R Leday, Mathisca C M de Gunst, Gino B Kpogbezan, Aad W van der Vaart, Wessel N van Wieringen, Mark A van de Wiel
Reconstructing a gene network from high-throughput molecular data is an important but challenging task, as the number of parameters to estimate is easily much larger than the sample size. A conventional remedy is to regularize or penalize the model likelihood. In network models, this is often done locally in the neighbourhood of each node or gene. However, estimation of the many regularization parameters is often difficult and can result in large statistical uncertainties. In this paper we propose to combine local regularization with global shrinkage of the regularization parameters to borrow strength between genes and improve inference...
March 2017: Annals of Applied Statistics
Ran Shi, Ying Guo
Human brains perform tasks via complex functional networks consisting of separated brain regions. A popular approach to characterize brain functional networks in fMRI studies is independent component analysis (ICA), which is a powerful method to reconstruct latent source signals from their linear mixtures. In many fMRI studies, an important goal is to investigate how brain functional networks change according to specific clinical and demographic variabilities. Existing ICA methods, however, cannot directly incorporate covariate effects in ICA decomposition...
December 2016: Annals of Applied Statistics
Kun Chen, Eric A Hoffman, Indu Seetharaman, Feiran Jiao, Ching-Long Lin, Kung-Sik Chan
The human lung airway is a complex inverted tree-like structure. Detailed airway measurements can be extracted from MDCT-scanned lung images, such as segmental wall thickness, airway diameter, parent-child branch angles, etc. The wealth of lung airway data provides a unique opportunity for advancing our understanding of the fundamental structure-function relationships within the lung. An important problem is to construct and identify important lung airway features in normal subjects and connect these to standardized pulmonary function test results such as FEV1%...
December 2016: Annals of Applied Statistics
Chi Song, Xiaoyi Min, Heping Zhang
The chromosome copy number variation (CNV) is the deviation of genomic regions from their normal copy number states, which may associate with many human diseases. Current genetic studies usually collect hundreds to thousands of samples to study the association between CNV and diseases. CNVs can be called by detecting the change-points in mean for sequences of array-based intensity measurements. Although multiple samples are of interest, the majority of the available CNV calling methods are single sample based...
December 2016: Annals of Applied Statistics
Belinda Phipson, Stanley Lee, Ian J Majewski, Warren S Alexander, Gordon K Smyth
One of the most common analysis tasks in genomic research is to identify genes that are differentially expressed (DE) between experimental conditions. Empirical Bayes (EB) statistical tests using moderated genewise variances have been very effective for this purpose, especially when the number of biological replicate samples is small. The EB procedures can however be heavily influenced by a small number of genes with very large or very small variances. This article improves the differential expression tests by robustifying the hyperparameter estimation procedure...
June 2016: Annals of Applied Statistics
Amanda A Koepke, Ira M Longini, M Elizabeth Halloran, Jon Wakefield, Vladimir N Minin
Despite seasonal cholera outbreaks in Bangladesh, little is known about the relationship between environmental conditions and cholera cases. We seek to develop a predictive model for cholera outbreaks in Bangladesh based on environmental predictors. To do this, we estimate the contribution of environmental variables, such as water depth and water temperature, to cholera outbreaks in the context of a disease transmission model. We implement a method which simultaneously accounts for disease dynamics and environmental variables in a Susceptible-Infected-Recovered-Susceptible (SIRS) model...
June 2016: Annals of Applied Statistics
Wanghuan Chu, Runze Li, Matthew Reimherr
Motivated by an empirical analysis of the Childhood Asthma Management Project, CAMP, we introduce a new screening procedure for varying coefficient models with ultrahigh dimensional longitudinal predictor variables. The performance of the proposed procedure is investigated via Monte Carlo simulation. Numerical comparisons indicate that it outperforms existing ones substantially, resulting in significant improvements in explained variability and prediction error. Applying these methods to CAMP, we are able to find a number of potentially important genetic mutations related to lung function, several of which exhibit interesting nonlinear patterns around puberty...
June 2016: Annals of Applied Statistics
Ailin Fan, Wenbin Lu, Rui Song
Variable selection for optimal treatment regimes in a clinical trial or an observational study is getting more attention. Most existing variable selection techniques focus on selecting variables that are important for prediction; therefore, some variables that are poor in prediction but critical for decision-making may be ignored. A qualitative interaction of a variable with treatment arises when the treatment effect changes direction as the value of this variable varies. The qualitative interaction indicates the importance of this variable for decision-making...
March 2016: Annals of Applied Statistics
Mengjie Chen, Haifan Lin, Hongyu Zhao
Histone modification is a vital epigenetic mechanism for transcriptional control in eukaryotes. High-throughput techniques have enabled whole-genome analysis of histone modifications in recent years. However, most studies assume one combination of histone modification invariantly translates to one transcriptional output regardless of local chromatin environment. In this study we hypothesize that the genome is organized into local domains that manifest similar enrichment pattern of histone modification, which leads to orchestrated regulation of expression of genes with relevant biological functions...
March 2016: Annals of Applied Statistics
Ying Liu, Yuanjia Wang, Yang Feng, Melanie M Wall
We propose a Multiple Imputation Random Lasso (MIRL) method to select important variables and to predict the outcome for an epidemiological study of Eating and Activity in Teens. In this study 80% of individuals have at least one variable missing. Therefore, using variable selection methods developed for complete data after listwise deletion substantially reduces prediction power. Recent work on prediction models in the presence of incomplete data cannot adequately account for large numbers of variables with arbitrary missing patterns...
March 2016: Annals of Applied Statistics
L E Wang, Pamela A Shaw, Hansie M Mathelier, Stephen E Kimmel, Benjamin French
The availability of data from electronic health records facilitates the development and evaluation of risk-prediction models, but estimation of prediction accuracy could be limited by outcome misclassification, which can arise if events are not captured. We evaluate the robustness of prediction accuracy summaries, obtained from receiver operating characteristic curves and risk-reclassification methods, if events are not captured (i.e., "false negatives"). We derive estimators for sensitivity and specificity if misclassification is independent of marker values...
March 2016: Annals of Applied Statistics
Brandon George, Thomas Denney, Himanshu Gupta, Louis Dell'Italia, Inmaculada Aban
Longitudinal imaging studies have both spatial and temporal correlation among the multiple outcome measurements from a subject. Statistical methods of analysis must properly account for this autocorrelation. In this work we discuss how a linear model with a separable parametric correlation structure could be used to analyze data from such a study. The goal of this paper is to provide an easily understood description of how such a model works and discuss how it can be applied to real data. Model assumptions are discussed and the process of selecting a working correlation structure is thoroughly discussed...
March 2016: Annals of Applied Statistics
Colin J Worby, Philip D O'Neill, Theodore Kypraios, Julie V Robotham, Daniela De Angelis, Edward J P Cartwright, Sharon J Peacock, Ben S Cooper
Whole genome sequencing of pathogens from multiple hosts in an epidemic offers the potential to investigate who infected whom with unparalleled resolution, potentially yielding important insights into disease dynamics and the impact of control measures. We considered disease outbreaks in a setting with dense genomic sampling, and formulated stochastic epidemic models to investigate person-to-person transmission, based on observed genomic and epidemiological data. We constructed models in which the genetic distance between sampled genotypes depends on the epidemiological relationship between the hosts...
March 2016: Annals of Applied Statistics
Ick Hoon Jin, Ying Yuan, Dipankar Bandyopadhyay
Research in dental caries generates data with two levels of hierarchy: that of a tooth overall and that of the different surfaces of the tooth. The outcomes often exhibit spatial referencing among neighboring teeth and surfaces, i.e., the disease status of a tooth or surface might be influenced by the status of a set of proximal teeth/surfaces. Assessments of dental caries (tooth decay) at the tooth level yield binary outcomes indicating the presence/absence of teeth, and trinary outcomes at the surface level indicating healthy, decayed, or filled surfaces...
2016: Annals of Applied Statistics
Paul Bendich, J S Marron, Ezra Miller, Alex Pieloch, Sean Skwerer
New representations of tree-structured data objects, using ideas from topological data analysis, enable improved statistical analyses of a population of brain artery trees. A number of representations of each data tree arise from persistence diagrams that quantify branching and looping of vessels at multiple scales. Novel approaches to the statistical analysis, through various summaries of the persistence diagrams, lead to heightened correlations with covariates such as age and sex, relative to earlier analyses of this data set...
2016: Annals of Applied Statistics
Laina D Mercer, Jon Wakefield, Athena Pantazis, Angelina M Lutambi, Honorati Masanja, Samuel Clark
Many people living in low and middle-income countries are not covered by civil registration and vital statistics systems. Consequently, a wide variety of other types of data including many household sample surveys are used to estimate health and population indicators. In this paper we combine data from sample surveys and demographic surveillance systems to produce small area estimates of child mortality through time. Small area estimates are necessary to understand geographical heterogeneity in health indicators when full-coverage vital statistics are not available...
December 2015: Annals of Applied Statistics
Lynne Steuerle Schofield
This paper represents a methodological-substantive synergy. A new model, the Mixed Effects Structural Equations (MESE) model which combines structural equations modeling and item response theory is introduced to attend to measurement error bias when using several latent variables as predictors in generalized linear models. The paper investigates racial and gender disparities in STEM retention in higher education. Using the MESE model with 1997 National Longitudinal Survey of Youth data, I find prior mathematics proficiency and personality have been previously underestimated in the STEM retention literature...
December 1, 2015: Annals of Applied Statistics
Eunjee Lee, Hongtu Zhu, Dehan Kong, Yalin Wang, Kelly Sullivan Giovanello, Joseph G Ibrahim
The aim of this paper is to develop a Bayesian functional linear Cox regression model (BFLCRM) with both functional and scalar covariates. This new development is motivated by establishing the likelihood of conversion to Alzheimer's disease (AD) in 346 patients with mild cognitive impairment (MCI) enrolled in the Alzheimer's Disease Neuroimaging Initiative 1 (ADNI-1) and the early markers of conversion. These 346 MCI patients were followed over 48 months, with 161 MCI participants progressing to AD at 48 months...
December 2015: Annals of Applied Statistics
Luke B Smith, Montserrat Fuentes, Penny Gordon-Larsen, Brian J Reich
Cardiometabolic diseases have substantially increased in China in the past 20 years and blood pressure is a primary modifiable risk factor. Using data from the China Health and Nutrition Survey we examine blood pressure trends in China from 1991 to 2009, with a concentration on age cohorts and urbanicity. Very large values of blood pressure are of interest, so we model the conditional quantile functions of systolic and diastolic blood pressure. This allows the covariate effects in the middle of the distribution to vary from those in the upper tail, the focal point of our analysis...
September 2015: Annals of Applied Statistics
Peter D Hoff
A fundamental aspect of relational data, such as from a social network, is the possibility of dependence among the relations. In particular, the relations between members of one pair of nodes may have an effect on the relations between members of another pair. This article develops a type of regression model to estimate such effects in the context of longitudinal and multivariate relational data, or other data that can be represented in the form of a tensor. The model is based on a general multilinear tensor regression model, a special case of which is a tensor autoregression model in which the tensor of relations at one time point are parsimoniously regressed on relations from previous time points...
September 2015: Annals of Applied Statistics