Annals of Applied Statistics

Chi Song, Xiaoyi Min, Heping Zhang
Chromosome copy number variation (CNV) is the deviation of genomic regions from their normal copy number states, which may be associated with many human diseases. Current genetic studies usually collect hundreds to thousands of samples to study the association between CNV and diseases. CNVs can be called by detecting the change-points in mean for sequences of array-based intensity measurements. Although multiple samples are of interest, the majority of the available CNV calling methods are single-sample based...
December 2016: Annals of Applied Statistics
Amanda A Koepke, Ira M Longini, M Elizabeth Halloran, Jon Wakefield, Vladimir N Minin
Despite seasonal cholera outbreaks in Bangladesh, little is known about the relationship between environmental conditions and cholera cases. We seek to develop a predictive model for cholera outbreaks in Bangladesh based on environmental predictors. To do this, we estimate the contribution of environmental variables, such as water depth and water temperature, to cholera outbreaks in the context of a disease transmission model. We implement a method which simultaneously accounts for disease dynamics and environmental variables in a Susceptible-Infected-Recovered-Susceptible (SIRS) model...
June 2016: Annals of Applied Statistics
Wanghuan Chu, Runze Li, Matthew Reimherr
Motivated by an empirical analysis of the Childhood Asthma Management Project, CAMP, we introduce a new screening procedure for varying coefficient models with ultrahigh dimensional longitudinal predictor variables. The performance of the proposed procedure is investigated via Monte Carlo simulation. Numerical comparisons indicate that it outperforms existing ones substantially, resulting in significant improvements in explained variability and prediction error. Applying these methods to CAMP, we are able to find a number of potentially important genetic mutations related to lung function, several of which exhibit interesting nonlinear patterns around puberty...
June 2016: Annals of Applied Statistics
Ailin Fan, Wenbin Lu, Rui Song
Variable selection for optimal treatment regimes in a clinical trial or an observational study is receiving increasing attention. Most existing variable selection techniques focus on selecting variables that are important for prediction; therefore, some variables that are poor in prediction but critical for decision-making may be ignored. A qualitative interaction of a variable with treatment arises when the treatment effect changes direction as the value of this variable varies. The qualitative interaction indicates the importance of this variable for decision-making...
March 2016: Annals of Applied Statistics
Mengjie Chen, Haifan Lin, Hongyu Zhao
Histone modification is a vital epigenetic mechanism for transcriptional control in eukaryotes. High-throughput techniques have enabled whole-genome analysis of histone modifications in recent years. However, most studies assume that one combination of histone modifications invariably translates to one transcriptional output regardless of the local chromatin environment. In this study we hypothesize that the genome is organized into local domains that manifest similar enrichment patterns of histone modification, which leads to orchestrated regulation of expression of genes with relevant biological functions...
March 2016: Annals of Applied Statistics
Ying Liu, Yuanjia Wang, Yang Feng, Melanie M Wall
We propose a Multiple Imputation Random Lasso (MIRL) method to select important variables and to predict the outcome for an epidemiological study of Eating and Activity in Teens. In this study 80% of individuals have at least one variable missing. Therefore, using variable selection methods developed for complete data after listwise deletion substantially reduces prediction power. Recent work on prediction models in the presence of incomplete data cannot adequately account for large numbers of variables with arbitrary missing patterns...
March 2016: Annals of Applied Statistics
L E Wang, Pamela A Shaw, Hansie M Mathelier, Stephen E Kimmel, Benjamin French
The availability of data from electronic health records facilitates the development and evaluation of risk-prediction models, but estimation of prediction accuracy could be limited by outcome misclassification, which can arise if events are not captured. We evaluate the robustness of prediction accuracy summaries, obtained from receiver operating characteristic curves and risk-reclassification methods, if events are not captured (i.e., "false negatives"). We derive estimators for sensitivity and specificity if misclassification is independent of marker values...
March 2016: Annals of Applied Statistics
Brandon George, Thomas Denney, Himanshu Gupta, Louis Dell'Italia, Inmaculada Aban
Longitudinal imaging studies have both spatial and temporal correlation among the multiple outcome measurements from a subject. Statistical methods of analysis must properly account for this autocorrelation. In this work we discuss how a linear model with a separable parametric correlation structure could be used to analyze data from such a study. The goal of this paper is to provide an easily understood description of how such a model works and to discuss how it can be applied to real data. Model assumptions are discussed, as is the process of selecting a working correlation structure...
March 2016: Annals of Applied Statistics
Colin J Worby, Philip D O'Neill, Theodore Kypraios, Julie V Robotham, Daniela De Angelis, Edward J P Cartwright, Sharon J Peacock, Ben S Cooper
Whole genome sequencing of pathogens from multiple hosts in an epidemic offers the potential to investigate who infected whom with unparalleled resolution, potentially yielding important insights into disease dynamics and the impact of control measures. We considered disease outbreaks in a setting with dense genomic sampling, and formulated stochastic epidemic models to investigate person-to-person transmission, based on observed genomic and epidemiological data. We constructed models in which the genetic distance between sampled genotypes depends on the epidemiological relationship between the hosts...
March 2016: Annals of Applied Statistics
Ick Hoon Jin, Ying Yuan, Dipankar Bandyopadhyay
Research in dental caries generates data with two levels of hierarchy: that of a tooth overall and that of the different surfaces of the tooth. The outcomes often exhibit spatial referencing among neighboring teeth and surfaces, i.e., the disease status of a tooth or surface might be influenced by the status of a set of proximal teeth/surfaces. Assessments of dental caries (tooth decay) at the tooth level yield binary outcomes indicating the presence/absence of teeth, and trinary outcomes at the surface level indicating healthy, decayed, or filled surfaces...
2016: Annals of Applied Statistics
Paul Bendich, J S Marron, Ezra Miller, Alex Pieloch, Sean Skwerer
New representations of tree-structured data objects, using ideas from topological data analysis, enable improved statistical analyses of a population of brain artery trees. A number of representations of each data tree arise from persistence diagrams that quantify branching and looping of vessels at multiple scales. Novel approaches to the statistical analysis, through various summaries of the persistence diagrams, lead to heightened correlations with covariates such as age and sex, relative to earlier analyses of this data set...
2016: Annals of Applied Statistics
Laina D Mercer, Jon Wakefield, Athena Pantazis, Angelina M Lutambi, Honorati Masanja, Samuel Clark
Many people living in low- and middle-income countries are not covered by civil registration and vital statistics systems. Consequently, a wide variety of other types of data, including many household sample surveys, are used to estimate health and population indicators. In this paper we combine data from sample surveys and demographic surveillance systems to produce small area estimates of child mortality through time. Small area estimates are necessary to understand geographical heterogeneity in health indicators when full-coverage vital statistics are not available...
December 2015: Annals of Applied Statistics
Lynne Steuerle Schofield
This paper represents a methodological-substantive synergy. A new model, the Mixed Effects Structural Equations (MESE) model, which combines structural equations modeling and item response theory, is introduced to attend to measurement error bias when using several latent variables as predictors in generalized linear models. The paper investigates racial and gender disparities in STEM retention in higher education. Using the MESE model with 1997 National Longitudinal Survey of Youth data, I find that prior mathematics proficiency and personality have been previously underestimated in the STEM retention literature...
December 1, 2015: Annals of Applied Statistics
Eunjee Lee, Hongtu Zhu, Dehan Kong, Yalin Wang, Kelly Sullivan Giovanello, Joseph G Ibrahim
The aim of this paper is to develop a Bayesian functional linear Cox regression model (BFLCRM) with both functional and scalar covariates. This new development is motivated by establishing the likelihood of conversion to Alzheimer's disease (AD) in 346 patients with mild cognitive impairment (MCI) enrolled in the Alzheimer's Disease Neuroimaging Initiative 1 (ADNI-1) and the early markers of conversion. These 346 MCI patients were followed over 48 months, with 161 MCI participants progressing to AD at 48 months...
December 2015: Annals of Applied Statistics
Luke B Smith, Montserrat Fuentes, Penny Gordon-Larsen, Brian J Reich
Cardiometabolic diseases have substantially increased in China in the past 20 years and blood pressure is a primary modifiable risk factor. Using data from the China Health and Nutrition Survey we examine blood pressure trends in China from 1991 to 2009, with a concentration on age cohorts and urbanicity. Very large values of blood pressure are of interest, so we model the conditional quantile functions of systolic and diastolic blood pressure. This allows the covariate effects in the middle of the distribution to vary from those in the upper tail, the focal point of our analysis...
September 2015: Annals of Applied Statistics
Peter D Hoff
A fundamental aspect of relational data, such as from a social network, is the possibility of dependence among the relations. In particular, the relations between members of one pair of nodes may have an effect on the relations between members of another pair. This article develops a type of regression model to estimate such effects in the context of longitudinal and multivariate relational data, or other data that can be represented in the form of a tensor. The model is based on a general multilinear tensor regression model, a special case of which is a tensor autoregression model in which the tensor of relations at one time point is parsimoniously regressed on relations from previous time points...
September 2015: Annals of Applied Statistics
Rachael Maltiel, Adrian E Raftery, Tyler H McCormick, Aaron J Baraff
We develop methods for estimating the size of hard-to-reach populations from data collected using network-based questions on standard surveys. Such data arise by asking respondents how many people they know in a specific group (e.g., people named Michael, intravenous drug users). The Network Scale-up Method (NSUM) is a tool for producing population size estimates using these indirect measures of respondents' networks. Killworth et al. (1998a,b) proposed maximum likelihood estimators of population size for a fixed effects model in which respondents' degrees or personal network sizes are treated as fixed...
September 2015: Annals of Applied Statistics
Zhao-Hua Lu, Sy-Miin Chow, Andrew Sherwood, Hongtu Zhu
Ambulatory cardiovascular (CV) measurements provide valuable insights into individuals' health conditions in "real-life," everyday settings. Current methods of modeling ambulatory CV data do not consider the dynamic characteristics of the full data set and their relationships with covariates such as caffeine use and stress. We propose a stochastic differential equation (SDE) in the form of a dual nonlinear Ornstein-Uhlenbeck (OU) model with person-specific covariates to capture the morning surge and nighttime dipping dynamics of ambulatory CV data...
September 2015: Annals of Applied Statistics
Irina Ostrovnaya, Venkatraman E Seshan, Colin B Begg
A major challenge for cancer pathologists is to determine whether a new tumor in a patient with cancer is a metastasis or an independent occurrence of the disease. In recent years numerous studies have evaluated pairs of tumor specimens to examine the similarity of the somatic characteristics of the tumors and to test for clonal relatedness. As the landscape of mutation testing has evolved, a number of statistical methods for determining clonality have been developed, notably for comparing losses of heterozygosity at candidate markers, and for comparing copy number profiles...
September 2015: Annals of Applied Statistics
Thomas A Murray, Brian P Hobbs, Bradley P Carlin
Conventional approaches to statistical inference preclude structures that facilitate incorporation of supplemental information acquired from similar circumstances. For example, the analysis of data obtained using perfusion computed tomography to characterize functional imaging biomarkers in cancerous regions of the liver can benefit from partially informative data collected concurrently in non-cancerous regions. This paper presents a hierarchical model structure that leverages all available information about a curve, using penalized splines, while accommodating important between-source features...
September 2015: Annals of Applied Statistics