Journal of the American Statistical Association

Nilanjan Chatterjee, Yi-Hau Chen, Paige Maas, Raymond J Carroll
Information from various public and private data sources of extremely large sample sizes is now increasingly available for research purposes. Statistical methods are needed for utilizing information from such big data sources while analyzing data from individual studies that may collect more detailed information required for addressing specific hypotheses of interest. In this article, we consider the problem of building regression models based on individual-level data from an "internal" study while utilizing summary-level information, such as information on parameters for reduced models, from an "external" big data source...
March 2016: Journal of the American Statistical Association
Mengjie Chen, Zhao Ren, Hongyu Zhao, Harrison Zhou
A tuning-free procedure is proposed to estimate the covariate-adjusted Gaussian graphical model. For each finite subgraph, this estimator is asymptotically normal and efficient. As a consequence, a confidence interval can be obtained for each edge. The procedure enjoys easy implementation and efficient computation through parallel estimation on subgraphs or edges. We further apply the asymptotic normality result to perform support recovery through edge-wise adaptive thresholding. This support recovery procedure is called ANTAC, standing for Asymptotically Normal estimation with Thresholding after Adjusting Covariates...
March 2016: Journal of the American Statistical Association
Michalis K Titsias, Christopher C Holmes, Christopher Yau
Hidden Markov models (HMMs) are one of the most widely used statistical methods for analyzing sequence data. However, the reporting of output from HMMs has largely been restricted to the presentation of the most-probable (MAP) hidden state sequence, found via the Viterbi algorithm, or the sequence of most probable marginals using the forward-backward algorithm. In this article, we expand the amount of information we could obtain from the posterior distribution of an HMM by introducing linear-time dynamic programming recursions that, conditional on a user-specified constraint in the number of segments, allow us to (i) find MAP sequences, (ii) compute posterior probabilities, and (iii) simulate sample paths...
January 2, 2016: Journal of the American Statistical Association
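For readers unfamiliar with the baseline that the hidden Markov model abstract above refers to, the following is a minimal sketch of the standard Viterbi recursion for the most probable (MAP) hidden state sequence. It is not the segment-constrained recursion proposed in the article. The function name `viterbi`, its argument layout, and the toy two-state model in the usage note are illustrative assumptions, and the observation sequence is assumed to be non-empty.

```python
import math


def viterbi(obs, init, trans, emit):
    """Most probable hidden state path for a discrete HMM (illustrative sketch).

    obs   : non-empty list of observation indices
    init  : init[k] = P(first state is k)
    trans : trans[j][k] = P(next state is k | current state is j)
    emit  : emit[k][o] = P(observation o | state k)
    """
    n_states = len(init)

    def log(p):
        # Work in log space to avoid underflow; log(0) is treated as -inf.
        return math.log(p) if p > 0 else float("-inf")

    # score[k] = best log-probability of any path ending in state k so far.
    score = [log(init[k]) + log(emit[k][obs[0]]) for k in range(n_states)]
    back = []  # back[t][k] = best predecessor of state k at step t + 1

    for o in obs[1:]:
        prev, score, pointers = score, [], []
        for k in range(n_states):
            best_j = max(range(n_states),
                         key=lambda j: prev[j] + log(trans[j][k]))
            score.append(prev[best_j] + log(trans[best_j][k]) + log(emit[k][o]))
            pointers.append(best_j)
        back.append(pointers)

    # Trace the best path backwards from the best final state.
    state = max(range(n_states), key=lambda k: score[k])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path
```

As a usage example under these assumptions, a "sticky" two-state model with `init = [0.5, 0.5]`, `trans = [[0.9, 0.1], [0.1, 0.9]]`, and `emit = [[0.9, 0.1], [0.1, 0.9]]` can be decoded with `viterbi([0, 0, 1, 1, 1], init, trans, emit)`. The cost is linear in the sequence length and quadratic in the number of states.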
Ci-Ren Jiang, John A D Aston, Jane-Ling Wang
Positron emission tomography (PET) is an imaging technique which can be used to investigate chemical changes in human biological processes such as cancer development or neurochemical reactions. Most dynamic PET scans are currently analyzed based on the assumption that linear first-order kinetics can be used to adequately describe the system under observation. However, there has recently been strong evidence that this is not the case. To provide an analysis of PET data which is free from this compartmental assumption, we propose a nonparametric deconvolution and analysis model for dynamic PET data based on functional principal component analysis...
January 2, 2016: Journal of the American Statistical Association
Lisa M Pham, Luis Carvalho, Scott Schaus, Eric D Kolaczyk
Cellular response to a perturbation is the result of a dynamic system of biological variables linked in a complex network. A major challenge in drug and disease studies is identifying the key factors of a biological network that are essential in determining the cell's fate. Here our goal is the identification of perturbed pathways from high-throughput gene expression data. We develop a three-level hierarchical model, where (i) the first level captures the relationship between gene expression and biological pathways using confirmatory factor analysis, (ii) the second level models the behavior within an underlying network of pathways induced by an unknown perturbation using a conditional autoregressive model, and (iii) the third level is a spike-and-slab prior on the perturbations...
2016: Journal of the American Statistical Association
Aaron Fisher, Brian Caffo, Brian Schwartz, Vadim Zipunnikov
Many have suggested a bootstrap procedure for estimating the sampling variability of principal component analysis (PCA) results. However, when the number of measurements per subject (p) is much larger than the number of subjects (n), calculating and storing the leading principal components from each bootstrap sample can be computationally infeasible. To address this, we outline methods for fast, exact calculation of bootstrap principal components, eigenvalues, and scores. Our methods leverage the fact that all bootstrap samples occupy the same n-dimensional subspace as the original sample...
2016: Journal of the American Statistical Association
Zhiguang Huo, Ying Ding, Silvia Liu, Steffi Oesterreich, George Tseng
Disease phenotyping by omics data has become a popular approach that potentially can lead to better personalized treatment. Identifying disease subtypes via unsupervised machine learning is the first step towards this goal. In this paper, we extend a sparse K-means method towards a meta-analytic framework to identify novel disease subtypes when expression profiles of multiple cohorts are available. The lasso regularization and meta-analysis identify a unique set of gene features for subtype characterization...
2016: Journal of the American Statistical Association
T Tony Cai, Weidong Liu
Multiple testing of correlations arises in many applications including gene coexpression network analysis and brain connectivity analysis. In this paper, we consider large scale simultaneous testing for correlations in both the one-sample and two-sample settings. New multiple testing procedures are proposed and a bootstrap method is introduced for estimating the proportion of the nulls falsely rejected among all the true nulls. The properties of the proposed procedures are investigated both theoretically and numerically...
2016: Journal of the American Statistical Association
Yuan Jiang, Yunxiao He, Heping Zhang
LASSO is a popular statistical tool often used in conjunction with generalized linear models that can simultaneously select variables and estimate parameters. When there are many variables of interest, as in current biological and biomedical studies, the power of LASSO can be limited. Fortunately, so much biological and biomedical data have been collected and they may contain useful information about the importance of certain variables. This paper proposes an extension of LASSO, namely, prior LASSO (pLASSO), to incorporate that prior information into penalized generalized linear models...
2016: Journal of the American Statistical Association
Chao Du, Chu-Lan Michael Kao, S C Kou
This paper studies the estimation of a stepwise signal. To determine the number and locations of change-points of the stepwise signal, we formulate a maximum marginal likelihood estimator, which can be computed with a quadratic cost using dynamic programming. We carry out an extensive investigation of the choice of the prior distribution and study the asymptotic properties of the maximum marginal likelihood estimator. We propose to treat each possible set of change-points equally and adopt an empirical Bayes approach to specify the prior distribution of segment parameters...
2016: Journal of the American Statistical Association
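To illustrate why a change-point search of the kind described in the stepwise-signal abstract above can run at quadratic cost, here is a minimal dynamic programming sketch. It deliberately swaps in a simpler criterion, penalized least squares with a fixed penalty per segment, in place of the article's maximum marginal likelihood estimator. The function name `segment_least_squares` and the `penalty` parameter are hypothetical and chosen only for this illustration.

```python
def segment_least_squares(y, penalty):
    """Change-points of a piecewise-constant fit (illustrative sketch).

    Minimises (within-segment squared error) + penalty * (number of segments)
    by dynamic programming over all n * (n + 1) / 2 candidate segments.
    Returns the sorted start indices of every segment after the first.
    """
    n = len(y)

    # Prefix sums let each segment cost be evaluated in constant time.
    s1, s2 = [0.0], [0.0]
    for v in y:
        s1.append(s1[-1] + v)
        s2.append(s2[-1] + v * v)

    def cost(i, j):
        # Squared error of y[i:j] around its own mean.
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    best = [0.0] + [float("inf")] * n  # best[j] = optimal objective for y[:j]
    last = [0] * (n + 1)               # last[j] = start of final segment of y[:j]
    for j in range(1, n + 1):
        for i in range(j):
            candidate = best[i] + cost(i, j) + penalty
            if candidate < best[j]:
                best[j], last[j] = candidate, i

    # Backtrack through the stored segment starts.
    change_points = []
    j = n
    while j > 0:
        i = last[j]
        if i > 0:
            change_points.append(i)
        j = i
    return sorted(change_points)
```

The double loop visits each candidate segment once, which is where the quadratic cost comes from. For example, `segment_least_squares([0, 0, 0, 0, 5, 5, 5, 5], 1.0)` separates the two flat stretches of the signal at index 4.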
Xiaoyan Sun, Limin Peng, Yijian Huang, HuiChuan J Lai
In survival analysis, quantile regression has become a useful approach to account for covariate effects on the distribution of an event time of interest. In this paper, we discuss how quantile regression can be extended to model counting processes, and thus lead to a broader regression framework for survival data. We specifically investigate the proposed modeling of counting processes for recurrent events data. We show that the new recurrent events model retains the desirable features of quantile regression such as easy interpretation and good model flexibility, while accommodating various observation schemes encountered in observational studies...
2016: Journal of the American Statistical Association
Jianqing Fan, Yang Feng, Jiancheng Jiang, Xin Tong
We propose a high dimensional classification method that involves nonparametric feature augmentation. Knowing that marginal density ratios are the most powerful univariate classifiers, we use the ratio estimates to transform the original feature measurements. Subsequently, penalized logistic regression is invoked, taking as input the newly transformed or augmented features. This procedure trains models equipped with local complexity and global simplicity, thereby avoiding the curse of dimensionality while creating a flexible nonlinear decision boundary...
2016: Journal of the American Statistical Association
Anirban Bhattacharya, Debdeep Pati, Natesh S Pillai, David B Dunson
Penalized regression methods, such as L1 regularization, are routinely used in high-dimensional applications, and there is a rich literature on optimality properties under sparsity assumptions. In the Bayesian paradigm, sparsity is routinely induced through two-component mixture priors having a probability mass at zero, but such priors encounter daunting computational problems in high dimensions. This has motivated continuous shrinkage priors, which can be expressed as global-local scale mixtures of Gaussians, facilitating computation...
December 1, 2015: Journal of the American Statistical Association
Michael G Hudgens
No abstract text is available yet for this article.
December 2015: Journal of the American Statistical Association
Zifang Guo, Lexin Li, Wenbin Lu, Bing Li
The family of sufficient dimension reduction (SDR) methods that produce informative combinations of predictors, or indices, are particularly useful for high dimensional regression analysis. In many such analyses, it becomes increasingly common that there is available a priori subject knowledge of the predictors; e.g., they belong to different groups. While many recent SDR proposals have greatly expanded the scope of the methods' applicability, how to effectively incorporate the prior predictor structure information remains a challenge...
December 1, 2015: Journal of the American Statistical Association
Chao Huang, Martin Styner, Hongtu Zhu
An important goal in image analysis is to cluster and recognize objects of interest according to the shapes of their boundaries. Clustering such objects faces at least four major challenges including a curved shape space, a high-dimensional feature space, a complex spatial correlation structure, and shape variation associated with some covariates (e.g., age or gender). The aim of this paper is to develop a penalized model-based clustering framework to cluster landmark-based planar shape data, while explicitly addressing these challenges...
November 7, 2015: Journal of the American Statistical Association
Rajen D Shah, Richard J Samworth
No abstract text is available yet for this article.
October 2, 2015: Journal of the American Statistical Association
David Azriel, Armin Schwartzman
Motivated by the advent of high dimensional highly correlated data, this work studies the limit behavior of the empirical cumulative distribution function (ecdf) of standard normal random variables under arbitrary correlation. First, we provide a necessary and sufficient condition for convergence of the ecdf to the standard normal distribution. Next, under general correlation, we show that the ecdf limit is a random, possibly infinite, mixture of normal distribution functions that depends on a number of latent variables and can serve as an asymptotic approximation to the ecdf in high dimensions...
September 1, 2015: Journal of the American Statistical Association
Beom Seuk Hwang, Zhen Chen
In estimating ROC curves of multiple tests, some a priori constraints may exist, either between the healthy and diseased populations within a test or between tests within a population. In this paper, we propose an integrated modeling approach for ROC curves that jointly accounts for stochastic and variability orders. The stochastic order constrains the distributional centers of the diseased and healthy populations within a test, while the variability order constrains the distributional spreads of the tests within each of the populations...
September 1, 2015: Journal of the American Statistical Association
Michael Schweinberger, Mark S Handcock
Dependent phenomena, such as relational, spatial and temporal phenomena, tend to be characterized by local dependence in the sense that units which are close in a well-defined sense are dependent. In contrast with spatial and temporal phenomena, though, relational phenomena tend to lack a natural neighbourhood structure in the sense that it is unknown which units are close and thus dependent. Owing to the challenge of characterizing local dependence and constructing random graph models with local dependence, many conventional exponential family random graph models induce strong dependence and are not amenable to statistical inference...
June 1, 2015: Journal of the American Statistical Association