Read by QxMD icon Read

Statistica Sinica

Dan Shen, Haipeng Shen, Hongtu Zhu, J S Marron
The aim of this paper is to establish several deep theoretical properties of principal component analysis for multiple-component spike covariance models. Our new results reveal an asymptotic conical structure in critical sample eigendirections under the spike models with distinguishable (or indistinguishable) eigenvalues, when the sample size and/or the number of variables (or dimension) tend to infinity. The consistency of the sample eigenvectors relative to their population counterparts is determined by the ratio between the dimension and the product of the sample size with the spike size...
October 2016: Statistica Sinica
Esra Kürüm, Runze Li, Saul Shiffman, Weixin Yao
Motivated by an empirical analysis of ecological momentary assessment data (EMA) collected in a smoking cessation study, we propose a joint modeling technique for estimating the time-varying association between two intensively measured longitudinal responses: a continuous one and a binary one. A major challenge in joint modeling these responses is the lack of a multivariate distribution. We suggest introducing a normal latent variable underlying the binary response and factorizing the model into two components: a marginal model for the continuous response, and a conditional model for the binary response given the continuous response...
July 2016: Statistica Sinica
Xu Liu, Yuehua Cui, Runze Li
Gene-environment (G×E) interactions play key roles in many complex diseases. An increasing number of epidemiological studies have shown the combined effect of multiple environmental exposures on disease risk. However, no appropriate statistical models have been developed to conduct a rigorous assessment of such combined effects when G×E interactions are considered. In this paper, we propose a partial linear varying multi-index coefficient model (PLVMICM) to assess how multiple environmental factors act jointly to modify individual genetic risk on complex disease...
July 2016: Statistica Sinica
Wei Xiao, Wenbin Lu, Hao Helen Zhang
The time-varying coefficient Cox model has been widely studied and used in survival data analysis due to its flexibility in modeling covariate effects. It is of great practical interest to accurately identify the structure of covariate effects in a time-varying coefficient Cox model, i.e., covariates with null, constant, and truly time-varying effects, and to estimate the corresponding regression coefficients. Combining the ideas of local polynomial smoothing and group nonnegative garrote, we develop a new penalization approach to achieve such goals...
April 2016: Statistica Sinica
Chen Xu, Shaobo Lin, Jian Fang, Runze Li
Massive data have become increasingly common in contemporary scientific research. When the sample size n is huge, classical learning methods become computationally costly for regression. Recently, the orthogonal greedy algorithm (OGA) has been revitalized as an efficient alternative in the context of kernel-based statistical learning. In a learning problem, accurate and fast prediction is often of interest. This makes an appropriate termination rule crucial for OGA. In this paper, we propose a new termination rule for OGA by investigating its predictive performance...
April 2016: Statistica Sinica
Hana Lee, Michael G Hudgens, Jianwen Cai, Stephen R Cole
A common objective of biomedical cohort studies is assessing the effect of a time-varying treatment or exposure on a survival time. In the presence of time-varying confounders, marginal structural models fit using inverse probability weighting can be employed to obtain a consistent and asymptotically normal estimator of the causal effect of a time-varying treatment. This article considers estimation of parameters in the semiparametric marginal structural Cox model (MSCM) from a case-cohort study. Case-cohort sampling entails assembling covariate histories only for cases and a random subcohort, which can be cost effective, particularly in large cohort studies with low outcome rates...
April 2016: Statistica Sinica
Guangren Yang, Ye Yu, Runze Li, Anne Buu
Survival data with ultrahigh dimensional covariates such as genetic markers have been collected in medical studies and other fields. In this work, we propose a feature screening procedure for the Cox model with ultrahigh dimensional covariates. The proposed procedure is distinguished from the existing sure independence screening (SIS) procedures (Fan, Feng and Wu, 2010; Zhao and Li, 2012) in that it is based on the joint likelihood of potential active predictors, and therefore is not a marginal screening procedure...
2016: Statistica Sinica
Wei Zhong, Liping Zhu, Runze Li, Hengjian Cui
We propose both a penalized quantile regression and an independence screening procedure to identify important covariates and to exclude unimportant ones for a general class of ultrahigh dimensional single-index models, in which the conditional distribution of the response depends on the covariates via a single-index structure. We observe that the linear quantile regression yields a consistent estimator of the direction of the index parameter in the single-index model. Such an observation dramatically reduces computational complexity in selecting important covariates in the single-index model...
January 2016: Statistica Sinica
Sunyoung Shin, Jason Fine, Yufeng Liu
In many problems, one has several models of interest that capture key parameters describing the distribution of the data. Partially overlapping models are taken as models in which at least one covariate effect is common to the models. A priori knowledge of such structure enables efficient estimation of all model parameters. However, in practice, this structure may be unknown. We propose adaptive composite M-estimation (ACME) for partially overlapping models using a composite loss function, which is a linear combination of loss functions defining the individual models...
January 2016: Statistica Sinica
Arijit Sinha, Zhiyi Chi, Ming-Hui Chen
Survival data often contain tied event times. Inference without careful treatment of the ties can lead to biased estimates. This paper develops the Bayesian analysis of a stochastic wear process model to fit survival data that might have a large number of ties. Under a general wear process model, we derive the likelihood of parameters. When the wear process is a Gamma process, the likelihood has a semi-closed form that allows posterior sampling to be carried out for the parameters, hence achieving model selection using Bayesian deviance information criterion...
October 2015: Statistica Sinica
Philip S Boonstra, Bhramar Mukherjee, Jeremy M G Taylor
We propose new approaches for choosing the shrinkage parameter in ridge regression, a penalized likelihood method for regularizing linear regression coefficients, when the number of observations is small relative to the number of parameters. Existing methods may lead to extreme choices of this parameter, which will either not shrink the coefficients enough or shrink them by too much. Within this "small-n, large-p" context, we suggest a correction to the common generalized cross-validation (GCV) method that preserves the asymptotic optimality of the original GCV...
July 1, 2015: Statistica Sinica
R Song, W Wang, D Zeng, M R Kosorok
A dynamic treatment regimen incorporates both accrued information and long-term effects of treatment from specially designed clinical trials. As these trials become more and more popular in conjunction with longitudinal data from clinical studies, the development of statistical inference for optimal dynamic treatment regimens is a high priority. In this paper, we propose a new machine learning framework called penalized Q-learning, under which valid statistical inference is established. We also propose a new statistical procedure: individual selection and corresponding methods for incorporating individual selection within penalized Q-learning...
July 2015: Statistica Sinica
Zhao Chen, Runze Li, Yan Li
Varying coefficient models have been popular in the literature. In this paper, we propose a profile least squares estimation procedure for the regression coefficients of such a model when its random error is an auto-regressive (AR) process. We further study the asymptotic properties of the proposed procedure, and establish the asymptotic normality of the resulting estimate. We show that the resulting estimate for the regression coefficients has the same asymptotic bias and variance as the local linear estimate for varying coefficient models with independent and identically distributed observations...
April 2015: Statistica Sinica
Xinyu Zhang, Guohua Zou, Raymond J Carroll
This paper proposes a model averaging method based on Kullback-Leibler distance under a homoscedastic normal error term. The resulting model average estimator is proved to be asymptotically optimal. When combining least squares estimators, the model average estimator is shown to have the same large sample properties as the Mallows model average (MMA) estimator developed by Hansen (2007). We show via simulations that, in terms of mean squared prediction error and mean squared parameter estimation error, the proposed model average estimator is more efficient than the MMA estimator and the estimator based on model selection using the corrected Akaike information criterion in small sample situations...
2015: Statistica Sinica
Lei Pang, Wenbin Lu, Huixia Judy Wang
In survival analysis, the accelerated failure time model is a useful alternative to the popular Cox proportional hazards model due to its easy interpretation. Current estimation methods for the accelerated failure time model mostly assume independent and identically distributed random errors, but in many applications the conditional variance of log survival times depends on covariates, exhibiting some form of heteroscedasticity. In this paper, we develop a local Buckley-James estimator for the accelerated failure time model with heteroscedastic errors...
2015: Statistica Sinica
Xia Wang, Ming-Hui Chen, Rita C Kuo, Dipak K Dey
A Bayesian hierarchical model is developed for count data with spatial and temporal correlations as well as excessive zeros, uneven sampling intensities, and inference on missing spots. Our contribution is to develop a model on zero-inflated count data that provides flexibility in modeling spatial patterns in a dynamic manner and also improves the computational efficiency via dimension reduction. The proposed methodology is of particular importance for studying species presence and abundance in the field of ecological sciences...
January 2015: Statistica Sinica
Mihye Ahn, Haipeng Shen, Weili Lin, Hongtu Zhu
In spatial-temporal neuroimaging studies, there is an evolving literature on the analysis of functional imaging data in order to learn the intrinsic functional connectivity patterns among different brain regions. However, there are only a few efficient approaches for integrating functional connectivity patterns across subjects, while accounting for spatial-temporal functional variation across multiple groups of subjects. The objective of this paper is to develop a new sparse reduced rank (SRR) modeling framework for carrying out functional connectivity analysis across multiple groups of subjects in the frequency domain...
January 2015: Statistica Sinica
A Adam Ding, Hulin Wu
We propose a new method to use a constrained local polynomial regression to estimate the unknown parameters in ordinary differential equation models with a goal of improving the smoothing-based two-stage pseudo-least squares estimate. The equation constraints are derived from the differential equation model and are incorporated into the local polynomial regression in order to estimate the unknown parameters in the differential equation model. We also derive the asymptotic bias and variance of the proposed estimator...
October 2014: Statistica Sinica
Hokeun Sun, Wei Lin, Rui Feng, Hongzhe Li
We consider estimation and variable selection in high-dimensional Cox regression when prior knowledge of the relationships among the covariates, described by a network or graph, is available. A limitation of the existing methodology for survival analysis with high-dimensional genomic data is that a wealth of structural information about many biological processes, such as regulatory networks and pathways, has often been ignored. In order to incorporate such prior network information into the analysis of genomic data, we propose a network-based regularization method for high-dimensional Cox regression; it uses an ℓ1-penalty to induce sparsity of the regression coefficients and a quadratic Laplacian penalty to encourage smoothness between the coefficients of neighboring variables on a given network...
July 2014: Statistica Sinica
Shuang Wu, Hongqi Xue, Yichao Wu, Hulin Wu
In many regression problems, the relations between the covariates and the response may be nonlinear. Motivated by the application of reconstructing a gene regulatory network, we consider a sparse high-dimensional additive model with the additive components being some known nonlinear functions with unknown parameters. To identify the subset of important covariates, we propose a new method for simultaneous variable selection and parameter estimation by iteratively combining a large-scale variable screening (the nonlinear independence screening, NLIS) and a moderate-scale model selection (the nonnegative garrote, NNG) for the nonlinear additive regressions...
July 2014: Statistica Sinica