Annals of Statistics

Sayan Dasgupta, Yair Goldberg, Michael R Kosorok
We develop an approach for feature elimination in statistical learning with kernel machines, based on recursive elimination of features. We present theoretical properties of this method and show that it is uniformly consistent in finding the correct feature space under certain generalized assumptions. We provide case studies showing that these assumptions are met in most practical situations, and simulation results demonstrating the performance of the proposed approach.
February 2019: Annals of Statistics
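The recursive scheme described above can be sketched in a few lines. As a hedged illustration, a linear ridge model stands in for the paper's kernel machines (an assumption on our part), with the feature carrying the smallest absolute coefficient dropped at each pass:

```python
import numpy as np

def rfe_ridge(X, y, n_keep, lam=1.0):
    """Recursive feature elimination: repeatedly refit and drop the
    feature with the smallest absolute coefficient. A linear ridge
    model is a stand-in for the paper's kernel machines (assumption)."""
    active = list(range(X.shape[1]))
    while len(active) > n_keep:
        Xa = X[:, active]
        # ridge solution (Xa'Xa + lam I)^{-1} Xa'y
        w = np.linalg.solve(Xa.T @ Xa + lam * np.eye(len(active)), Xa.T @ y)
        active.pop(int(np.argmin(np.abs(w))))
    return active

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))
y = 3 * X[:, 0] - 2 * X[:, 1] + 0.1 * rng.normal(size=200)
selected = rfe_ridge(X, y, 2)  # features 0 and 1 should survive
```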
Junlong Zhao, Guan Yu, Yufeng Liu
Robustness is a desirable property for many statistical techniques. As an important measure of robustness, the breakdown point has been widely used for regression problems and many other settings. Despite this existing development, we observe that the standard breakdown point criterion is not directly applicable to many classification problems. In this paper, we propose a new breakdown point criterion, namely the angular breakdown point, to better quantify the robustness of different classification methods. Using this new criterion, we study the robustness of binary large margin classification techniques, although the idea is applicable to general classification methods...
December 2018: Annals of Statistics
Wen-Xin Zhou, Koushiki Bose, Jianqing Fan, Han Liu
Heavy-tailed errors impair the accuracy of the least squares estimate, which can be spoiled by a single grossly outlying observation. As argued in the seminal work of Peter Huber in 1973 [ Ann. Statist. 1 (1973) 799-821], robust alternatives to the method of least squares are sorely needed. To achieve robustness against heavy-tailed sampling distributions, we revisit the Huber estimator from a new perspective by letting the tuning parameter involved diverge with the sample size. In this paper, we develop nonasymptotic concentration results for such an adaptive Huber estimator, namely, the Huber estimator with the tuning parameter adapted to sample size, dimension, and the variance of the noise...
October 2018: Annals of Statistics
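The adaptive Huber idea, with a robustification parameter that diverges with the sample size, can be sketched via iteratively reweighted least squares. The calibration tau = sigma_hat * sqrt(n / log n) below is one plausible choice, an assumption on our part; the paper derives its own calibration in terms of sample size, dimension, and noise variance:

```python
import numpy as np

def adaptive_huber(X, y, n_iter=50):
    """Huber regression fit by iteratively reweighted least squares,
    with tau growing with the sample size. tau = sigma_hat*sqrt(n/log n)
    is an illustrative calibration (assumption, not the paper's)."""
    n, _ = X.shape
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    for _ in range(n_iter):
        r = y - X @ beta
        sigma = np.median(np.abs(r)) / 0.6745 + 1e-12    # robust scale (MAD)
        tau = sigma * np.sqrt(n / np.log(n))
        # Huber weights: 1 inside [-tau, tau], tau/|r| outside
        w = np.where(np.abs(r) <= tau, 1.0, tau / np.maximum(np.abs(r), 1e-12))
        Xw = X * w[:, None]
        beta = np.linalg.solve(X.T @ Xw, Xw.T @ y)       # weighted least squares
    return beta

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 2))
y = X @ np.array([1.0, -2.0]) + 0.1 * rng.normal(size=200)
y[0] += 1000.0                                           # one gross outlier
beta_huber = adaptive_huber(X, y)
beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]
```

A single gross outlier spoils the least squares fit, while the Huber fit downweights it.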
Vu Dinh, Lam Si Tung Ho, Marc A Suchard, Frederick A Matsen
It is common in phylogenetics to have some, perhaps partial, information about the overall evolutionary tree of a group of organisms and wish to find an evolutionary tree of a specific gene for those organisms. There may not be enough information in the gene sequences alone to accurately reconstruct the correct "gene tree." Although the gene tree may deviate from the "species tree" due to a variety of genetic processes, in the absence of evidence to the contrary it is parsimonious to assume that they agree...
August 2018: Annals of Statistics
David L Donoho, Matan Gavish, Iain M Johnstone
We show that in a common high-dimensional covariance model, the choice of loss function has a profound effect on optimal estimation. In an asymptotic framework based on the Spiked Covariance model and use of orthogonally invariant estimators, we show that optimal estimation of the population covariance matrix boils down to design of an optimal shrinker η that acts elementwise on the sample eigenvalues. Indeed, to each loss function there corresponds a unique admissible eigenvalue shrinker η* dominating all other shrinkers...
August 2018: Annals of Statistics
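The class of estimators considered here is easy to state in code: keep the sample eigenvectors and pass each sample eigenvalue through a scalar shrinker η. The hard-threshold shrinker below is only a toy stand-in (an assumption; the paper's optimal η* depends on the chosen loss):

```python
import numpy as np

def shrink_covariance(S, n, eta):
    """Orthogonally invariant estimator: keep the sample eigenvectors,
    apply the scalar shrinker eta elementwise to the sample eigenvalues."""
    lam, V = np.linalg.eigh(S)
    return (V * eta(lam, S.shape[0] / n)) @ V.T

def hard_threshold(lam, gamma):
    """Toy shrinker (assumption, not the paper's optimal eta*): send
    eigenvalues below the bulk edge (1 + sqrt(gamma))^2 to the noise
    level 1, keep the rest."""
    edge = (1 + np.sqrt(gamma)) ** 2
    return np.where(lam > edge, lam, 1.0)

# all eigenvalues of this matrix sit below the bulk edge,
# so the toy shrinker returns the identity
S_noise = np.eye(5) * 0.5
Sh = shrink_covariance(S_noise, n=100, eta=hard_threshold)
```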
Jianqing Fan, Han Liu, Weichen Wang
We propose a general Principal Orthogonal complEment Thresholding (POET) framework for large-scale covariance matrix estimation based on the approximate factor model. A set of high level sufficient conditions for the procedure to achieve optimal rates of convergence under different matrix norms is established to better understand how POET works. Such a framework allows us to recover existing results for sub-Gaussian data in a more transparent way that only depends on the concentration properties of the sample covariance matrix...
August 2018: Annals of Statistics
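The POET estimator itself has a compact form: a low-rank factor part from the top principal components of the sample covariance, plus an entrywise-thresholded residual. A minimal sketch, with a fixed user-chosen threshold tau (an assumption; the paper selects it from convergence rates):

```python
import numpy as np

def poet(S, K, tau):
    """POET: principal orthogonal complement thresholding.
    Factor part = top-K principal components of the sample covariance S;
    the orthogonal complement is soft-thresholded entrywise off the
    diagonal. tau is a fixed illustrative threshold (assumption)."""
    lam, V = np.linalg.eigh(S)
    lam, V = lam[::-1], V[:, ::-1]                       # descending order
    low_rank = (V[:, :K] * lam[:K]) @ V[:, :K].T
    R = S - low_rank                                     # principal orthogonal complement
    Rt = np.sign(R) * np.maximum(np.abs(R) - tau, 0.0)   # soft threshold
    np.fill_diagonal(Rt, np.diag(R))                     # keep the diagonal untouched
    return low_rank + Rt

# one strong "factor" eigenvalue plus an already-sparse residual:
# POET should reproduce this matrix exactly
S = np.diag([5.0, 1.0, 1.0])
P = poet(S, K=1, tau=0.1)
```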
Wenliang Pan, Yuan Tian, Xueqin Wang, Heping Zhang
In this paper, we first introduce Ball Divergence, a novel measure of the difference between two probability measures in separable Banach spaces, and show that the Ball Divergence of two probability measures is zero if and only if the two measures are identical, without any moment assumption. Using Ball Divergence, we present a metric rank test procedure to detect the equality of the distribution measures underlying independent samples; it is therefore robust to outliers and heavy-tailed data. We show that this multivariate two-sample test statistic is consistent with the Ball Divergence, and that it converges to a mixture of χ² distributions under the null hypothesis and to a normal distribution under the alternative hypothesis...
June 2018: Annals of Statistics
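A sample version of the statistic can be sketched directly from the ball idea; the plug-in reading below is our own interpretation of the abstract (an assumption, not the authors' exact estimator): for each pair of points in one sample, compare the fraction of each sample falling in the ball centred at one point with radius equal to the pair distance, and average the squared differences.

```python
import numpy as np

def ball_divergence(X, Y):
    """Plug-in ball-divergence-style two-sample statistic (our reading
    of the abstract, an assumption). Zero when the samples coincide."""
    def part(A, B):
        # DA[i, j] = ||A_i - A_j||: ball radii and A-to-centre distances
        DA = np.linalg.norm(A[:, None, :] - A[None, :, :], axis=-1)
        # DB[i, v] = ||B_v - A_i||: B-to-centre distances
        DB = np.linalg.norm(B[:, None, :] - A[None, :, :], axis=-1).T
        # fraction of each sample inside the ball B(A_i, DA[i, j])
        pA = (DA[:, None, :] <= DA[:, :, None]).mean(axis=2)
        pB = (DB[:, None, :] <= DA[:, :, None]).mean(axis=2)
        return ((pA - pB) ** 2).mean()
    return part(X, Y) + part(Y, X)

rng = np.random.default_rng(2)
x = rng.normal(size=(30, 2))
```

In practice one would calibrate such a statistic by permutation; the O(n³) boolean arrays above are fine only for small samples.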
Heather Battey, Jianqing Fan, Han Liu, Junwei Lu, Ziwei Zhu
This paper studies hypothesis testing and parameter estimation in the context of the divide-and-conquer algorithm. In a unified likelihood-based framework, we propose new test statistics and point estimators obtained by aggregating various statistics from k subsamples of size n/k, where n is the sample size. In both low dimensional and sparse high dimensional settings, we address the important question of how large k can be, as n grows large, such that the loss of efficiency due to the divide-and-conquer algorithm is negligible...
June 2018: Annals of Statistics
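The basic aggregation step is simple to sketch: split the data into k subsamples, estimate on each, and average. This minimal version uses ordinary least squares; the paper's aggregated test statistics and high-dimensional debiasing are not reproduced here.

```python
import numpy as np

def dac_ols(X, y, k):
    """Divide-and-conquer point estimation (minimal sketch): split the
    n observations into k subsamples, fit OLS on each, average."""
    betas = [np.linalg.lstsq(Xs, ys, rcond=None)[0]
             for Xs, ys in zip(np.array_split(X, k), np.array_split(y, k))]
    return np.mean(betas, axis=0)

rng = np.random.default_rng(3)
X = rng.normal(size=(400, 2))
y = X @ np.array([2.0, -1.0]) + 0.1 * rng.normal(size=400)
beta_dac = dac_ols(X, y, k=4)
```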
Jianqing Fan, Qi-Man Shao, Wen-Xin Zhou
Over the last two decades, many exciting variable selection methods have been developed for finding a small group of covariates that are associated with the response from a large pool. Can the discoveries by such data mining approaches be spurious due to high dimensionality and limited sample size? Can our fundamental assumptions on exogeneity of covariates needed for such variable selection be validated with the data? To answer these questions, we need to derive the distributions of the maximum spurious correlations given a certain number of predictors, namely, the distribution of the correlation of a response variable Y with the best s linear combinations of p covariates X, even when X and Y are independent...
June 2018: Annals of Statistics
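The phenomenon is easy to see by Monte Carlo in the s = 1 case: even when Y is independent of all p covariates, the maximum absolute sample correlation grows with p. A short simulation (our own illustration, not the paper's derivation):

```python
import numpy as np

def max_spurious_corr(n, p, reps=50, seed=0):
    """Monte Carlo for s = 1: maximum absolute sample correlation
    between Y and p covariates when Y is independent of all of them."""
    rng = np.random.default_rng(seed)
    out = np.empty(reps)
    for r in range(reps):
        X = rng.normal(size=(n, p))
        y = rng.normal(size=n)
        Xc = (X - X.mean(0)) / X.std(0)          # standardize columns
        yc = (y - y.mean()) / y.std()
        out[r] = np.max(np.abs(Xc.T @ yc) / n)   # max |sample correlation|
    return out

# many candidate covariates inflate the spurious correlation
m_big = max_spurious_corr(50, 500)
m_small = max_spurious_corr(50, 2)
```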
Chengchun Shi, Alin Fan, Rui Song, Wenbin Lu
Precision medicine is a medical paradigm that focuses on finding the most effective treatment decision based on individual patient information. For many complex diseases, such as cancer, treatment decisions need to be tailored over time according to patients' responses to previous treatments. Such an adaptive strategy is referred to as a dynamic treatment regime. A major challenge in deriving an optimal dynamic treatment regime arises when an extraordinarily large number of prognostic factors, such as patients' genetic information, demographic characteristics, medical history and clinical measurements over time, are available, but not all of them are necessary for making treatment decisions...
June 2018: Annals of Statistics
Jianqing Fan, Han Liu, Qiang Sun, Tong Zhang
We propose a computational framework named iterative local adaptive majorize-minimization (I-LAMM) to simultaneously control algorithmic complexity and statistical error when fitting high dimensional models. I-LAMM is a two-stage algorithmic implementation of the local linear approximation to a family of folded concave penalized quasi-likelihoods. The first stage solves a convex program with a crude precision tolerance to obtain a coarse initial estimator, which is further refined in the second stage by iteratively solving a sequence of convex programs with smaller precision tolerances...
April 2018: Annals of Statistics
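The coarse-then-refined structure can be sketched with proximal gradient descent. For simplicity, a single lasso problem stands in for the paper's folded concave penalties and local linear approximation (our assumption); what survives is the two-stage pattern of a crude tolerance followed by a warm-started solve at tighter tolerance:

```python
import numpy as np

def ista(X, y, lam, beta0, tol, max_iter=5000):
    """Proximal gradient (ISTA) for the lasso, run until the update
    size drops below the precision tolerance `tol`."""
    n = len(y)
    step = n / np.linalg.norm(X, 2) ** 2      # 1/L for (1/2n)||y - Xb||^2
    beta = beta0.copy()
    for _ in range(max_iter):
        z = beta - step * (X.T @ (X @ beta - y) / n)
        new = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)  # soft threshold
        if np.max(np.abs(new - beta)) < tol:
            return new
        beta = new
    return beta

rng = np.random.default_rng(4)
X = rng.normal(size=(100, 10))
beta_true = np.zeros(10)
beta_true[0] = 3.0
y = X @ beta_true + 0.1 * rng.normal(size=100)
coarse = ista(X, y, lam=0.1, beta0=np.zeros(10), tol=1e-2)   # stage 1: crude tolerance
refined = ista(X, y, lam=0.1, beta0=coarse, tol=1e-6)        # stage 2: warm-started refinement
```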
Qi Zheng, Limin Peng, Xuming He
Censored quantile regression (CQR) has emerged as a useful regression tool for survival analysis. Some commonly used CQR methods can be characterized by stochastic integral-based estimating equations in a sequential manner across quantile levels. In this paper, we analyze CQR in a high dimensional setting where the regression functions over a continuum of quantile levels are of interest. We propose a two-step penalization procedure, which accommodates stochastic integral-based estimating equations and addresses the challenges due to the recursive nature of the procedure...
February 2018: Annals of Statistics
Xiaoou Li, Jingchen Liu, Zhiliang Ying
The asymptotic efficiency of a generalized likelihood ratio test proposed by Cox is studied under the large deviations framework for error probabilities developed by Chernoff. In particular, two separate parametric families of hypotheses are considered (Cox, 1961, 1962). The significance level is set such that the maximal type I and type II error probabilities of the generalized likelihood ratio test decay exponentially fast at the same rate. We derive the analytic form of this rate, known as the Chernoff index (Chernoff, 1952), a relative efficiency measure when there is no preference between the null and the alternative hypotheses...
February 2018: Annals of Statistics
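For orientation, in the classical simple-vs-simple setting of Chernoff (1952) the index has a closed form (this is the textbook case, not the composite-family rate derived in the paper above): with densities p₀ under the null and p₁ under the alternative,

```latex
% Classical simple-vs-simple case (Chernoff, 1952): equalizing the two
% error exponents of the likelihood ratio test gives the common rate
\rho \;=\; -\min_{0 \le \lambda \le 1} \log \int p_0^{\lambda}\, p_1^{1-\lambda}\, d\mu ,
% so that both error probabilities behave like e^{-n\rho} for i.i.d. samples of size n.
```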
Kwun Chuen Gary Chan, Hok Kan Ling, Tony Sit, Sheung Chi Phillip Yam
We study the nonparametric estimation of a decreasing density function g_0 in a general s-sample biased sampling model with weight (or bias) functions w_i for i = 1, …, s. The determination of the monotone maximum likelihood estimator ĝ_n and its asymptotic distribution, except for the case s = 1, has long been missing from the literature due to certain non-standard structures of the likelihood function, such as non-separability and a lack of strictly positive second-order derivatives of the negative log-likelihood...
2018: Annals of Statistics
Weichen Wang, Jianqing Fan
We derive the asymptotic distributions of the spiked eigenvalues and eigenvectors under a generalized and unified asymptotic regime, which takes into account the magnitude of spiked eigenvalues, sample size, and dimensionality. This regime allows high dimensionality and diverging eigenvalues and provides new insights into the roles that the leading eigenvalues, sample size, and dimensionality play in principal component analysis. Our results are a natural extension of those in Paul (2007) to a more general setting and solve the rates of convergence problems in Shen et al...
June 2017: Annals of Statistics
Judith J Lok
In observational studies, treatment may be adapted to covariates at several times without a fixed protocol, in continuous time. Treatment influences covariates, which influence treatment, which influences covariates, and so on. Then even time-dependent Cox models cannot be used to estimate the net treatment effect. Structural nested models have been applied in this setting. Structural nested models are based on counterfactuals: the outcome a person would have had had treatment been withheld after a certain time...
April 2017: Annals of Statistics
Antoine Chambaz, Wenjing Zheng, Mark J van der Laan
This article studies the targeted sequential inference of an optimal treatment rule (TR) and its mean reward in the non-exceptional case, i.e., assuming that there is no stratum of the baseline covariates where treatment is neither beneficial nor harmful, and under a companion margin assumption. Our pivotal estimator, whose definition hinges on the targeted minimum loss estimation (TMLE) principle, actually infers the mean reward under the current estimate of the optimal TR. This data-adaptive statistical parameter is worthy of interest on its own...
2017: Annals of Statistics
Chuan-Fa Tang, Dewei Wang, Joshua M Tebbs
We propose Lp distance-based goodness-of-fit (GOF) tests for uniform stochastic ordering with two continuous distributions F and G, both of which are unknown. Our tests are motivated by the fact that when F and G are uniformly stochastically ordered, the ordinal dominance curve R = F∘G⁻¹ is star-shaped. We derive asymptotic distributions and prove that our testing procedure has a unique least favorable configuration of F and G for p ∈ [1,∞]. We use simulation to assess finite-sample performance and demonstrate that a modified, one-sample version of our procedure (e...
2017: Annals of Statistics
James E Johndrow, Anirban Bhattacharya, David B Dunson
Contingency table analysis routinely relies on log-linear models, with latent structure analysis providing a common alternative. Latent structure models lead to a reduced rank tensor factorization of the probability mass function for multivariate categorical data, while log-linear models achieve dimensionality reduction through sparsity. Little is known about the relationship between these notions of dimensionality reduction in the two paradigms. We derive several results relating the support of a log-linear model to nonnegative ranks of the associated probability tensor...
2017: Annals of Statistics
Ilya Shpitser, Eric Tchetgen Tchetgen
Identifying causal parameters from observational data is fraught with subtleties due to the issues of selection bias and confounding. In addition, more complex questions of interest, such as effects of treatment on the treated and mediated effects, may not always be identified even in data where treatment assignment is known and under investigator control, or may be identified under one causal model but not another. Increasingly complex effects of interest, coupled with the diversity of causal models in use, have resulted in a fragmented view of identification...
December 2016: Annals of Statistics