Papers in the journal Electronic Journal of Statistics (Page 2)

#21

JOURNAL ARTICLE

SiAM: A Hybrid of Single Index Models and Additive Models.

Shujie Ma, Heng Lian, Hua Liang, Raymond J Carroll

While popular, single index models and additive models have potential limitations, a fact that leads us to propose SiAM, a novel hybrid combination of these two models. We first address model identifiability under general assumptions. The result is of independent interest. We then develop an estimation procedure by using splines to approximate unknown functions and establish the asymptotic properties of the resulting estimators. Furthermore, we suggest a two-step procedure for establishing confidence bands for the nonparametric additive functions...

29104711

2017: Electronic Journal of Statistics

#22

JOURNAL ARTICLE

Estimation and inference of error-prone covariate effect in the presence of confounding variables.

Jianxuan Liu, Yanyuan Ma, Liping Zhu, Raymond J Carroll

We introduce a general single index semiparametric measurement error model for the case that the main covariate of interest is measured with error and modeled parametrically, and where there are many other variables also important to the modeling. We propose a semiparametric bias-correction approach to estimate the effect of the covariate of interest. The resultant estimators are shown to be root-n consistent, asymptotically normal and locally efficient. Comprehensive simulations and an analysis of an empirical data set are performed to demonstrate the finite sample performance and the bias reduction of the locally efficient estimators...

28983388

2017: Electronic Journal of Statistics

#23

JOURNAL ARTICLE

Semiparametric Single-Index Model for Estimating Optimal Individualized Treatment Strategy.

Rui Song, Shikai Luo, Donglin Zeng, Hao Helen Zhang, Wenbin Lu, Zhiguo Li

Different from the standard treatment discovery framework which is used for finding single treatments for a homogeneous group of patients, personalized medicine involves finding therapies that are tailored to each individual in a heterogeneous group. In this paper, we propose a new semiparametric additive single-index model for estimating individualized treatment strategy. The model assumes a flexible and nonparametric link function for the interaction between treatment and predictive covariates. We estimate the rule via monotone B-splines and establish the asymptotic properties of the estimators...

28959371

2017: Electronic Journal of Statistics

#24

JOURNAL ARTICLE

Nearly assumptionless screening for the mutually-exciting multivariate Hawkes process.

Shizhe Chen, Daniela Witten, Ali Shojaie

We consider the task of learning the structure of the graph underlying a mutually-exciting multivariate Hawkes process in the high-dimensional setting. We propose a simple and computationally inexpensive edge screening approach. Under a subset of the assumptions required for penalized estimation approaches to recover the graph, this edge screening approach has the sure screening property: with high probability, the screened edge set is a superset of the true edge set. Furthermore, the screened edge set is relatively small...

28845209

2017: Electronic Journal of Statistics

#25

JOURNAL ARTICLE

Scalable Bayesian nonparametric measures for exploring pairwise dependence via Dirichlet Process Mixtures.

Sarah Filippi, Chris C Holmes, Luis E Nieto-Barajas

In this article we propose novel Bayesian nonparametric methods using Dirichlet Process Mixture (DPM) models for detecting pairwise dependence between random variables while accounting for uncertainty in the form of the underlying distributions. A key criterion is that the procedures should scale to large data sets. In this regard we find that the formal calculation of the Bayes factor for a dependent-vs.-independent DPM joint probability measure is not feasible computationally. To address this we present Bayesian diagnostic measures for characterising evidence against a "null model" of pairwise independence...

29707100

November 16, 2016: Electronic Journal of Statistics

#26

JOURNAL ARTICLE

Scalable Bayesian nonparametric regression via a Plackett-Luce model for conditional ranks.

Tristan Gray-Davies, Chris C Holmes, François Caron

We present a novel Bayesian nonparametric regression model for covariates X and continuous response variable Y ∈ ℝ. The model is parametrized in terms of marginal distributions for Y and X and a regression function which tunes the stochastic ordering of the conditional distributions F(y|x). By adopting an approximate composite likelihood approach, we show that the resulting posterior inference can be decoupled for the separate components of the model. This procedure can scale to very large datasets and allows standard, existing software from Bayesian nonparametric density estimation and Plackett-Luce ranking estimation to be applied...

29623150

July 18, 2016: Electronic Journal of Statistics

#27

JOURNAL ARTICLE

Empirical likelihood based tests for stochastic ordering under right censorship.

Hsin-Wen Chang, Ian W McKeague

This paper develops an empirical likelihood approach to testing for stochastic ordering between two univariate distributions under right censorship. The proposed test is based on a maximally selected local empirical likelihood statistic. The asymptotic null distribution is expressed in terms of a Brownian bridge. The new procedure is shown via a simulation study to have superior power to the log-rank and weighted Kaplan-Meier tests under crossing hazard alternatives. The approach is illustrated using data from a randomized clinical trial involving the treatment of severe alcoholic hepatitis...

31178947

2016: Electronic Journal of Statistics

#28

JOURNAL ARTICLE

Designing penalty functions in high dimensional problems: The role of tuning parameters.

Ting-Huei Chen, Wei Sun, Jason P Fine

Various forms of penalty functions have been developed for regularized estimation and variable selection. Screening approaches are often used to reduce the number of covariates before penalized estimation. However, in certain problems, the number of covariates remains large after screening. For example, in genome-wide association (GWA) studies, the purpose is to identify Single Nucleotide Polymorphisms (SNPs) that are associated with certain traits, and typically there are millions of SNPs and thousands of samples...

28989558

2016: Electronic Journal of Statistics

#29

JOURNAL ARTICLE

Estimation of multiple networks in Gaussian mixture models.

Chen Gao, Yunzhang Zhu, Xiaotong Shen, Wei Pan

We aim to estimate multiple networks in the presence of sample heterogeneity, where the independent samples (i.e. observations) may come from different and unknown populations or distributions. Specifically, we consider penalized estimation of multiple precision matrices in the framework of a Gaussian mixture model. A major innovation is to take advantage of the commonalities across the multiple precision matrices through possibly nonconvex fusion regularization, which for example makes it possible to achieve simultaneous discovery of unknown disease subtypes and detection of differential gene (dys)regulations in functional genomics...

28966702

2016: Electronic Journal of Statistics

#30

JOURNAL ARTICLE

Robust learning for optimal treatment decision with NP-dimensionality.

Chengchun Shi, Rui Song, Wenbin Lu

In order to identify important variables that are involved in making optimal treatment decisions, Lu, Zhang and Zeng (2013) proposed a penalized least squares regression framework for a fixed number of predictors, which is robust against misspecification of the conditional mean model. Two problems arise: (i) in a world of explosively big data, effective methods are needed to handle ultra-high dimensional data sets, for example, where the dimension of the predictors is of non-polynomial (NP) order of the sample size; (ii) both the propensity score and conditional mean models need to be estimated from data under NP dimensionality...

28781717

2016: Electronic Journal of Statistics

#31

JOURNAL ARTICLE

Estimation of High-Dimensional Graphical Models Using Regularized Score Matching.

Lina Lin, Mathias Drton, Ali Shojaie

Graphical models are widely used to model stochastic dependences among large collections of variables. We introduce a new method of estimating undirected conditional independence graphs based on the score matching loss, introduced by Hyvärinen (2005), and subsequently extended in Hyvärinen (2007). The regularized score matching method we propose applies to settings with continuous observations and allows for computationally efficient treatment of possibly non-Gaussian exponential family models. In the well-explored Gaussian setting, regularized score matching avoids issues of asymmetry that arise when applying the technique of neighborhood selection, and compared to existing methods that directly yield symmetric estimates, the score matching approach has the advantage that the considered loss is quadratic and gives piecewise linear solution paths under ℓ1 regularization...

28638498

2016: Electronic Journal of Statistics

#32

JOURNAL ARTICLE

On convex least squares estimation when the truth is linear.

Yining Chen, Jon A Wellner

We prove that the convex least squares estimator (LSE) attains a n^{-1/2} pointwise rate of convergence in any region where the truth is linear. In addition, the asymptotic distribution can be characterized by a modified invelope process. Analogous results hold when one uses the derivative of the convex LSE to perform derivative estimation. These asymptotic results facilitate a new consistent testing procedure on the linearity against a convex alternative. Moreover, we show that the convex LSE adapts to the optimal rate at the boundary points of the region where the truth is linear, up to a log-log factor...

28503251

2016: Electronic Journal of Statistics

#33

JOURNAL ARTICLE

Joint Estimation of Precision Matrices in Heterogeneous Populations.

Takumi Saegusa, Ali Shojaie

We introduce a general framework for estimation of inverse covariance, or precision, matrices from heterogeneous populations. The proposed framework uses a Laplacian shrinkage penalty to encourage similarity among estimates from disparate, but related, subpopulations, while allowing for differences among matrices. We propose an efficient alternating direction method of multipliers (ADMM) algorithm for parameter estimation, as well as its extension for faster computation in high dimensions by thresholding the empirical covariance matrix to identify the joint block diagonal structure in the estimated precision matrices...

28473876

2016: Electronic Journal of Statistics

#34

Statistical properties of convex clustering.

Kean Ming Tan, Daniela Witten

In this manuscript, we study the statistical properties of convex clustering. We establish that convex clustering is closely related to single linkage hierarchical clustering and k-means clustering. In addition, we derive the range of the tuning parameter for convex clustering that yields a non-trivial solution. We also provide an unbiased estimator of the degrees of freedom, and provide a finite sample bound for the prediction error for convex clustering. We compare convex clustering to some traditional clustering methods in simulation studies...

27617051

2015: Electronic Journal of Statistics

#35

A test of homogeneity for age-dependent branching processes with immigration.

Ollivier Hyrien, Nikolay M Yanev, Craig T Jordan

We propose a novel procedure to test whether the immigration process of a discretely observed age-dependent branching process with immigration is time-homogeneous. The construction of the test is motivated by the behavior of the coefficient of variation of the population size. When immigration is time-homogeneous, we find that this coefficient converges to a constant, whereas when immigration is time-inhomogeneous we find that it is time-dependent, at least transiently. Thus, we test the assumption that the immigration process is time-homogeneous by verifying that the sample coefficient of variation does not vary significantly over time...

27134694

2015: Electronic Journal of Statistics

#36

Computationally efficient confidence intervals for cross-validated area under the ROC curve estimates.

Erin LeDell, Maya Petersen, Mark van der Laan

In binary classification problems, the area under the ROC curve (AUC) is commonly used to evaluate the performance of a prediction model. Often, it is combined with cross-validation in order to assess how the results will generalize to an independent data set. In order to evaluate the quality of an estimate for cross-validated AUC, we obtain an estimate of its variance. For massive data sets, the process of generating a single performance estimate can be computationally expensive. Additionally, when using a complex prediction method, the process of cross-validating a predictive model on even a relatively small data set can still require a large amount of computation time...

26279737

2015: Electronic Journal of Statistics

#37

Bootstrapping a change-point Cox model for survival data.

Gongjun Xu, Bodhisattva Sen, Zhiliang Ying

This paper investigates the (in)-consistency of various bootstrap methods for making inference on a change-point in time in the Cox model with right censored survival data. A criterion is established for the consistency of any bootstrap method. It is shown that the usual nonparametric bootstrap is inconsistent for the maximum partial likelihood estimation of the change-point. A new model-based bootstrap approach is proposed and its consistency established. Simulation studies are carried out to assess the performance of various bootstrap schemes...

25400719

August 20, 2014: Electronic Journal of Statistics

#38

Estimating hidden population size using Respondent-Driven Sampling data.

Mark S Handcock, Krista J Gile, Corinne M Mar

Respondent-Driven Sampling (RDS) is an approach to sampling design and inference in hard-to-reach human populations. It is often used in situations where the target population is rare and/or stigmatized in the larger population, so that it is prohibitively expensive to contact them through the available frames. Common examples include injecting drug users, men who have sex with men, and female sex workers. Most analysis of RDS data has focused on estimating aggregate characteristics, such as disease prevalence...

26180577

2014: Electronic Journal of Statistics

#39

Comment on "Dynamic treatment regimes: technical challenges and applications"

Yair Goldberg, Rui Song, Donglin Zeng, Michael R Kosorok

Inference for parameters associated with optimal dynamic treatment regimes is challenging as these estimators are nonregular when there are non-responders to treatments. In this discussion, we comment on three aspects of alleviating this nonregularity. We first discuss an alternative approach for smoothing the quality functions. We then discuss some further details on our existing work to identify non-responders through penalization. Third, we propose a clinically meaningful value assessment whose estimator does not suffer from nonregularity...

25485028

2014: Electronic Journal of Statistics

#40

Dynamic treatment regimes: technical challenges and applications.

Eric B Laber, Daniel J Lizotte, Min Qian, William E Pelham, Susan A Murphy

Dynamic treatment regimes are of growing interest across the clinical sciences because these regimes provide one way to operationalize and thus inform sequential personalized clinical decision making. Formally, a dynamic treatment regime is a sequence of decision rules, one per stage of clinical intervention. Each decision rule maps up-to-date patient information to a recommended treatment. We briefly review a variety of approaches for using data to construct the decision rules. We then review a critical inferential challenge that results from nonregularity, which often arises in this area...

25356091

2014: Electronic Journal of Statistics
