Journal of Computational and Graphical Statistics

J T Gaskins, M J Daniels
The estimation of the covariance matrix is a key concern in the analysis of longitudinal data. When the data consist of multiple groups, it is often assumed that the covariance matrices are either equal across groups or completely distinct. We seek methodology to allow borrowing of strength across potentially similar groups to improve estimation. To that end, we introduce a covariance partition prior which proposes a partition of the groups at each measurement time. Groups in the same set of the partition share dependence parameters for the distribution of the current measurement given the preceding ones, and the sequence of partitions is modeled as a Markov chain to encourage similar structure at nearby measurement times...
January 2, 2016: Journal of Computational and Graphical Statistics
Chong Zhang, Yufeng Liu, Junhui Wang, Hongtu Zhu
The Support Vector Machine (SVM) is a very popular classification tool with many successful applications. It was originally designed for binary problems with desirable theoretical properties. Although there exist various Multicategory SVM (MSVM) extensions in the literature, some challenges remain. In particular, most existing MSVMs make use of k classification functions for a k-class problem, and the corresponding optimization problems are typically handled by existing quadratic programming solvers. In this paper, we propose a new group of MSVMs, namely the Reinforced Angle-based MSVMs (RAMSVMs), using an angle-based prediction rule with k - 1 functions directly...
2016: Journal of Computational and Graphical Statistics
Lizhen Xu, Radu V Craiu, Lei Sun, Andrew D Paterson
Motivated by genetic association studies of pleiotropy, we propose a Bayesian latent variable approach to jointly study multiple outcomes. The models studied here can incorporate both continuous and binary responses, and can account for serial and cluster correlations. We consider Bayesian estimation for the model parameters, and we develop a novel MCMC algorithm that builds upon hierarchical centering and parameter expansion techniques to efficiently sample from the posterior distribution. We evaluate the proposed method via extensive simulations and demonstrate its utility with an application to an association study of various complication outcomes related to type 1 diabetes...
2016: Journal of Computational and Graphical Statistics
Bruce D Bugbee, F Jay Breidt, Mark J van der Woerd
Variational approximations provide fast, deterministic alternatives to Markov Chain Monte Carlo for Bayesian inference on the parameters of complex, hierarchical models. Variational approximations are often limited in practicality in the absence of conjugate posterior distributions. Recent work has focused on the application of variational methods to models with only partial conjugacy, such as in semiparametric regression with heteroskedastic errors. Here, both the mean and log variance functions are modeled as smooth functions of covariates...
2016: Journal of Computational and Graphical Statistics
Leo L Duan, John P Clancy, Rhonda D Szczesniak
We propose a novel "tree-averaging" model that utilizes the ensemble of classification and regression trees (CART). Each constituent tree is estimated with a subset of similar data. We treat this grouping of subsets as Bayesian Ensemble Trees (BET) and model them as a Dirichlet process. We show that BET determines the optimal number of trees by adapting to the data heterogeneity. Compared with other ensemble methods, BET requires far fewer trees and shows equivalent prediction accuracy using weighted averaging...
2016: Journal of Computational and Graphical Statistics
Hui Jiang, John Chong Mu, Kun Yang, Chao Du, Luo Lu, Wing Hung Wong
Optional Pólya tree (OPT) is a flexible nonparametric Bayesian prior for density estimation. Despite its merits, the computation for OPT inference is challenging. In this paper we present time complexity analysis for OPT inference and propose two algorithmic improvements. The first improvement, named limited-lookahead optional Pólya tree (LL-OPT), aims at accelerating the computation for OPT inference. The second improvement modifies the output of OPT or LL-OPT and produces a continuous piecewise linear density estimate...
2016: Journal of Computational and Graphical Statistics
Hyonho Chun, Xianghua Zhang, Hongyu Zhao
Revealing biological networks is one key objective in systems biology. With microarrays, researchers now routinely measure expression profiles at the genome level under various conditions, and such data may be utilized to statistically infer gene regulation networks. Gaussian graphical models (GGMs) have proven useful for this purpose by modeling the Markovian dependence among genes. However, a single GGM may not be adequate to describe the potentially differing networks across various conditions, and hence it is more natural to infer multiple GGMs from such data...
October 1, 2015: Journal of Computational and Graphical Statistics
John Hughes
Non-Gaussian spatial data are common in many fields. When fitting regressions for such data, one needs to account for spatial dependence to ensure reliable inference for the regression coefficients. The two most commonly used regression models for spatially aggregated data are the automodel and the areal generalized linear mixed model (GLMM). These models induce spatial dependence in different ways but share the smoothing approach, which is intuitive but problematic. This article develops a new regression model for areal data...
September 16, 2015: Journal of Computational and Graphical Statistics
Wei Xiao, Yichao Wu, Hua Zhou
Least angle regression (LAR) was proposed by Efron, Hastie, Johnstone and Tibshirani (2004) for continuous model selection in linear regression. It is motivated by a geometric argument and tracks a path along which the predictors enter successively and the active predictors always maintain the same absolute correlation (angle) with the residual vector. Although it gained popularity quickly, its extensions seem rare compared to the penalty methods. In this expository article, we show that the powerful geometric idea of LAR can be generalized in a fruitful way...
July 1, 2015: Journal of Computational and Graphical Statistics
Michael Salter-Townshend, Thomas Brendan Murphy
A novel and flexible framework for investigating the roles of actors within a network is introduced. Particular interest is in roles as defined by local network connectivity patterns, identified using the ego-networks extracted from the network. A mixture of Exponential-family Random Graph Models is developed for these ego-networks in order to cluster the nodes into roles. We refer to this model as the ego-ERGM. An Expectation-Maximization algorithm is developed to infer the unobserved cluster assignments and to estimate the mixture model parameters using a maximum pseudo-likelihood approximation...
June 1, 2015: Journal of Computational and Graphical Statistics
Jianhua Hu, Peng Wang, Annie Qu
Identifying correlation structure is important to achieving estimation efficiency in analyzing longitudinal data, and is also crucial for drawing valid statistical inference for large size clustered data. In this paper, we propose a nonparametric method to estimate the correlation structure, which is applicable for discrete longitudinal data. We utilize eigenvector-based basis matrices to approximate the inverse of the empirical correlation matrix and determine the number of basis matrices via model selection...
April 1, 2015: Journal of Computational and Graphical Statistics
Fabian Scheipl, Ana-Maria Staicu, Sonja Greven
We propose an extensive framework for additive regression models for correlated functional responses, allowing for multiple partially nested or crossed functional random effects with flexible correlation structures for, e.g., spatial, temporal, or longitudinal functional data. Additionally, our framework includes linear and nonlinear effects of functional and scalar covariates that may vary smoothly over the index of the functional response. It accommodates densely or sparsely observed functional responses and predictors which may be observed with additional error and includes both spline-based and functional principal component-based terms...
April 1, 2015: Journal of Computational and Graphical Statistics
Xin Qi, Ruiyan Luo, Raymond J Carroll, Hongyu Zhao
Recent years have seen active developments of various penalized regression methods, such as LASSO and elastic net, to analyze high dimensional data. In these approaches, the direction and length of the regression coefficients are determined simultaneously. Due to the introduction of penalties, the length of the estimates can be far from being optimal for accurate predictions. We introduce a new framework, regression by projection, and its sparse version to analyze high dimensional data. The unique nature of this framework is that the directions of the regression coefficients are inferred first, and the lengths and the tuning parameters are determined by a cross validation procedure to achieve the largest prediction accuracy...
April 1, 2015: Journal of Computational and Graphical Statistics
Shiwei Lan, Vasileios Stathopoulos, Babak Shahbaba, Mark Girolami
Hamiltonian Monte Carlo (HMC) improves the computational efficiency of the Metropolis-Hastings algorithm by reducing its random walk behavior. Riemannian HMC (RHMC) further improves the performance of HMC by exploiting the geometric properties of the parameter space. However, the geometric integrator used for RHMC involves implicit equations that require fixed-point iterations. In some cases, the computational overhead for solving implicit equations undermines RHMC's benefits. In an attempt to circumvent this problem, we propose an explicit integrator that replaces the momentum variable in RHMC by velocity...
April 1, 2015: Journal of Computational and Graphical Statistics
Jian Guo, Elizaveta Levina, George Michailidis, Ji Zhu
A graphical model for ordinal variables is considered, where it is assumed that the data are generated by discretizing the marginal distributions of a latent multivariate Gaussian distribution. The relationships between these ordinal variables are then described by the underlying Gaussian graphical model and can be inferred by estimating the corresponding concentration matrix. Direct estimation of the model is computationally expensive, but an approximate EM-like algorithm is developed to provide an accurate estimate of the parameters at a fraction of the computational cost...
March 31, 2015: Journal of Computational and Graphical Statistics
Eric C Chi, Kenneth Lange
Clustering is a fundamental problem in many scientific applications. Standard methods such as k-means, Gaussian mixture models, and hierarchical clustering, however, are beset by local minima, which are sometimes drastically suboptimal. Recently introduced convex relaxations of k-means and hierarchical clustering shrink cluster centroids toward one another and ensure a unique global minimizer. In this work we present two splitting methods for solving the convex clustering problem. The first is an instance of the alternating direction method of multipliers (ADMM); the second is an instance of the alternating minimization algorithm (AMA)...
2015: Journal of Computational and Graphical Statistics
Michael Lim, Trevor Hastie
We introduce a method for learning pairwise interactions in a linear regression or logistic regression model in a manner that satisfies strong hierarchy: whenever an interaction is estimated to be nonzero, both its associated main effects are also included in the model. We motivate our approach by modeling pairwise interactions for categorical variables with arbitrary numbers of levels, and then show how we can accommodate continuous variables as well. Our approach allows us to dispense with explicitly applying constraints on the main effects and interactions for identifiability, which results in interpretable interaction models...
2015: Journal of Computational and Graphical Statistics
Hanwen Huang, Yufeng Liu, Ming Yuan, J S Marron
Clustering methods have led to a number of important discoveries in bioinformatics and beyond. A major challenge in their use is determining which clusters represent important underlying structure, as opposed to spurious sampling artifacts. This challenge is especially serious, and very few methods are available, when the data are very high in dimension. Statistical Significance of Clustering (SigClust) is a recently developed cluster evaluation tool for high dimensional low sample size data. An important component of the SigClust approach is the very definition of a single cluster as a subset of data sampled from a multivariate Gaussian distribution...
2015: Journal of Computational and Graphical Statistics
Nicole Bohme Carnegie, Pavel N Krivitsky, David R Hunter, Steven M Goodreau
There has been a great deal of interest recently in the modeling and simulation of dynamic networks, i.e., networks that change over time. One promising model is the separable temporal exponential-family random graph model (ERGM) of Krivitsky and Handcock, which treats the formation and dissolution of ties in parallel at each time step as independent ERGMs. However, the computational cost of fitting these models can be substantial, particularly for large, sparse networks. Fitting cross-sectional models for observations of a network at a single point in time, while still a non-negligible computational burden, is much easier...
2015: Journal of Computational and Graphical Statistics
Ruixin Guo, Mihye Ahn, Hongtu Zhu
The aim of this paper is to develop a supervised dimension reduction framework, called Spatially Weighted Principal Component Analysis (SWPCA), for high dimensional imaging classification. Two main challenges in imaging classification are the high dimensionality of the feature space and the complex spatial structure of imaging data. In SWPCA, we introduce two sets of novel weights including global and local spatial weights, which enable a selective treatment of individual features and incorporation of the spatial structure of imaging data and class label information...
January 2015: Journal of Computational and Graphical Statistics