Computational Statistics & Data Analysis

S Faye Williamson, Peter Jacko, Sofía S Villar, Thomas Jaki
Development of treatments for rare diseases is challenging due to the limited number of patients available for participation. Learning about treatment effectiveness with a view to treat patients in the larger outside population, as in the traditional fixed randomised design, may not be a plausible goal. An alternative goal is to treat the patients within the trial as effectively as possible. Using the framework of finite-horizon Markov decision processes and dynamic programming (DP), a novel randomised response-adaptive design is proposed which maximises the total number of patient successes in the trial and penalises if a minimum number of patients are not recruited to each treatment arm...
September 2017: Computational Statistics & Data Analysis
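The dynamic programming idea in the abstract above can be illustrated on a much simpler problem than the proposed design. The sketch below is not the authors' design. It assumes two arms with binary outcomes, independent uniform Beta(1, 1) priors, deterministic allocation, and no penalty for under-recruited arms. The function name `expected_successes` is hypothetical.

```python
from functools import lru_cache


def expected_successes(n_patients):
    """Expected total successes under the Bayes-optimal allocation.

    Simplifying assumptions (not taken from the paper): two arms, binary
    outcomes, Beta(1, 1) priors, no randomisation, no recruitment penalty.
    """
    if n_patients < 0:
        raise ValueError("n_patients must be non-negative")

    @lru_cache(maxsize=None)
    def value(s1, f1, s2, f2):
        # State: successes and failures observed so far on each arm.
        if s1 + f1 + s2 + f2 == n_patients:
            return 0.0  # horizon reached, nobody left to treat
        # Posterior mean success probability of each arm under Beta(1, 1).
        p1 = (s1 + 1) / (s1 + f1 + 2)
        p2 = (s2 + 1) / (s2 + f2 + 2)
        # Expected successes from here on if the next patient gets arm 1 or 2.
        v1 = p1 * (1 + value(s1 + 1, f1, s2, f2)) + (1 - p1) * value(s1, f1 + 1, s2, f2)
        v2 = p2 * (1 + value(s1, f1, s2 + 1, f2)) + (1 - p2) * value(s1, f1, s2, f2 + 1)
        return max(v1, v2)

    return value(0, 0, 0, 0)


if __name__ == "__main__":
    for n in (1, 2, 10):
        print(n, expected_successes(n))
```

The number of states grows roughly with the fourth power of the trial size, so this recursive form is only practical for small trials.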
Shengtong Han, Hongmei Zhang, Wilfried Karmaus, Graham Roberts, Hasan Arshad
Background noise in cluster analyses can potentially mask the true underlying patterns. To tease out patterns unique to certain populations, a Bayesian semi-parametric clustering method is presented. It infers and adjusts for background noise. The method is built upon a mixture of the Dirichlet process and a point mass function. Simulations demonstrate the effectiveness of the proposed method. The method is then applied to analyze a longitudinal data set on allergic sensitization and asthma status.
May 2017: Computational Statistics & Data Analysis
Seongho Kim, Hyejeong Jang, Imhoi Koo, Joohyoung Lee, Xiang Zhang
Compared to other analytical platforms, comprehensive two-dimensional gas chromatography coupled with mass spectrometry (GC×GC-MS) has much greater separation power for the analysis of complex samples and thus is increasingly used in metabolomics for biomarker discovery. However, accurate peak detection remains a bottleneck for wide application of GC×GC-MS. Therefore, the normal-exponential-Bernoulli (NEB) model is generalized using the gamma distribution, and a new peak detection algorithm using the normal-gamma-Bernoulli (NGB) model is developed...
January 2017: Computational Statistics & Data Analysis
Jan Gertheiss, Jeff Goldsmith, Ana-Maria Staicu
Non-Gaussian functional data are considered and modeling through functional principal components analysis (FPCA) is discussed. The direct extension of popular FPCA techniques to the generalized case incorrectly uses a marginal mean estimate for a model that has an inherently conditional interpretation, and thus leads to biased estimates of population and subject-level effects. The methods proposed address this shortcoming by using either a two-stage or joint estimation strategy. The performance of all methods is compared numerically in simulations...
January 2017: Computational Statistics & Data Analysis
Daniel Ahfock, Saumyadipta Pyne, Sharon X Lee, Geoffrey J McLachlan
The statistical matching problem involves the integration of multiple datasets where some variables are not observed jointly. This missing data pattern leaves most statistical models unidentifiable. Statistical inference is still possible when operating under the framework of partially identified models, where the goal is to bound the parameters rather than to estimate them precisely. In many matching problems, developing feasible bounds on the parameters is equivalent to finding the set of positive-definite completions of a partially specified covariance matrix...
December 2016: Computational Statistics & Data Analysis
Chenxi Li
Inference for cause-specific hazards from competing risks data under interval censoring and possible left truncation has been understudied. Aiming at this target, a penalized likelihood approach for a Cox-type proportional cause-specific hazards model is developed, and the associated asymptotic theory is discussed. Monte Carlo simulations show that the approach performs very well for moderate sample sizes. An application to a longitudinal study of dementia illustrates the practical utility of the method. In the application, the age-specific hazards of AD, other dementia and death without dementia are estimated, and risk factors of all competing risks are studied...
December 2016: Computational Statistics & Data Analysis
Ling Chen, Jianguo Sun, Chengjie Xiong
Clustered interval-censored failure time data can occur when the failure time of interest is collected from several clusters and known only within certain time intervals. Regression analysis of clustered interval-censored failure time data is discussed assuming that the data arise from the semiparametric additive hazards model. A multiple imputation approach is proposed for inference. A major advantage of the approach is its simplicity because it avoids estimating the correlation within clusters by implementing a resampling-based method...
November 2016: Computational Statistics & Data Analysis
Hao Hu, Yichao Wu, Weixin Yao
Finite mixture models are useful tools and can be estimated via the EM algorithm. A main drawback is the strong parametric assumption about the component densities. In this paper, a much more flexible mixture model is considered, which assumes each component density to be log-concave. Under fairly general conditions, the log-concave maximum likelihood estimator (LCMLE) exists and is consistent. Numerical examples demonstrate that the LCMLE improves the clustering results compared with the traditional MLE for parametric mixture models...
September 2016: Computational Statistics & Data Analysis
Fadlalla G Elfadaly, Paul H Garthwaite, John R Crawford
Mahalanobis distance may be used as a measure of the disparity between an individual's profile of scores and the average profile of a population of controls. The degree to which the individual's profile is unusual can then be equated to the proportion of the population who would have a larger Mahalanobis distance than the individual. Several estimators of this proportion are examined. These include plug-in maximum likelihood estimators, medians, the posterior mean from a Bayesian probability matching prior, an estimator derived from a Taylor expansion, and two forms of polynomial approximation, one based on Bernstein polynomial and one on a quadrature method...
July 2016: Computational Statistics & Data Analysis
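As an illustration of the quantity estimated in the abstract above, the following sketch computes a plug-in estimate of the proportion of the control population with a larger Mahalanobis distance than a given individual. It rests on assumptions not stated in the abstract: exactly two scores per profile, multivariate normal controls, and a chi-square reference distribution with 2 degrees of freedom, whose upper tail is exp(-d²/2). The function names are hypothetical, and this is not claimed to be the authors' implementation of any of their estimators.

```python
import math


def mahalanobis_sq_2d(x, mean, cov):
    """Squared Mahalanobis distance of a 2-vector x from mean, given a 2x2 cov."""
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det <= 0:
        raise ValueError("covariance matrix must be positive definite")
    # The inverse of [[a, b], [c, d]] is [[d, -b], [-c, a]] / det.
    return (dx * (d * dx - b * dy) + dy * (-c * dx + a * dy)) / det


def plug_in_tail_proportion(x, controls):
    """Plug-in estimate of the proportion of controls farther out than x.

    Assumes bivariate normal controls; the mean and covariance are the
    maximum likelihood estimates (covariance divides by n, not n - 1).
    """
    n = len(controls)
    mx = sum(p[0] for p in controls) / n
    my = sum(p[1] for p in controls) / n
    sxx = sum((p[0] - mx) ** 2 for p in controls) / n
    syy = sum((p[1] - my) ** 2 for p in controls) / n
    sxy = sum((p[0] - mx) * (p[1] - my) for p in controls) / n
    d2 = mahalanobis_sq_2d(x, (mx, my), ((sxx, sxy), (sxy, syy)))
    # Upper tail of a chi-square distribution with 2 degrees of freedom.
    return math.exp(-d2 / 2.0)
```

With more than two scores, the same plug-in idea applies, but the tail probability comes from a chi-square distribution with as many degrees of freedom as there are scores and has no such simple closed form in general.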
Qianchuan He, Linglong Kong, Yanhua Wang, Sijian Wang, Timothy A Chan, Eric Holland
Genetic studies often involve quantitative traits. Identifying genetic features that influence quantitative traits can help to uncover the etiology of diseases. The quantile regression method considers the conditional quantiles of the response variable, and is able to characterize the underlying regression structure in a more comprehensive manner. On the other hand, genetic studies often involve high-dimensional genomic features, and the underlying regression structure may be heterogeneous in terms of both effect sizes and sparsity...
March 2016: Computational Statistics & Data Analysis
Dipankar Bandyopadhyay, M Amalia Jácome
In studies involving nonparametric testing of the equality of two or more survival distributions, the survival curves can exhibit a wide variety of behaviors such as proportional hazards, early/late differences, and crossing hazards. As alternatives to the classical logrank test, the weighted Kaplan-Meier (WKM) type statistics and their variations were developed to handle these situations. However, their applicability is limited to cases where the population membership is available for all observations, including the right censored ones...
March 1, 2016: Computational Statistics & Data Analysis
Cheng Cheng
In large scale genomic analyses dealing with detecting genotype-phenotype associations, such as genome wide association studies (GWAS), it is desirable to have numerically and statistically robust procedures to test the stochastic independence null hypothesis against certain alternatives. Motivated by a special case in a GWAS, a novel test procedure called correlation profile test (CPT) is developed for testing genomic associations with failure-time phenotypes subject to right censoring and competing risks...
March 1, 2016: Computational Statistics & Data Analysis
Joseph Usset, Ana-Maria Staicu, Arnab Maity
A functional regression model with a scalar response and multiple functional predictors is proposed that accommodates two-way interactions in addition to their main effects. The proposed estimation procedure models the main effects using penalized regression splines, and the interaction effect by a tensor product basis. Extensions to generalized linear models and data observed on sparse grids or with measurement error are presented. A hypothesis testing procedure for the functional interaction effect is described...
February 1, 2016: Computational Statistics & Data Analysis
Adam Ciarleglio, R Todd Ogden
Classical finite mixture regression is useful for modeling the relationship between scalar predictors and scalar responses arising from subpopulations defined by the differing associations between those predictors and responses. The classical finite mixture regression model is extended to incorporate functional predictors by taking a wavelet-based approach in which both the functional predictors and the component-specific coefficient functions are represented in terms of an appropriate wavelet basis. By using the wavelet representation of the model, the coefficients corresponding to the functional covariates become the predictors...
January 1, 2016: Computational Statistics & Data Analysis
Yanqing Sun, Mei Li, Peter B Gilbert
Motivated by the need to assess HIV vaccine efficacy, previous studies proposed an extension of the discrete competing risks proportional hazards model, in which the cause of failure is replaced by a continuous mark only observed at the failure time. However, the model assumptions may fail in several ways, and no diagnostic testing procedure for this situation has been proposed. A goodness-of-fit test procedure is proposed for the stratified mark-specific proportional hazards model, in which the regression parameters depend nonparametrically on the mark and the baseline hazards depend nonparametrically on both time and the mark...
January 1, 2016: Computational Statistics & Data Analysis
Tong Tong Wu, Kenneth Lange
Matrix completion discriminant analysis (MCDA) is designed for semi-supervised learning where the rate of missingness is high and predictors vastly outnumber cases. MCDA operates by mapping class labels to the vertices of a regular simplex. With c classes, these vertices are arranged on the surface of the unit sphere in c - 1 dimensional Euclidean space. Because all pairs of vertices are equidistant, the classes are treated symmetrically. To assign unlabeled cases to classes, the data is entered into a large matrix (cases along rows and predictors along columns) that is augmented by vertex coordinates stored in the last c - 1 columns...
December 2015: Computational Statistics & Data Analysis
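The simplex geometry described in the abstract above can be checked numerically. The sketch below uses one standard closed-form construction of c equidistant unit vectors in c - 1 dimensions. The abstract does not say which construction the authors use, so the formula and the function name `simplex_vertices` are assumptions made for illustration.

```python
import math


def simplex_vertices(c):
    """Return c unit vectors in (c - 1)-dimensional space, all equidistant.

    Assumed construction: the first vertex is the constant vector
    (c - 1)^(-1/2) * 1; each remaining vertex is r * 1 + s * e_j, with
    r = -(1 + sqrt(c)) / (c - 1)^(3/2) and s = sqrt(c / (c - 1)).
    """
    if c < 2:
        raise ValueError("need at least two classes")
    d = c - 1
    r = -(1.0 + math.sqrt(c)) / d ** 1.5
    s = math.sqrt(c / d)
    vertices = [[1.0 / math.sqrt(d)] * d]
    for j in range(d):
        v = [r] * d
        v[j] += s  # shift the j-th coordinate along the unit vector e_j
        vertices.append(v)
    return vertices


def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


if __name__ == "__main__":
    for vertex in simplex_vertices(3):
        print(vertex)
```

Under this construction every vertex has unit length and every pair of vertices is separated by a squared distance of 2c / (c - 1), which is what allows the classes to be treated symmetrically.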
Bruce J Swihart, Naresh M Punjabi, Ciprian M Crainiceanu
Methods are introduced for the analysis of large sets of sleep study data (hypnograms) using a 5-state 20-transition-type structure defined by the American Academy of Sleep Medicine. Application of these methods to the hypnograms of 5598 subjects from the Sleep Heart Health Study provides the first analysis of sleep hypnogram data of such size and complexity in a community cohort with a range of sleep-disordered breathing severity; introduces a novel approach for comparing 5-state (20-transition-type) with 3-state (6-transition-type) sleep structures to assess information loss from combining sleep state categories; extends current approaches of multivariate survival data analysis to clustered, recurrent event discrete-state discrete-time processes; and provides scalable solutions for data analyses required by the case study...
September 2015: Computational Statistics & Data Analysis
Chen Yue, Shaojie Chen, Haris I Sair, Raag Airan, Brian S Caffo
Data reproducibility is a critical issue in all scientific experiments. In this manuscript, the problem of quantifying the reproducibility of graphical measurements is considered. The image intra-class correlation coefficient (I2C2) is generalized and the graphical intra-class correlation coefficient (GICC) is proposed for this purpose. The concept for GICC is based on multivariate probit-linear mixed effect models. A Markov Chain Monte Carlo EM (MCMC-EM) algorithm is used for estimating the GICC. Simulation results with varied settings are demonstrated and our method is applied to the KIRBY21 test-retest dataset...
September 2015: Computational Statistics & Data Analysis
Adrian W Bowman, Stanislav Katina, Joanna Smith, Denise Brown
Methods for capturing images in three dimensions are now widely available, with stereo-photogrammetry and laser scanning being two common approaches. In anatomical studies, a number of landmarks are usually identified manually from each of these images and these form the basis of subsequent statistical analysis. However, landmarks express only a very small proportion of the information available from the images. Anatomically defined curves have the advantage of providing a much richer expression of shape. This is explored in the context of identifying the boundary of breasts from an image of the female torso and the boundary of the lips from a facial image...
June 2015: Computational Statistics & Data Analysis
Hong Zhu, Bo Lu
This article considers the practical problem in clinical and observational studies where multiple treatment or prognostic groups are compared and the observed survival data are subject to right censoring. Two possible formulations of multiple comparisons are suggested. Multiple Comparisons with a Control (MCC) compare every other group to a control group with respect to survival outcomes, for determining which groups are associated with lower risk than the control. Multiple Comparisons with the Best (MCB) compare each group to the truly minimum risk group and identify the groups that are either with the minimum risk or the practically minimum risk...
June 1, 2015: Computational Statistics & Data Analysis