Marla Johnson, Elizabeth Purdom
Sequencing of messenger RNA (mRNA) can provide estimates of the levels of individual isoforms within the cell. It remains to adapt many standard statistical methods commonly used for analyzing gene expression levels to take advantage of this additional information. One novel question is whether we can find clusters of samples that are distinguished not by their gene expression but by their isoform usage. We propose a novel approach for clustering mRNA-Seq data that identifies such clusters. We show via simulation that our methods are more sensitive to finding clusters based on isoform usage than standard clustering techniques...
October 25, 2016: Biostatistics
F Towfic, R Kusko, B Zeskind
The article by Nygaard et al. proposes that applying batch correction approaches to microarray data from studies with unbalanced designs may inadvertently exaggerate the differences observed. In seeking to illustrate their point, Nygaard et al. utilized a dataset (GSE61901) from a study we published (Towfic and others, 2014) and showed that one analysis pipeline utilizing the traditional approach to batch correction (ComBat) yielded over 1000 differentially expressed probesets, while an alternative approach proposed by Nygaard et al. (utilizing batch as a fixed effect and averaging technical replicates) recovered 11 differentially expressed probesets...
October 25, 2016: Biostatistics
Matthew Stephens
We introduce a new Empirical Bayes approach for large-scale hypothesis testing, including estimating false discovery rates (FDRs), and effect sizes. This approach has two key differences from existing approaches to FDR analysis. First, it assumes that the distribution of the actual (unobserved) effects is unimodal, with a mode at 0. This "unimodal assumption" (UA), although natural in many contexts, is not usually incorporated into standard FDR analysis, and we demonstrate how incorporating it brings many benefits...
October 17, 2016: Biostatistics
Glenn Heller, Venkatraman E Seshan, Chaya S Moskowitz, Mithat Gönen
The area under the curve (AUC) statistic is a common measure of model performance in a binary regression model. Nested models are used to ascertain whether the AUC statistic increases when new factors enter the model. The regression coefficient estimates used in the AUC statistics are computed using the maximum rank correlation methodology. Typically, inference for the difference in AUC statistics from nested models is derived under asymptotic normality. In this work, it is demonstrated that the asymptotic normality is true only when at least one of the new factors is associated with the binary outcome...
September 21, 2016: Biostatistics
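To make the AUC comparison described in the abstract above concrete, here is a minimal sketch of the empirical AUC, computed as the fraction of (case, control) pairs that a score ranks correctly. It is only an illustration under stated assumptions: the function name `auc` and all scores are hypothetical, the scores are taken as given rather than fitted by maximum rank correlation, and no inference on the difference is attempted.

```python
from itertools import product


def auc(scores_pos, scores_neg):
    """Empirical AUC: share of (positive, negative) pairs ranked correctly.

    Ties count as half a correct ranking (the Mann-Whitney convention).
    """
    if not scores_pos or not scores_neg:
        raise ValueError("need at least one score in each outcome group")
    total = 0.0
    for p, n in product(scores_pos, scores_neg):
        if p > n:
            total += 1.0
        elif p == n:
            total += 0.5
    return total / (len(scores_pos) * len(scores_neg))


# Hypothetical risk scores (not from the paper) for cases and controls,
# first from a smaller model, then from a nested larger model.
cases_small, controls_small = [0.6, 0.4, 0.7], [0.5, 0.3, 0.2]
cases_large, controls_large = [0.8, 0.6, 0.7], [0.5, 0.3, 0.2]

auc_small = auc(cases_small, controls_small)
auc_large = auc(cases_large, controls_large)

# The quantity whose sampling distribution the abstract is concerned with.
delta_auc = auc_large - auc_small
```

The sketch stops at the point estimate `delta_auc`; how to draw valid inference from it is exactly the question the abstract raises.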
Emilio Carrizosa, Alba V Olivares-Nadal, Pepa Ramírez-Cobo
Vector autoregressive (VAR) models constitute a powerful and well-studied tool to analyze multivariate time series. Since sparseness, crucial to identify and visualize joint dependencies and relevant causalities, is not expected to happen in the standard VAR model, several sparse variants have been introduced in the literature. However, in some cases it might be of interest to control some dimensions of the sparsity, e.g. the number of causal features allowed in the prediction. To the authors' knowledge, none of the existing methods gives the user full control over the different aspects of the sparsity of the solution...
September 21, 2016: Biostatistics
Ying Huang, Peter B Gilbert, Rong Fu, Holly Janes
Biomarker endpoints measuring vaccine-induced immune responses are essential to HIV vaccine development because of their potential to predict the effect of a vaccine in preventing HIV infection. A vaccine's immune response profile observed in phase I immunogenicity studies is a key factor in determining whether it is advanced for further study in phase II and III efficacy trials. The multiplicity of immune variables and scientific uncertainty in their relative importance, however, pose great challenges to the development of formal algorithms for selecting vaccines to study further...
September 20, 2016: Biostatistics
Eugen Pircalabelu, Gerda Claeskens, Lourens J Waldorp
We have developed a method for estimating brain networks from fMRI datasets that have not all been measured using the same set of brain regions. Some of the coarse-scale regions have been split into smaller subregions. The proposed penalized estimation procedure selects undirected graphical models with similar structures that combine information from several subjects and several coarseness scales. Both within-scale edges and between-scale edges that identify possible connections between a large region and its subregions are estimated...
October 2016: Biostatistics
Joseph Antonelli, Matthew Cefalu, Luke Bornn
In environmental epidemiology, exposures are not always available at subject locations and must be predicted using monitoring data. The monitor locations are often outside the control of researchers, and previous studies have shown that "preferential sampling" of monitoring locations can adversely affect exposure prediction and subsequent health effect estimation. We adopt a slightly different definition of preferential sampling than is typically seen in the literature, which we call population-based preferential sampling...
October 2016: Biostatistics
Gianluca Frasso, Philippe Lambert
The 2014 Ebola outbreak in Sierra Leone is analyzed using a susceptible-exposed-infectious-removed (SEIR) epidemic compartmental model. The discrete time-stochastic model for the epidemic evolution is coupled to a set of ordinary differential equations describing the dynamics of the expected proportions of subjects in each epidemic state. The unknown parameters are estimated in a Bayesian framework by combining data on the number of new (laboratory confirmed) Ebola cases reported by the Ministry of Health and prior distributions for the transition rates elicited using information collected by the WHO during the follow-up of specific Ebola cases...
October 2016: Biostatistics
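As a rough illustration of the ordinary differential equations mentioned in the SEIR abstract above, the sketch below integrates the textbook deterministic SEIR system for the proportions in each state with a simple Euler scheme. This is an assumption-laden sketch, not the authors' model: the paper couples the ODEs to a discrete-time stochastic model and estimates the parameters in a Bayesian framework, whereas the rates `beta`, `sigma`, `gamma`, the step size, and the initial state used here are hypothetical values chosen only for illustration.

```python
def seir_step(state, beta, sigma, gamma, dt):
    """One Euler step of the textbook SEIR equations for state proportions.

    dS/dt = -beta*S*I,  dE/dt = beta*S*I - sigma*E,
    dI/dt = sigma*E - gamma*I,  dR/dt = gamma*I.
    """
    s, e, i, r = state
    new_exposed = beta * s * i * dt      # S -> E
    new_infectious = sigma * e * dt      # E -> I
    new_removed = gamma * i * dt         # I -> R
    return (
        s - new_exposed,
        e + new_exposed - new_infectious,
        i + new_infectious - new_removed,
        r + new_removed,
    )


def simulate(state, beta, sigma, gamma, dt, n_steps):
    """Return the whole trajectory, including the initial state."""
    path = [state]
    for _ in range(n_steps):
        state = seir_step(state, beta, sigma, gamma, dt)
        path.append(state)
    return path


# Hypothetical rates and initial proportions (not estimates from the paper).
path = simulate(
    state=(0.999, 0.0, 0.001, 0.0),
    beta=0.3, sigma=0.1, gamma=0.15,
    dt=0.1, n_steps=1000,
)
```

Because every flow leaving one compartment enters the next, the four proportions keep summing to one at each step, which is a quick sanity check on any implementation of this kind.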
Jonathan W Bartlett, Jeremy M G Taylor
Studies often follow individuals until they fail from one of a number of competing failure types. One approach to analyzing such competing risks data involves modeling the cause-specific hazards as functions of baseline covariates. A common issue that arises in this context is missing values in covariates. In this setting, we first establish conditions under which complete case analysis (CCA) is valid. We then consider application of multiple imputation to handle missing covariate values, and extend the recently proposed substantive model compatible version of fully conditional specification (SMC-FCS) imputation to the competing risks setting...
October 2016: Biostatistics
Marie-Karelle Riviere, Sebastian Ueckert, France Mentré
Non-linear mixed effect models (NLMEMs) are widely used for the analysis of longitudinal data. To design these studies, optimal design based on the expected Fisher information matrix (FIM) can be used instead of performing time-consuming clinical trial simulations. In recent years, estimation algorithms for NLMEMs have transitioned from linearization toward more exact higher-order methods. Optimal design, on the other hand, has mainly relied on first-order (FO) linearization to calculate the FIM. Although efficient in general, FO cannot be applied to complex non-linear models and can be applied only with difficulty in studies with discrete data...
October 2016: Biostatistics
Brisa N Sánchez, Meihua Wu, Peter X K Song, Wen Wang
Advances in high-throughput technology have accelerated the use of hundreds to millions of biomarkers to construct classifiers that partition patients into different clinical conditions. Prior to classifier development in actual studies, a critical need is to determine the sample size required to reach a specified classification precision. We develop a systematic approach for sample size determination in high-dimensional (large p, small n) classification analysis. Our method utilizes the probability of correct classification (PCC) as the optimization objective function and incorporates the higher criticism thresholding procedure for classifier development...
October 2016: Biostatistics
Federico Ambrogi, Thomas H Scheike
High-dimensional regression has become an increasingly important topic for many research fields. For example, biomedical research generates an increasing amount of data to characterize patients' bio-profiles (e.g. from a genomic high-throughput assay). The increasing complexity in the characterization of patients' bio-profiles is added to the complexity related to the prolonged follow-up of patients with the registration of the occurrence of possible adverse events. This information may offer useful insight into disease dynamics and help identify subsets of patients with worse prognosis and better response to the therapy...
October 2016: Biostatistics
Jennifer A Sinnott, Tianxi Cai
When a moderate number of potential predictors are available and a survival model is fit with regularization to achieve variable selection, providing accurate inference on the predicted survival can be challenging. We investigate inference on the predicted survival estimated after fitting a Cox model under regularization guaranteeing the oracle property. We demonstrate that existing asymptotic formulas for the standard errors of the coefficients tend to underestimate the variability for some coefficients, while typical resampling such as the bootstrap tends to overestimate it; these approaches can both lead to inaccurate variance estimation for predicted survival functions...
October 2016: Biostatistics
Elisa Sheng, Daniela Witten, Xiao-Hua Zhou
In a multivariate setting, we consider the task of identifying features whose correlations with the other features differ across conditions. Such correlation shifts may occur independently of mean shifts, or differences in the means of the individual features across conditions. Previous approaches for detecting correlation shifts consider features simultaneously, by computing a correlation-based test statistic for each feature. However, since correlations involve two features, such approaches do not lend themselves to identifying which feature is the culprit...
October 2016: Biostatistics
Ziwen Tan, Guoyou Qin, Haibo Zhou
Outcome-dependent sampling (ODS) designs have been well recognized as a cost-effective way to enhance study efficiency in both statistical literature and biomedical and epidemiologic studies. A partially linear additive model (PLAM) is widely applied in real problems because it allows for a flexible specification of the dependence of the response on some covariates in a linear fashion and other covariates in a nonlinear non-parametric fashion. Motivated by an epidemiological study investigating the effect of prenatal polychlorinated biphenyls exposure on children's intelligence quotient (IQ) at age 7 years, we propose a PLAM in this article to investigate a more flexible non-parametric inference on the relationships among the response and covariates under the ODS scheme...
October 2016: Biostatistics
Michael Rosenblum, Tianchen Qian, Yu Du, Huitong Qiu, Aaron Fisher
Adaptive enrichment designs involve preplanned rules for modifying enrollment criteria based on accrued data in an ongoing trial. For example, enrollment of a subpopulation where there is sufficient evidence of treatment efficacy, futility, or harm could be stopped, while enrollment for the remaining subpopulations is continued. We propose a new class of multiple testing procedures tailored to adaptive enrichment designs. The procedures synthesize ideas from two general approaches. As in the modified group sequential approach, the procedures gain power by leveraging the covariance among statistics for different stages and different hypotheses...
October 2016: Biostatistics
Xiaoguang Xu, Theodore Kypraios, Philip D O'Neill
This paper considers novel Bayesian non-parametric methods for stochastic epidemic models. Many standard modeling and data analysis methods use underlying assumptions (e.g. concerning the rate at which new cases of disease will occur) which are rarely challenged or tested in practice. To relax these assumptions, we develop a Bayesian non-parametric approach using Gaussian Processes, specifically to estimate the infection process. The methods are illustrated with both simulated and real data sets, the former illustrating that the methods can recover the true infection process quite well in practice, and the latter illustrating that the methods can be successfully applied in different settings...
October 2016: Biostatistics
David S Robertson, A Toby Prevost, Jack Bowden
The problem of selection bias has long been recognized in the analysis of two-stage trials, where promising candidates are selected in stage 1 for confirmatory analysis in stage 2. To efficiently correct for bias, uniformly minimum variance conditionally unbiased estimators (UMVCUEs) have been proposed for a wide variety of trial settings, but where the population parameter estimates are assumed to be independent. We relax this assumption and derive the UMVCUE in the multivariate normal setting with an arbitrary known covariance structure...
October 2016: Biostatistics
Ruoqing Zhu, Qing Zhao, Hongyu Zhao, Shuangge Ma
In multidimensional cancer omics studies, one subject is profiled on multiple layers of omics activities. In this article, the goal is to integrate multiple types of omics measurements, identify markers, and build a model for cancer outcome. The proposed analysis is achieved in two steps. In the first step, we analyze the regulation among different types of omics measurements, through the construction of linear regulatory modules (LRMs). The LRMs have a sound biological basis, and their construction differs from the existing analyses by modeling the regulation of sets of gene expressions (GEs) by sets of regulators...
October 2016: Biostatistics