Statistics in Biosciences

John D Rice, Jeremy M G Taylor
One common use of binary response regression methods is classification based on an arbitrary probability threshold dictated by the particular application. Since this is given to us a priori, it is sensible to incorporate the threshold into our estimation procedure. Specifically, for the linear logistic model, we solve a set of locally weighted score equations, using a kernel-like weight function centered at the threshold. The bandwidth for the weight function is selected by cross validation of a novel hybrid loss function that combines classification error and a continuous measure of divergence between observed and fitted values; other possible cross-validation functions based on more common binary classification metrics are also examined...
October 2016: Statistics in Biosciences
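The idea of weighting score contributions near the classification threshold can be sketched as follows. This is only an illustration and not the authors' implementation: the Gaussian-shaped kernel, the choice to weight by distance between the fitted probability and the threshold, and the names `kernel_weight`, `weighted_score`, `threshold`, and `bandwidth` are all assumptions made for the example.

```python
import math


def expit(z):
    """Logistic function, mapping a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def kernel_weight(p, threshold, bandwidth):
    """Gaussian-shaped weight (an assumed form) that peaks when p equals the threshold."""
    u = (p - threshold) / bandwidth
    return math.exp(-0.5 * u * u)


def weighted_score(beta, xs, ys, threshold, bandwidth):
    """Locally weighted score for the model logit(p) = beta[0] + beta[1] * x.

    Returns the two components of sum_i w_i * (y_i - p_i) * (1, x_i),
    where w_i depends on how close the fitted p_i is to the threshold.
    """
    s0 = 0.0
    s1 = 0.0
    for x, y in zip(xs, ys):
        p = expit(beta[0] + beta[1] * x)
        w = kernel_weight(p, threshold, bandwidth)
        s0 += w * (y - p)
        s1 += w * (y - p) * x
    return s0, s1


# Toy data: observations whose fitted probability is far from the
# threshold contribute little to the score.
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = [0, 0, 1, 0, 1]
print(weighted_score((0.0, 1.0), xs, ys, threshold=0.3, bandwidth=0.1))
```

An estimate would be a value of `beta` at which both components are zero, and a very large bandwidth recovers the ordinary logistic score equations because every weight approaches one.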
Zhaohui Qin, Ben Li, Karen N Conneely, Hao Wu, Ming Hu, Deepak Ayyala, Yongseok Park, Victor X Jin, Fangyuan Zhang, Han Zhang, Li Li, Shili Lin
With the rapid development of high throughput technologies such as array and next generation sequencing (NGS), genome-wide, nucleotide-resolution epigenomic data are increasingly available. In recent years, there has been particular interest in data on DNA methylation and 3-dimensional (3D) chromosomal organization, which are believed to hold keys to understanding biological mechanisms, such as transcription regulation, that are closely linked to human health and diseases. However, small sample size, complicated correlation structure, substantial noise, biases, and uncertainties all present difficulties for performing statistical inference...
October 2016: Statistics in Biosciences
Loni Philip Tabb, Eric J Tchetgen Tchetgen, Greg A Wellenius, Brent A Coull
Count data often exhibit more zeros than predicted by common count distributions like the Poisson or negative binomial. In recent years, there has been considerable interest in methods for analyzing zero-inflated count data in longitudinal or other correlated data settings. A common approach has been to extend zero-inflated Poisson models to include random effects that account for correlation among observations. However, these models have been shown to have a few drawbacks, including interpretability of regression coefficients and numerical instability of fitting algorithms even when the data arise from the assumed model...
October 2016: Statistics in Biosciences
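For readers unfamiliar with zero inflation, the standard zero-inflated Poisson distribution mixes a point mass at zero with an ordinary Poisson count. The sketch below shows only that basic distribution, without the random effects or correlated-data extensions discussed in the abstract; the names `zip_pmf`, `lam`, and `pi` are illustrative choices.

```python
import math


def poisson_pmf(k, lam):
    """Ordinary Poisson probability of observing count k with mean lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def zip_pmf(k, lam, pi):
    """Zero-inflated Poisson probability of count k.

    With probability pi the observation is a structural zero;
    otherwise it is drawn from Poisson(lam).
    """
    base = (1.0 - pi) * poisson_pmf(k, lam)
    return pi + base if k == 0 else base


def zip_mean(lam, pi):
    """Mean of the zero-inflated Poisson distribution."""
    return (1.0 - pi) * lam


# The zero-inflated model places more mass on zero than a plain
# Poisson distribution with the same lam.
print(poisson_pmf(0, 2.0), zip_pmf(0, 2.0, 0.3))
```

Because the marginal mean is `(1 - pi) * lam`, a covariate effect on `lam` alone does not translate directly into an effect on the overall mean, which is one way to see the interpretability concern the abstract raises.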
Daowen Zhang, Jie Lena Sun, Karen Pieper
Linear mixed effects models are widely used to analyze a clustered response variable. Motivated by a recent study to examine and compare the hospital length of stay (LOS) between patients undertaking percutaneous coronary intervention (PCI) and coronary artery bypass graft (CABG) from several international clinical trials, we proposed a bivariate linear mixed effects model for the joint modeling of clustered PCI and CABG LOS's where each clinical trial is considered a cluster. Due to the large number of patients in some trials, commonly used commercial statistical software for fitting (bivariate) linear mixed models failed to run since it could not allocate enough memory to invert large dimensional matrices during the optimization process...
October 2016: Statistics in Biosciences
Jesse D Raffa, Elizabeth A Thompson
Correlation between study units in quantitative genetics studies often makes it difficult to compare important inferential aspects of studies. Describing the relatedness between study units is critical to capture features of pedigree studies involving heritability, including power and precision of heritability estimates. Blangero et al (2012) showed that in pedigree studies the power to detect heritability is a function of the true heritability and the eigenvalues of the kinship matrix. We extend this to a more general setting which allows statements about expected precision of heritability estimates...
October 2016: Statistics in Biosciences
Yanxun Xu, Lorenzo Trippa, Peter Müller, Yuan Ji
Targeted therapies based on biomarker profiling are becoming a mainstream direction of cancer research and treatment. Depending on the expression of specific prognostic biomarkers, targeted therapies assign different cancer drugs to subgroups of patients even if they are diagnosed with the same type of cancer by traditional means, such as tumor location. For example, Herceptin is only indicated for the subgroup of patients with HER2+ breast cancer, but not other types of breast cancer. However, subgroups like HER2+ breast cancer with effective targeted therapies are rare and most cancer drugs are still being applied to large patient populations that include many patients who might not respond or benefit...
June 2016: Statistics in Biosciences
Xuemin Gu, Nan Chen, Caimiao Wei, Suyu Liu, Vassiliki A Papadimitrakopoulou, Roy S Herbst, J Jack Lee
We propose a Bayesian two-stage biomarker-based adaptive randomization (AR) design for the development of targeted agents. The design has three main goals: (1) to test the treatment efficacy, (2) to identify prognostic and predictive markers for the targeted agents, and (3) to provide better treatment for patients enrolled in the trial. To treat patients better, both stages are guided by the Bayesian AR based on the individual patient's biomarker profiles. The AR in the first stage is based on a known marker...
June 2016: Statistics in Biosciences
Jared C Foster, Bin Nan, Lei Shen, Niko Kaciroti, Jeremy M G Taylor
We consider the problem of using permutation-based methods to test for treatment-covariate interactions from randomized clinical trial data. Testing for interactions is common in the field of personalized medicine, as subgroups with enhanced treatment effects arise when treatment-by-covariate interactions exist. Asymptotic tests can often be performed for simple models, but in many cases, more complex methods are used to identify subgroups, and non-standard test statistics proposed, and asymptotic results may be difficult to obtain...
June 2016: Statistics in Biosciences
Gilbert S Omenn
Omics-based technology platforms have made new kinds of cancer profiling tests feasible. There are several valuable examples in clinical practice, and many more under development. A concerted, transparent process of discovery with lock-down of candidate assays and classifiers and clear specification of intended clinical use is essential. The Institute of Medicine has now proposed a three-stage scheme of confirming and validating analytical findings, validating performance on clinical specimens, and demonstrating explicit clinical utility for an approvable test (Micheel et al...
June 2016: Statistics in Biosciences
Guogen Shan, Hua Zhang, Tao Jiang, Hanna Peterson, Daniel Young, Changxing Ma
In a one-sided hypothesis testing problem in clinical trials, the monotonic condition of a tail probability function is fundamentally important to guarantee that the actual type I and II error rates occur at the boundary of their associated parameter spaces. Otherwise, one has to search for the actual rates over the complete parameter space, which could be very computationally intensive. This important property has been extensively studied in traditional one-stage study settings (e.g., non-inferiority or superiority between two binomial proportions), but there is very limited research for this property in a two-stage design setting, e...
2016: Statistics in Biosciences
Qiongshi Lu, Chentian Jin, Jiehuan Sun, Russell Bowler, Katerina Kechris, Naftali Kaminski, Hongyu Zhao
Rich collections of genomic and epigenomic annotations, availabilities of large population cohorts for genome-wide association studies (GWAS), and advancements in data integration techniques provide the unprecedented opportunity to accelerate discoveries in complex disease studies through integrative analyses. In this paper, we apply a variety of approaches to integrate GWAS summary statistics of chronic obstructive pulmonary disease (COPD) with functional annotations to illustrate how data integration could help researchers understand complex human diseases...
2016: Statistics in Biosciences
Aidan G O'Keeffe, Daniel M Farewell, Brian D M Tom, Vernon T Farewell
In longitudinal randomised trials and observational studies within a medical context, a composite outcome, which is a function of several individual patient-specific outcomes, may be felt to best represent the outcome of interest. As in other contexts, missing data on patient outcome, due to patient drop-out or for other reasons, may pose a problem. Multiple imputation is a widely used method for handling missing data, but its use for composite outcomes has been seldom discussed. Whilst standard multiple imputation methodology can be used directly for the composite outcome, the distribution of a composite outcome may be of a complicated form and perhaps not amenable to statistical modelling...
2016: Statistics in Biosciences
Ying Huang, Eric Laber
For a patient who is facing a treatment decision, the added value of information provided by a biomarker depends on the individual patient's expected response to treatment with and without the biomarker, as well as his/her tolerance of disease and treatment harm. However, individualized estimators of the value of a biomarker are lacking. We propose a new graphical tool named the subject-specific expected benefit curve for quantifying the personalized value of a biomarker in aiding a treatment decision. We develop semiparametric estimators for two general settings: (i) when biomarker data are available from a randomized trial; and (ii) when biomarker data are available from a cohort or a cross-sectional study, together with external information about a multiplicative treatment effect...
2016: Statistics in Biosciences
Zifang Guo, Wenbin Lu, Lexin Li
Despite enormous development on variable selection approaches in recent years, modeling and selection of high dimensional censored regression remains a challenging question. When the number of predictors p far exceeds the number of observational units n and the outcome is censored, computations of existing solutions often become difficult, or even infeasible in some situations, while performances frequently deteriorate. In this article, we aim at simultaneous model estimation and variable selection for Cox proportional hazards models with high dimensional covariates...
October 1, 2015: Statistics in Biosciences
Philip S Boonstra, Jeremy M G Taylor, Bhramar Mukherjee
We propose an extension of the expectation-maximization (EM) algorithm, called the hyperpenalized EM (HEM) algorithm, that maximizes a penalized log-likelihood, for which some data are missing or unavailable, using a data-adaptive estimate of the penalty parameter. This is potentially useful in applications for which the analyst is unable or unwilling to choose a single value of a penalty parameter but instead can posit a plausible range of values. The HEM algorithm is conceptually straightforward and also very effective, and we demonstrate its utility in the analysis of a genomic data set...
October 1, 2015: Statistics in Biosciences
Xiang Zhan, Michael P Epstein, Debashis Ghosh
Recently, gene set-based approaches have become very popular in gene expression profiling studies for assessing how genetic variants are related to disease outcomes. Since most genes are not differentially expressed, existing pathway tests considering all genes within a pathway suffer from considerable noise and power loss. Moreover, for a differentially expressed pathway, it is of interest to select important genes that drive the effect of the pathway. In this article, we propose an adaptive association test using double kernel machines (DKM), which can both select important genes within the pathway as well as test for the overall genetic pathway effect...
October 1, 2015: Statistics in Biosciences
Herbert Pang, Inyoung Kim, Hongyu Zhao
Close to three percent of the world's population suffer from diabetes. Despite the range of treatment options available for diabetes patients, not all patients benefit from them. Investigating how different pathways correlate with phenotype of interest may help unravel novel drug targets and discover a possible cure. Many pathway-based methods have been developed to incorporate biological knowledge into the study of microarray data. Most of these methods can only analyze individual pathways but cannot deal with two or more pathways in a model based framework...
October 1, 2015: Statistics in Biosciences
Lai Wei, David Jarjoura
Motivated by laboratory experiments that fail to reach significance, we developed a small sample size approach to designing a subsequent experiment that controls overall type I error and achieves sufficient conditional power. We focus on experiments with leukemia cells, and use a specific example in Chronic Lymphocytic Leukemia to discuss unanticipated patient variance and difficult to predict interaction effect sizes. We emphasize the importance of achieving significance in the first run of an experiment, which results in simplifying the multiple considerations usually associated with interim analysis and decision making in adaptive clinical trials...
October 1, 2015: Statistics in Biosciences
Wentian Guo, Yang Ni, Yuan Ji
In many oncology clinical trials it is necessary to insert new candidate doses when the prespecified doses are poorly elicited. Formal statistical designs with dose insertion are lacking. We propose a dose insertion design for phase I/II clinical trials in oncology based on both efficacy and toxicity outcomes. We also implement Bayesian model selection during the course of the trial so that better models can be adaptively chosen to achieve more accurate inference. The new design, TEAMS, achieves great operating characteristics in extensive simulation studies due to its ability to adaptively insert new doses as well as perform model selection...
October 2015: Statistics in Biosciences
Brent A Johnson
Authors have observed that the distribution of medical expenditures has features that do not lend it to parametric modeling and can present significant challenges for least-squares-type estimators, even on a logarithmic scale. In this note, we discuss caveats and extensions of coefficient estimation in the bivariate accelerated lifetime model of medical cost and survival time on covariates. We consider the setting where medical cost is observed only when the event occurs and potential right-censoring of the event time induces a dependent censoring mechanism on cost...
October 2015: Statistics in Biosciences