# Biostatistics

#1
Wenjian Bi, Yun Li, Matthew P Smeltzer, Guimin Gao, Shengli Zhao, Guolian Kang
It has been well acknowledged that methods for secondary trait (ST) association analyses under a case-control design (ST$_{\text{CC}}$) should carefully consider the sampling process to avoid biased risk estimates. A similar situation also exists in extreme phenotype sequencing (EPS) designs, which select subjects with extreme values of a continuous primary phenotype for sequencing. EPS designs are commonly used in modern epidemiological and clinical studies such as the well-known National Heart, Lung, and Blood Institute Exome Sequencing Project...
July 11, 2018: Biostatistics
#2
David Gerard, Matthew Stephens
We combine two important ideas in the analysis of large-scale genomics experiments (e.g. experiments that aim to identify genes that are differentially expressed between two conditions). The first is use of Empirical Bayes (EB) methods to handle the large number of potentially-sparse effects, and estimate false discovery rates and related quantities. The second is use of factor analysis methods to deal with sources of unwanted variation such as batch effects and unmeasured confounders. We describe a simple modular fitting procedure that combines key ideas from both these lines of research...
July 6, 2018: Biostatistics
#3
Giorgio Paulon, Maria De Iorio, Alessandra Guglielmi, Francesca Ieva
Heart failure (HF) is one of the main causes of morbidity, hospitalization, and death in the western world, and the economic burden associated with HF management is substantial and expected to increase in the future. We consider hospitalization data for HF in the most populous Italian region, Lombardia. Data were extracted from the administrative data warehouse of the regional healthcare system. The main clinical outcome of interest is time to death, and the research focus is on investigating how recurrent hospitalizations affect the time to event...
July 6, 2018: Biostatistics
#4
Zheng-Zheng Tang, Guanhua Chen
There is heightened interest in using high-throughput sequencing technologies to quantify abundances of microbial taxa and linking the abundance to human diseases and traits. Proper modeling of multivariate taxon counts is essential to the power of detecting this association. Existing models are limited in handling excessive zero observations in taxon counts and in flexibly accommodating complex correlation structures and dispersion patterns among taxa. In this article, we develop a new probability distribution, zero-inflated generalized Dirichlet multinomial (ZIGDM), that overcomes these limitations in modeling multivariate taxon counts...
June 24, 2018: Biostatistics
#5
Rachel Carroll, Andrew B Lawson, Shanshan Zhao
The introduction of spatial and temporal frailty parameters in survival models furnishes a way to represent unmeasured confounding in the outcome of interest. Using a Bayesian accelerated failure time model, we are able to flexibly explore a wide range of spatial and temporal options for structuring frailties as well as examine the benefits of using these different structures in certain settings. A setting of particular interest for this work involved using temporal frailties to capture the impact of events of interest on breast cancer survival...
June 24, 2018: Biostatistics
#6
Jiebiao Wang, Pei Wang, Donald Hedeker, Lin S Chen
In quantitative proteomics, mass tag labeling techniques have been widely adopted in mass spectrometry experiments. These techniques allow peptides (short amino acid sequences) and proteins from multiple samples of a batch to be detected and quantified in a single experiment, and as such greatly improve the efficiency of protein profiling. However, the batch-processing of samples also results in severe batch effects and non-ignorable missing data occurring at the batch level. Motivated by the breast cancer proteomic data from the Clinical Proteomic Tumor Analysis Consortium, in this work, we developed two tailored multivariate MIxed-effects SElection models (mvMISE) to jointly analyze multiple correlated peptides/proteins in labeled proteomics data, considering the batch effects and the non-ignorable missingness...
June 24, 2018: Biostatistics
#7
Ekaterina Smirnova, Snehalata Huzurbazar, Farhad Jafari
The human microbiota composition is associated with a number of diseases including obesity, inflammatory bowel disease, and bacterial vaginosis. Thus, microbiome research has the potential to reshape clinical and therapeutic approaches. However, raw microbiome count data require careful pre-processing steps that take into account both the sparsity of counts and the large number of taxa that are being measured. Filtering is defined as removing taxa that are present in a small number of samples and have small counts in the samples where they are observed...
June 18, 2018: Biostatistics
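The filtering definition quoted in entry #7 can be illustrated with a minimal sketch. This is not the authors' procedure: the function name `filter_taxa` and the thresholds `min_samples` and `min_count` are hypothetical choices made only to show the idea of dropping taxa that are both rarely present and low in count where they do appear.

```python
from typing import Dict, List


def filter_taxa(
    counts: Dict[str, List[int]],
    min_samples: int = 2,  # hypothetical threshold: "present in a small number of samples"
    min_count: int = 5,    # hypothetical threshold: "small counts where observed"
) -> Dict[str, List[int]]:
    """Drop taxa that are both rare across samples and low in count where present."""
    kept = {}
    for taxon, row in counts.items():
        nonzero = [c for c in row if c > 0]
        # Rare: observed in fewer than min_samples samples.
        rare = len(nonzero) < min_samples
        # Low: every observed (nonzero) count is below min_count.
        low = all(c < min_count for c in nonzero)
        if not (rare and low):
            kept[taxon] = row
    return kept


# Toy count table: one list of per-sample counts for each taxon.
toy_counts = {
    "taxon_a": [0, 12, 30, 7],  # present in many samples: kept
    "taxon_b": [0, 0, 1, 0],    # one sample, tiny count: removed
    "taxon_c": [0, 0, 40, 0],   # one sample, but a large count: kept
    "taxon_d": [1, 2, 1, 3],    # small counts, but widespread: kept
}
filtered = filter_taxa(toy_counts)
```

Under these assumed thresholds a taxon is removed only when both conditions hold, so a taxon seen once with a large count, or seen everywhere with small counts, is retained.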
#8
Gary Napier, Duncan Lee, Chris Robertson, Andrew Lawson
Population-level disease risk across a set of non-overlapping areal units varies in space and time, and a large research literature has developed methodology for identifying clusters of areal units exhibiting elevated risks. However, almost no research has extended the clustering paradigm to identify groups of areal units exhibiting similar temporal disease trends. We present a novel Bayesian hierarchical mixture model for achieving this goal, with inference based on a Metropolis-coupled Markov chain Monte Carlo ((MC)$^3$) algorithm...
June 18, 2018: Biostatistics
#9
Laurent Jacob, Florence Combes, Thomas Burger
We propose a new hypothesis test for the differential abundance of proteins in mass spectrometry-based relative quantification. An important feature of this type of high-throughput analysis is that it involves an enzymatic digestion of the sample proteins into peptides prior to identification and quantification. Due to numerous homology sequences, different proteins can lead to peptides with identical amino acid chains, so that their parent protein is ambiguous. These so-called shared peptides make the protein-level statistical analysis a challenge and are often not accounted for...
June 18, 2018: Biostatistics
#10
Claus Thorn Ekstrøm, Thomas Alexander Gerds, Andreas Kryger Jensen
The comparison of alternative rankings of a set of items is a general and common task in applied statistics. Predictor variables are ranked according to magnitude of association with an outcome, prediction models rank subjects according to the personalized risk of an event, and genetic studies rank genes according to their difference in gene expression levels. We propose a sequential rank agreement measure to quantify the rank agreement among two or more ordered lists. This measure has an intuitive interpretation, can be applied to any number of lists even if some are partially incomplete, and provides information about the agreement along the lists...
June 3, 2018: Biostatistics
#11
Kris Sankaran, Susan P Holmes
The human microbiome is a complex ecological system, and describing its structure and function under different environmental conditions is important from both basic scientific and medical perspectives. Viewed through a biostatistical lens, many microbiome analysis goals can be formulated as latent variable modeling problems. However, although probabilistic latent variable models are a cornerstone of modern unsupervised learning, they are rarely applied in the context of microbiome data analysis, in spite of the evolutionary, temporal, and count structure that could be directly incorporated through such models...
June 3, 2018: Biostatistics
#12
Qiwei Li, Xinlei Wang, Faming Liang, Faliu Yi, Yang Xie, Adi Gazdar, Guanghua Xiao
Digital pathology imaging of tumor tissues, which captures histological details in high resolution, is fast becoming a routine clinical procedure. Recent developments in deep-learning methods have enabled the identification, characterization, and classification of individual cells from pathology image analysis at a large scale. This creates new opportunities to study the spatial patterns of and interactions among different types of cells. Reliable statistical approaches to modeling such spatial patterns and interactions can provide insight into tumor progression and shed light on the biological mechanisms of cancer...
May 18, 2018: Biostatistics
#13
Fulton Wang, Cynthia Rudin, Tyler H McCormick, John L Gore
In many clinical settings, a patient outcome takes the form of a scalar time series with a recovery curve shape, which is characterized by a sharp drop due to a disruptive event (e.g., surgery) and subsequent monotonic smooth rise towards an asymptotic level not exceeding the pre-event value. We propose a Bayesian model that predicts recovery curves based on information available before the disruptive event. A recovery curve of interest is the quantified sexual function of prostate cancer patients after prostatectomy surgery...
May 5, 2018: Biostatistics
#14
Gen Li, Andrey A Shabalin, Ivan Rusyn, Fred A Wright, Andrew B Nobel
Expression quantitative trait locus (eQTL) analyses identify genetic markers associated with the expression of a gene. Most up-to-date eQTL studies consider the connection between genetic variation and expression in a single tissue. Multi-tissue analyses have the potential to improve findings in a single tissue, and elucidate the genotypic basis of differences between tissues. In this article, we develop a hierarchical Bayesian model (MT-eQTL) for multi-tissue eQTL analysis. MT-eQTL explicitly captures patterns of variation in the presence or absence of eQTL, as well as the heterogeneity of effect sizes across tissues...
July 1, 2018: Biostatistics
#15
Tingting Yu, Lang Wu, Peter B Gilbert
In HIV vaccine studies, a major research objective is to identify immune response biomarkers measured longitudinally that may be associated with risk of HIV infection. This objective can be assessed via joint modeling of longitudinal and survival data. Joint models for HIV vaccine data are complicated by the following issues: (i) left truncations of some longitudinal data due to lower limits of quantification; (ii) mixed types of longitudinal variables; (iii) measurement errors and missing values in longitudinal measurements; (iv) computational challenges associated with likelihood inference...
July 1, 2018: Biostatistics
#16
Erin E Gabriel, Michael C Sachs, M Elizabeth Halloran
An intermediate response measure that accurately predicts efficacy in a new setting at the individual level could be used both for prediction and personalized medical decisions. In this article, we define a predictive individual-level general surrogate (PIGS), which is an individual-level intermediate response that can be used to accurately predict individual efficacy in a new setting. While methods for evaluating trial-level general surrogates, which are predictors of trial-level efficacy, have been developed previously, few, if any, methods have been developed to evaluate individual-level general surrogates, and no methods have formalized the use of cross-validation to quantify the expected prediction error...
July 1, 2018: Biostatistics
#17
Youyi Fong, Ying Huang, Maria P Lemos, M Juliana McElrath
The two-sample location problem is one of the most frequently encountered problems in statistical practice. The two most commonly studied subtypes of the two-sample location problem involve observations from two populations that are either independent or completely paired, but a third subtype often occurs in practice when some observations are paired and some are not. Partially paired two-sample problems, also known as paired two-sample problems with missing data, often arise in biomedical fields when it is difficult for some invasive procedures to collect data from an individual under both conditions we are interested in comparing...
July 1, 2018: Biostatistics
#18
Rodrigue Ngueyep, Nicoleta Serban
Many studies in health services research rely on regression models with a large number of covariates or predictors. In this article, we introduce novel methodology to estimate and perform model selection for high-dimensional non-parametric multivariate regression problems, with application to many healthcare studies. We particularly focus on multi-response or multi-task regression models. Because of the complexity of the dependence between predictors and the multiple responses, we exploit model selection approaches that consider various levels of grouping between and within responses...
July 1, 2018: Biostatistics
#19
Jeremy Roth, Noah Simon
An effective treatment may only benefit a subset of patients enrolled in a clinical trial. We translate the search for patient characteristics that predict treatment benefit to a search for qualitative interactions, which occur when the estimated response-curve under treatment crosses the estimated response-curve under control. We propose a regression-based framework that tests for qualitative interactions without assuming linearity or requiring pre-specified risk strata; this flexibility is useful in settings where there is limited a priori scientific knowledge about the relationship between features and the response...
July 1, 2018: Biostatistics
#20
William Barcella, Maria De Iorio, Stefano Favaro, Gary L Rosner
We propose a novel Bayesian nonparametric process prior for modeling a collection of random discrete distributions. This process is defined by including a suitable Beta regression framework within a generalized Dirichlet process to induce dependence among the discrete random distributions. This strategy allows for covariate-dependent clustering of the observations. Some advantages of the proposed approach include wide applicability, ease of interpretation, and availability of efficient MCMC algorithms. The motivation for this work is the study of the impact of asparaginase metabolism on lipid levels in a group of pediatric patients treated for acute lymphoblastic leukemia...
July 1, 2018: Biostatistics
