Annals of Applied Statistics

https://www.readbyqxmd.com/read/29731955/powerful-test-based-on-conditional-effects-for-genome-wide-screening
#1
Yaowu Liu, Jun Xie
This paper considers testing procedures for screening large genome-wide data, where we examine hundreds of thousands of genetic variants, e.g., single nucleotide polymorphisms (SNP), on a quantitative phenotype. We screen the whole genome by SNP sets and propose a new test that is based on conditional effects from multiple SNPs. The test statistic is developed for weak genetic effects and incorporates correlations among genetic variables, which may be very high due to linkage disequilibrium. The limiting null distribution of the test statistic and the power of the test are derived...
March 2018: Annals of Applied Statistics
https://www.readbyqxmd.com/read/29731954/msiq-joint-modeling-of-multiple-rna-seq-samples-for-accurate-isoform-quantification
#2
Wei Vivian Li, Anqi Zhao, Shihua Zhang, Jingyi Jessica Li
Next-generation RNA sequencing (RNA-seq) technology has been widely used to assess full-length RNA isoform abundance in a high-throughput manner. RNA-seq data offer insight into gene expression levels and transcriptome structures, enabling us to better understand the regulation of gene expression and fundamental biological processes. Accurate isoform quantification from RNA-seq data is challenging due to the information loss in sequencing experiments. A recent accumulation of multiple RNA-seq data sets from the same tissue or cell type provides new opportunities to improve the accuracy of isoform quantification...
March 2018: Annals of Applied Statistics
https://www.readbyqxmd.com/read/29606991/design-of-vaccine-trials-during-outbreaks-with-and-without-a-delayed-vaccination-comparator
#3
Natalie E Dean, M Elizabeth Halloran, Ira M Longini
Conducting vaccine efficacy trials during outbreaks of emerging pathogens poses particular challenges. The "Ebola ça suffit" trial in Guinea used a novel ring vaccination cluster randomized design to target populations at highest risk of infection. Another key feature of the trial was the use of a delayed vaccination arm as a comparator, in which clusters were randomized to immediate vaccination or vaccination 21 days later. This approach, chosen to improve ethical acceptability of the trial, complicates the statistical analysis as participants in the comparison arm are eventually protected by vaccine...
March 2018: Annals of Applied Statistics
https://www.readbyqxmd.com/read/29515717/a-unified-framework-for-variance-component-estimation-with-summary-statistics-in-genome-wide-association-studies
#4
Xiang Zhou
Linear mixed models (LMMs) are among the most commonly used tools for genetic association studies. However, the standard method for estimating variance components in LMMs, the restricted maximum likelihood estimation method (REML), suffers from several important drawbacks: REML requires individual-level genotypes and phenotypes from all samples in the study, is computationally slow, and produces downward-biased estimates in case-control studies. To remedy these drawbacks, we present an alternative framework for variance component estimation, which we refer to as MQS...
December 2017: Annals of Applied Statistics
https://www.readbyqxmd.com/read/29721127/latent-space-models-for-multiview-network-data
#5
Michael Salter-Townshend, Tyler H McCormick
Social relationships consist of interactions along multiple dimensions. In social networks, this means that individuals form multiple types of relationships with the same person (e.g., an individual will not trust all of his/her acquaintances). Statistical models for these data require understanding two related types of dependence structure: (i) structure within each relationship type, or network view, and (ii) the association between views. In this paper, we propose a statistical framework that parsimoniously represents dependence between relationship types while also maintaining enough flexibility to allow individuals to serve different roles in different relationship types...
September 2017: Annals of Applied Statistics
https://www.readbyqxmd.com/read/29479394/a-novel-and-efficient-algorithm-for-de-novo-discovery-of-mutated-driver-pathways-in-cancer
#6
Binghui Liu, Chong Wu, Xiaotong Shen, Wei Pan
Next-generation sequencing studies on cancer somatic mutations have discovered that driver mutations tend to appear in most tumor samples, but they barely overlap in any single tumor sample, presumably because a single driver mutation can perturb the whole pathway. Based on the corresponding new concepts of coverage and mutual exclusivity, new methods can be designed for de novo discovery of mutated driver pathways in cancer. Since the computational problem is a combinatorial optimization with an objective function involving a discontinuous indicator function in high dimension, many existing optimization algorithms, such as a brute force enumeration, gradient descent and Newton's methods, are practically infeasible or directly inapplicable...
September 2017: Annals of Applied Statistics
https://www.readbyqxmd.com/read/29308102/doubly-robust-estimation-of-optimal-treatment-regimes-for-survival-data-with-application-to-an-hiv-aids-study
#7
Runchao Jiang, Wenbin Lu, Rui Song, Michael G Hudgens, Sonia Napravnik
In many biomedical settings, assigning every patient the same treatment may not be optimal due to patient heterogeneity. Individualized treatment regimes have the potential to dramatically improve clinical outcomes. When the primary outcome is censored survival time, a main interest is to find optimal treatment regimes that maximize the survival probability of patients. Since the survival curve is a function of time, it is important to balance short-term and long-term benefit when assigning treatments. In this paper, we propose a doubly robust approach to estimate optimal treatment regimes that optimize a user specified function of the survival curve, including the restricted mean survival time and the median survival time...
September 2017: Annals of Applied Statistics
https://www.readbyqxmd.com/read/29152032/latent-class-modeling-using-matrix-covariates-with-application-to-identifying-early-placebo-responders-based-on-eeg-signals
#8
Bei Jiang, Eva Petkova, Thaddeus Tarpey, R Todd Ogden
Latent class models are widely used to identify unobserved subgroups (i.e., latent classes) based upon one or more manifest variables. The probability of belonging to each subgroup is typically modeled as a function of a set of measured covariates. In this paper, we extend existing latent class models to incorporate matrix covariates. This research is motivated by a randomized placebo-controlled depression clinical trial. One study goal is to identify a subgroup of subjects who experience symptom improvement early on during antidepressant treatment, which is considered to be an indication of a placebo rather than a true pharmacological response...
September 2017: Annals of Applied Statistics
https://www.readbyqxmd.com/read/29081874/testing-high-dimensional-covariance-matrices-with-application-to-detecting-schizophrenia-risk-genes
#9
Lingxue Zhu, Jing Lei, Bernie Devlin, Kathryn Roeder
Scientists routinely compare gene expression levels in cases versus controls in part to determine genes associated with a disease. Similarly, detecting case-control differences in co-expression among genes can be critical to understanding complex human diseases; however, statistical methods have been limited by the high-dimensional nature of this problem. In this paper, we construct a sparse-Leading-Eigenvalue-Driven (sLED) test for comparing two high-dimensional covariance matrices. By focusing on the spectrum of the differential matrix, sLED provides a novel perspective that accommodates what we assume to be common, namely sparse and weak signals in gene expression data, and it is closely related to Sparse Principal Component Analysis...
September 2017: Annals of Applied Statistics
https://www.readbyqxmd.com/read/29081873/dynamic-prediction-for-multiple-repeated-measures-and-event-time-data-an-application-to-parkinson-s-disease
#10
Jue Wang, Sheng Luo, Liang Li
In many clinical trials studying neurodegenerative diseases such as Parkinson's disease (PD), multiple longitudinal outcomes are collected to fully explore the multidimensional impairment caused by this disease. If the outcomes deteriorate rapidly, patients may reach a level of functional disability sufficient to initiate levodopa therapy for ameliorating disease symptoms. An accurate prediction of the time to functional disability is helpful for clinicians to monitor patients' disease progression and make informative medical decisions...
September 2017: Annals of Applied Statistics
https://www.readbyqxmd.com/read/29250210/quantification-of-multiple-tumor-clones-using-gene-array-and-sequencing-data
#11
Yichen Cheng, James Y Dai, Thomas G Paulson, Xiaoyu Wang, Xiaohong Li, Brian J Reid, Charles Kooperberg
Cancer development is driven by genomic alterations, including copy number aberrations. The detection of copy number aberrations in tumor cells is often complicated by possible contamination of normal stromal cells in tumor samples and intratumor heterogeneity, namely the presence of multiple clones of tumor cells. In order to correctly quantify copy number aberrations, it is critical to successfully de-convolute the complex structure of the genetic information from tumor samples. In this article, we propose a general Bayesian method for estimating copy number aberrations when there are normal cells and potentially more than one tumor clone...
June 2017: Annals of Applied Statistics
https://www.readbyqxmd.com/read/28989557/allele-specific-copy-number-estimation-by-whole-exome-sequencing
#12
Hao Chen, Yuchao Jiang, Kara N Maxwell, Katherine L Nathanson, Nancy Zhang
Whole exome sequencing is currently a technology of choice in large-scale cancer genomics studies, where the priority is to identify cancer-associated variants in coding regions. We describe a method for estimating allele-specific copy number using whole exome sequencing data from tumor and matched normal.
June 2017: Annals of Applied Statistics
https://www.readbyqxmd.com/read/28959370/integrative-sparse-k-means-with-overlapping-group-lasso-in-genomic-applications-for-disease-subtype-discovery
#13
Zhiguang Huo, George Tseng
Cancer subtype discovery is the first step in delivering personalized medicine to cancer patients. With the accumulation of massive multi-level omics datasets and established biological knowledge databases, omics data integration with incorporation of rich existing biological knowledge is essential for deciphering the biological mechanisms behind complex diseases. In this manuscript, we propose an integrative sparse K-means (is-K means) approach to discover disease subtypes with the guidance of prior biological knowledge via sparse overlapping group lasso...
June 2017: Annals of Applied Statistics
https://www.readbyqxmd.com/read/28943991/improving-efficiency-in-biomarker-incremental-value-evaluation-under-two-phase-designs
#14
Yingye Zheng, Marshall Brown, Anna Lok, Tianxi Cai
Cost-effective yet efficient designs are critical to the success of biomarker evaluation research. Two-phase sampling designs, under which expensive markers are only measured on a subsample of cases and non-cases within a prospective cohort, are useful in novel biomarker studies for preserving study samples and minimizing cost of biomarker assaying. Statistical methods for quantifying the predictiveness of biomarkers under two-phase studies have been proposed (Cai and Zheng, 2012; Liu, Cai and Zheng, 2012)...
June 2017: Annals of Applied Statistics
https://www.readbyqxmd.com/read/29743963/a-mixed-effects-model-for-incomplete-data-from-labeling-based-quantitative-proteomics-experiments
#15
Lin S Chen, Jiebiao Wang, Xianlong Wang, Pei Wang
In mass spectrometry (MS) based quantitative proteomics research, the emerging iTRAQ (isobaric tag for relative and absolute quantitation) and TMT (tandem mass tags) techniques have been widely adopted for high throughput protein profiling. In a typical iTRAQ/TMT proteomics study, samples are grouped into batches, and each batch is processed by one multiplex experiment, in which the abundances of thousands of proteins/peptides in a batch of samples can be measured simultaneously. The multiplex labeling technique greatly enhances the throughput of protein quantification...
March 2017: Annals of Applied Statistics
https://www.readbyqxmd.com/read/29276550/inference-for-social-network-models-from-egocentrically-sampled-data-with-application-to-understanding-persistent-racial-disparities-in-hiv-prevalence-in-the-us
#16
Pavel N Krivitsky, Martina Morris
Egocentric network sampling observes the network of interest from the point of view of a set of sampled actors, who provide information about themselves and anonymized information on their network neighbors. In survey research, this is often the most practical, and sometimes the only, way to observe certain classes of networks, with the sexual networks that underlie HIV transmission being the archetypal case. Although methods exist for recovering some descriptive network features, there is no rigorous and practical statistical foundation for estimation and inference for network models from such data...
March 2017: Annals of Applied Statistics
https://www.readbyqxmd.com/read/28979611/forecasting-seasonal-influenza-with-a-state-space-sir-model
#17
Dave Osthus, Kyle S Hickmann, Petruţa C Caragea, Dave Higdon, Sara Y Del Valle
Seasonal influenza is a serious public health and societal problem due to its consequences resulting from absenteeism, hospitalizations, and deaths. The overall burden of influenza is captured by the Centers for Disease Control and Prevention's influenza-like illness network, which provides invaluable information about the current incidence. This information is used to provide decision support regarding prevention and response efforts. Despite the relatively rich surveillance data and the recurrent nature of seasonal influenza, forecasting the timing and intensity of seasonal influenza in the U...
March 2017: Annals of Applied Statistics
https://www.readbyqxmd.com/read/28572869/covariate-adaptive-clustering-of-exposures-for-air-pollution-epidemiology-cohorts
#18
Joshua P Keller, Mathias Drton, Timothy Larson, Joel D Kaufman, Dale P Sandler, Adam A Szpiro
Cohort studies in air pollution epidemiology aim to establish associations between health outcomes and air pollution exposures. Statistical analysis of such associations is complicated by the multivariate nature of the pollutant exposure data as well as the spatial misalignment that arises from the fact that exposure data are collected at regulatory monitoring network locations distinct from cohort locations. We present a novel clustering approach for addressing this challenge. Specifically, we present a method that uses geographic covariate information to cluster multi-pollutant observations and predict cluster membership at cohort locations...
March 2017: Annals of Applied Statistics
https://www.readbyqxmd.com/read/28408966/gene-network-reconstruction-using-global-local-shrinkage-priors
#19
Gwenaël G R Leday, Mathisca C M de Gunst, Gino B Kpogbezan, Aad W van der Vaart, Wessel N van Wieringen, Mark A van de Wiel
Reconstructing a gene network from high-throughput molecular data is an important but challenging task, as the number of parameters to estimate is easily much larger than the sample size. A conventional remedy is to regularize or penalize the model likelihood. In network models, this is often done locally in the neighbourhood of each node or gene. However, estimation of the many regularization parameters is often difficult and can result in large statistical uncertainties. In this paper we propose to combine local regularization with global shrinkage of the regularization parameters to borrow strength between genes and improve inference...
March 2017: Annals of Applied Statistics
https://www.readbyqxmd.com/read/29399241/bayesian-large-scale-multiple-regression-with-summary-statistics-from-genome-wide-association-studies
#20
Xiang Zhu, Matthew Stephens
Bayesian methods for large-scale multiple regression provide attractive approaches to the analysis of genome-wide association studies (GWAS). For example, they can estimate heritability of complex traits, allowing for both polygenic and sparse models; and by incorporating external genomic data into the priors, they can increase power and yield new biological insights. However, these methods require access to individual genotypes and phenotypes, which are often not easily available. Here we provide a framework for performing these analyses without individual-level data...
2017: Annals of Applied Statistics