Annals of Applied Statistics
https://www.readbyqxmd.com/read/30220953/topological-data-analysis-of-single-trial-electroencephalographic-signals
#1
Yuan Wang, Hernando Ombao, Moo K Chung
Epilepsy is a neurological disorder that can negatively affect the visual, audial and motor functions of the human brain. Statistical analysis of neurophysiological recordings, such as electroencephalogram (EEG), facilitates the understanding and diagnosis of epileptic seizures. Standard statistical methods, however, do not account for topological features embedded in EEG signals. In the current study, we propose a persistent homology (PH) procedure to analyze single-trial EEG signals. The procedure denoises signals with a weighted Fourier series (WFS), and tests for topological difference between the denoised signals with a permutation test based on their PH features, persistence landscapes (PL)...
September 2018: Annals of Applied Statistics
https://www.readbyqxmd.com/read/30214655/adaptive-weight-burden-test-for-associations-between-quantitative-traits-and-genotype-data-with-complex-correlations
#2
Xiaowei Wu, Ting Guan, Dajiang J Liu, Luis G León Novelo, Dipankar Bandyopadhyay
High-throughput sequencing has often been used to screen samples from pedigrees or with population structure, producing genotype data with complex correlations rendered from both familial relation and linkage disequilibrium. With such data, it is critical to account for these genotypic correlations when assessing the contribution of variants by gene or pathway. Recognizing the limitations of existing association testing methods, we propose Adaptive-weight Burden Test (ABT), a retrospective, mixed-model test for genetic association of quantitative traits on genotype data with complex correlations...
September 2018: Annals of Applied Statistics
https://www.readbyqxmd.com/read/30224943/kernel-penalized-regression-for-analysis-of-microbiome-data
#3
Timothy W Randolph, Sen Zhao, Wade Copeland, Meredith Hullar, Ali Shojaie
The analysis of human microbiome data is often based on dimension-reduced graphical displays and clusterings derived from vectors of microbial abundances in each sample. Common to these ordination methods is the use of biologically motivated definitions of similarity. Principal coordinate analysis, in particular, is often performed using ecologically defined distances, allowing analyses to incorporate context-dependent, non-Euclidean structure. In this paper, we go beyond dimension-reduced ordination methods and describe a framework of high-dimensional regression models that extends these distance-based methods...
March 2018: Annals of Applied Statistics
https://www.readbyqxmd.com/read/30174778/a-unified-statistical-framework-for-single-cell-and-bulk-rna-sequencing-data
#4
Lingxue Zhu, Jing Lei, Bernie Devlin, Kathryn Roeder
Recent advances in technology have enabled the measurement of RNA levels for individual cells. Compared to traditional tissue-level bulk RNA-seq data, single cell sequencing yields valuable insights about gene expression profiles for different cell types, which is potentially critical for understanding many complex human diseases. However, developing quantitative tools for such data remains challenging because of high levels of technical noise, especially the "dropout" events. A "dropout" happens when the RNA for a gene fails to be amplified prior to sequencing, producing a "false" zero in the observed data...
March 2018: Annals of Applied Statistics
https://www.readbyqxmd.com/read/29731955/powerful-test-based-on-conditional-effects-for-genome-wide-screening
#5
Yaowu Liu, Jun Xie
This paper considers testing procedures for screening large genome-wide data, where we examine hundreds of thousands of genetic variants, e.g., single nucleotide polymorphisms (SNPs), on a quantitative phenotype. We screen the whole genome by SNP sets and propose a new test that is based on conditional effects from multiple SNPs. The test statistic is developed for weak genetic effects and incorporates correlations among genetic variables, which may be very high due to linkage disequilibrium. The limiting null distribution of the test statistic and the power of the test are derived...
March 2018: Annals of Applied Statistics
https://www.readbyqxmd.com/read/29731954/msiq-joint-modeling-of-multiple-rna-seq-samples-for-accurate-isoform-quantification
#6
Wei Vivian Li, Anqi Zhao, Shihua Zhang, Jingyi Jessica Li
Next-generation RNA sequencing (RNA-seq) technology has been widely used to assess full-length RNA isoform abundance in a high-throughput manner. RNA-seq data offer insight into gene expression levels and transcriptome structures, enabling us to better understand the regulation of gene expression and fundamental biological processes. Accurate isoform quantification from RNA-seq data is challenging due to the information loss in sequencing experiments. A recent accumulation of multiple RNA-seq data sets from the same tissue or cell type provides new opportunities to improve the accuracy of isoform quantification...
March 2018: Annals of Applied Statistics
https://www.readbyqxmd.com/read/29606991/design-of-vaccine-trials-during-outbreaks-with-and-without-a-delayed-vaccination-comparator
#7
Natalie E Dean, M Elizabeth Halloran, Ira M Longini
Conducting vaccine efficacy trials during outbreaks of emerging pathogens poses particular challenges. The "Ebola ça suffit" trial in Guinea used a novel ring vaccination cluster randomized design to target populations at highest risk of infection. Another key feature of the trial was the use of a delayed vaccination arm as a comparator, in which clusters were randomized to immediate vaccination or vaccination 21 days later. This approach, chosen to improve ethical acceptability of the trial, complicates the statistical analysis as participants in the comparison arm are eventually protected by vaccine...
March 2018: Annals of Applied Statistics
https://www.readbyqxmd.com/read/29515717/a-unified-framework-for-variance-component-estimation-with-summary-statistics-in-genome-wide-association-studies
#8
Xiang Zhou
Linear mixed models (LMMs) are among the most commonly used tools for genetic association studies. However, the standard method for estimating variance components in LMMs, the restricted maximum likelihood estimation method (REML), suffers from several important drawbacks: REML requires individual-level genotypes and phenotypes from all samples in the study, is computationally slow, and produces downward-biased estimates in case-control studies. To remedy these drawbacks, we present an alternative framework for variance component estimation, which we refer to as MQS...
December 2017: Annals of Applied Statistics
https://www.readbyqxmd.com/read/29861820/toward-bayesian-inference-of-the-spatial-distribution-of-proteins-from-three-cube-f%C3%A3-rster-resonance-energy-transfer-data
#9
Jan-Otto Hooghoudt, Margarida Barroso, Rasmus Waagepetersen
Förster resonance energy transfer (FRET) is a quantum-physical phenomenon where energy may be transferred from one molecule to a neighboring molecule if the molecules are close enough. Using fluorophore molecule marking of proteins in a cell, it is possible to measure in microscopic images to what extent FRET takes place between the fluorophores. This provides indirect information about the spatial distribution of the proteins. Questions of particular interest are whether (and if so, to what extent) proteins of possibly different types interact or whether they appear independently of each other...
September 2017: Annals of Applied Statistics
https://www.readbyqxmd.com/read/29721127/latent-space-models-for-multiview-network-data
#10
Michael Salter-Townshend, Tyler H McCormick
Social relationships consist of interactions along multiple dimensions. In social networks, this means that individuals form multiple types of relationships with the same person (e.g., an individual will not trust all of his/her acquaintances). Statistical models for these data require understanding two related types of dependence structure: (i) structure within each relationship type, or network view, and (ii) the association between views. In this paper, we propose a statistical framework that parsimoniously represents dependence between relationship types while also maintaining enough flexibility to allow individuals to serve different roles in different relationship types...
September 2017: Annals of Applied Statistics
https://www.readbyqxmd.com/read/29479394/a-novel-and-efficient-algorithm-for-de-novo-discovery-of-mutated-driver-pathways-in-cancer
#11
Binghui Liu, Chong Wu, Xiaotong Shen, Wei Pan
Next-generation sequencing studies on cancer somatic mutations have discovered that driver mutations tend to appear in most tumor samples, but they barely overlap in any single tumor sample, presumably because a single driver mutation can perturb the whole pathway. Based on the corresponding new concepts of coverage and mutual exclusivity, new methods can be designed for de novo discovery of mutated driver pathways in cancer. Since the computational problem is a combinatorial optimization with an objective function involving a discontinuous indicator function in high dimension, many existing optimization algorithms, such as brute-force enumeration, gradient descent and Newton's method, are practically infeasible or directly inapplicable...
September 2017: Annals of Applied Statistics
https://www.readbyqxmd.com/read/29308102/doubly-robust-estimation-of-optimal-treatment-regimes-for-survival-data-with-application-to-an-hiv-aids-study
#12
Runchao Jiang, Wenbin Lu, Rui Song, Michael G Hudgens, Sonia Napravnik
In many biomedical settings, assigning every patient the same treatment may not be optimal due to patient heterogeneity. Individualized treatment regimes have the potential to dramatically improve clinical outcomes. When the primary outcome is censored survival time, a main interest is to find optimal treatment regimes that maximize the survival probability of patients. Since the survival curve is a function of time, it is important to balance short-term and long-term benefit when assigning treatments. In this paper, we propose a doubly robust approach to estimate optimal treatment regimes that optimize a user-specified function of the survival curve, including the restricted mean survival time and the median survival time...
September 2017: Annals of Applied Statistics
https://www.readbyqxmd.com/read/29152032/latent-class-modeling-using-matrix-covariates-with-application-to-identifying-early-placebo-responders-based-on-eeg-signals
#13
Bei Jiang, Eva Petkova, Thaddeus Tarpey, R Todd Ogden
Latent class models are widely used to identify unobserved subgroups (i.e., latent classes) based upon one or more manifest variables. The probability of belonging to each subgroup is typically modeled as a function of a set of measured covariates. In this paper, we extend existing latent class models to incorporate matrix covariates. This research is motivated by a randomized placebo-controlled depression clinical trial. One study goal is to identify a subgroup of subjects who experience symptom improvement early on during antidepressant treatment, which is considered to be an indication of a placebo rather than a true pharmacological response...
September 2017: Annals of Applied Statistics
https://www.readbyqxmd.com/read/29081874/testing-high-dimensional-covariance-matrices-with-application-to-detecting-schizophrenia-risk-genes
#14
Lingxue Zhu, Jing Lei, Bernie Devlin, Kathryn Roeder
Scientists routinely compare gene expression levels in cases versus controls in part to determine genes associated with a disease. Similarly, detecting case-control differences in co-expression among genes can be critical to understanding complex human diseases; however, statistical methods have been limited by the high-dimensional nature of this problem. In this paper, we construct a sparse-Leading-Eigenvalue-Driven (sLED) test for comparing two high-dimensional covariance matrices. By focusing on the spectrum of the differential matrix, sLED provides a novel perspective that accommodates what we assume to be common, namely sparse and weak signals in gene expression data, and it is closely related to Sparse Principal Component Analysis...
September 2017: Annals of Applied Statistics
https://www.readbyqxmd.com/read/29081873/dynamic-prediction-for-multiple-repeated-measures-and-event-time-data-an-application-to-parkinson-s-disease
#15
Jue Wang, Sheng Luo, Liang Li
In many clinical trials studying neurodegenerative diseases such as Parkinson's disease (PD), multiple longitudinal outcomes are collected to fully explore the multidimensional impairment caused by this disease. If the outcomes deteriorate rapidly, patients may reach a level of functional disability sufficient to initiate levodopa therapy for ameliorating disease symptoms. An accurate prediction of the time to functional disability helps clinicians monitor patients' disease progression and make informed medical decisions...
September 2017: Annals of Applied Statistics
https://www.readbyqxmd.com/read/29250210/quantification-of-multiple-tumor-clones-using-gene-array-and-sequencing-data
#16
Yichen Cheng, James Y Dai, Thomas G Paulson, Xiaoyu Wang, Xiaohong Li, Brian J Reid, Charles Kooperberg
Cancer development is driven by genomic alterations, including copy number aberrations. The detection of copy number aberrations in tumor cells is often complicated by possible contamination of normal stromal cells in tumor samples and intratumor heterogeneity, namely the presence of multiple clones of tumor cells. In order to correctly quantify copy number aberrations, it is critical to successfully de-convolute the complex structure of the genetic information from tumor samples. In this article, we propose a general Bayesian method for estimating copy number aberrations when there are normal cells and potentially more than one tumor clone...
June 2017: Annals of Applied Statistics
https://www.readbyqxmd.com/read/28989557/allele-specific-copy-number-estimation-by-whole-exome-sequencing
#17
Hao Chen, Yuchao Jiang, Kara N Maxwell, Katherine L Nathanson, Nancy Zhang
Whole exome sequencing is currently a technology of choice in large-scale cancer genomics studies, where the priority is to identify cancer-associated variants in coding regions. We describe a method for estimating allele-specific copy number using whole exome sequencing data from tumor and matched normal samples.
June 2017: Annals of Applied Statistics
https://www.readbyqxmd.com/read/28959370/integrative-sparse-k-means-with-overlapping-group-lasso-in-genomic-applications-for-disease-subtype-discovery
#18
Zhiguang Huo, George Tseng
Cancer subtype discovery is the first step toward delivering personalized medicine to cancer patients. With the accumulation of massive multi-level omics datasets and established biological knowledge databases, omics data integration with incorporation of rich existing biological knowledge is essential for deciphering the biological mechanisms behind complex diseases. In this manuscript, we propose an integrative sparse K-means (is-K means) approach to discover disease subtypes with the guidance of prior biological knowledge via sparse overlapping group lasso...
June 2017: Annals of Applied Statistics
https://www.readbyqxmd.com/read/28943991/improving-efficiency-in-biomarker-incremental-value-evaluation-under-two-phase-designs
#19
Yingye Zheng, Marshall Brown, Anna Lok, Tianxi Cai
Cost-effective yet efficient designs are critical to the success of biomarker evaluation research. Two-phase sampling designs, under which expensive markers are only measured on a subsample of cases and non-cases within a prospective cohort, are useful in novel biomarker studies for preserving study samples and minimizing cost of biomarker assaying. Statistical methods for quantifying the predictiveness of biomarkers under two-phase studies have been proposed (Cai and Zheng, 2012; Liu, Cai and Zheng, 2012)...
June 2017: Annals of Applied Statistics
https://www.readbyqxmd.com/read/30100948/static-and-roving-sensor-data-fusion-for-spatio-temporal-hazard-mapping-with-application-to-occupational-exposure-assessment
#20
Guilherme Ludwig, Tingjin Chu, Jun Zhu, Haonan Wang, Kirsten Koehler
Rapid technological advances have drastically improved the data collection capacity in occupational exposure assessment. However, advanced statistical methods for analyzing such data and drawing proper inference remain limited. The objectives of this paper are (1) to provide new spatio-temporal methodology that combines data from both roving and static sensors for data processing and hazard mapping across space and over time in an indoor environment, and (2) to compare the new method with the current industry practice, demonstrating the distinct advantages of the new method and the impact on occupational hazard assessment and future policy making in environmental health as well as occupational health...
March 2017: Annals of Applied Statistics