Papers in the journal Journal of the American Statistical Association (Page 2)

#21

JOURNAL ARTICLE

On Robustness of Individualized Decision Rules.

Zhengling Qi, Jong-Shi Pang, Yufeng Liu

With the emergence of precision medicine, estimating optimal individualized decision rules (IDRs) has attracted tremendous attention in many scientific areas. Most existing literature has focused on finding optimal IDRs that can maximize the expected outcome for each individual. Motivated by complex individualized decision making procedures and the popular conditional value at risk (CVaR) measure, we propose a new robust criterion to estimate optimal IDRs in order to control the average lower tail of the individuals' outcomes...

38143785

2023: Journal of the American Statistical Association

#22

JOURNAL ARTICLE

Generalized Liquid Association Analysis for Multimodal Data Integration.

Lexin Li, Jing Zeng, Xin Zhang

Multimodal data are now prevailing in scientific research. One of the central questions in multimodal integrative analysis is to understand how two data modalities associate and interact with each other given another modality or demographic variables. The problem can be formulated as studying the associations among three sets of random variables, a question that has received relatively less attention in the literature. In this article, we propose a novel generalized liquid association analysis method, which offers a new and unique angle to this important class of problems of studying three-way associations...

38099062

2023: Journal of the American Statistical Association

#23

JOURNAL ARTICLE

Assessing disparities in Americans' exposure to PCBs and PBDEs based on NHANES pooled biomonitoring data.

Yan Liu, Dewei Wang, Li Li, Dingsheng Li

The National Health and Nutrition Examination Survey (NHANES) has been continuously biomonitoring Americans' exposure to two families of harmful environmental chemicals: polychlorinated biphenyls (PCBs) and polybrominated diphenyl ethers (PBDEs). However, biomonitoring these chemicals is expensive. To save cost, in 2005, NHANES resorted to pooled biomonitoring; i.e., amalgamating individual specimens to form a pool and measuring chemical levels from pools. Despite being publicly available, these pooled data gain limited applications in health studies...

38046816

2023: Journal of the American Statistical Association

#24

JOURNAL ARTICLE

Ian Laga, Le Bao, Xiaoyue Niu

Aggregated relational data (ARD), formed from "How many X's do you know?" questions, is a powerful tool for learning important network characteristics with incomplete network data. Compared to traditional survey methods, ARD is attractive as it does not require a sample from the target population and does not ask respondents to self-reveal their own status. This is helpful for studying hard-to-reach populations like female sex workers who may be hesitant to reveal their status. From December 2008 to February 2009, the Kiev International Institute of Sociology (KIIS) collected ARD from 10,866 respondents to estimate the size of HIV-related groups in Ukraine...

37997574

2023: Journal of the American Statistical Association

#25

JOURNAL ARTICLE

Genetic underpinnings of brain structural connectome for young adults.

Yize Zhao, Changgee Chang, Jingwen Zhang, Zhengwu Zhang

With distinct advantages in power over behavioral phenotypes, brain imaging traits have become emerging endophenotypes to dissect molecular contributions to behaviors and neuropsychiatric illnesses. Among different imaging features, brain structural connectivity (i.e., structural connectome) which summarizes the anatomical connections between different brain regions is one of the most cutting edge while under-investigated traits; and the genetic influence on the structural connectome variation remains highly elusive...

37982009

2023: Journal of the American Statistical Association

#26

JOURNAL ARTICLE

A general framework for inference on algorithm-agnostic variable importance.

Brian D Williamson, Peter B Gilbert, Noah R Simon, Marco Carone

In many applications, it is of interest to assess the relative contribution of features (or subsets of features) toward the goal of predicting a response - in other words, to gauge the variable importance of features. Most recent work on variable importance assessment has focused on describing the importance of features within the confines of a given prediction algorithm. However, such assessment does not necessarily characterize the prediction potential of features, and may provide a misleading reflection of the intrinsic value of these features...

37982008

2023: Journal of the American Statistical Association

#27

JOURNAL ARTICLE

Causal Inference in Transcriptome-Wide Association Studies with Invalid Instruments and GWAS Summary Data.

Haoran Xue, Xiaotong Shen, Wei Pan

Transcriptome-wide association studies (TWAS) have recently emerged as a popular tool to discover (putative) causal genes by integrating an outcome GWAS dataset with another gene expression/transcriptome GWAS (called eQTL) dataset. In our motivating and target application, we'd like to identify causal genes for low-density lipoprotein cholesterol (LDL), which is crucial for developing new treatments for hyperlipidemia and cardiovascular diseases. The statistical principle underlying TWAS is (two-sample) two-stage least squares (2SLS) using multiple correlated SNPs as instrumental variables (IVs); it is closely related to typical (two-sample) Mendelian randomization (MR) using independent SNPs as IVs, which is expected to be impractical and lower-powered for TWAS (and some other) applications...

37808547

2023: Journal of the American Statistical Association

#28

JOURNAL ARTICLE

Tukey's Depth for Object Data.

Xiongtao Dai, Sara Lopez-Pintado

We develop a novel exploratory tool for non-Euclidean object data based on data depth, extending the celebrated Tukey's depth for Euclidean data. The proposed metric halfspace depth, applicable to data objects in a general metric space, assigns to data points depth values that characterize the centrality of these points with respect to the distribution and provides an interpretable center-outward ranking. Desirable theoretical properties that generalize standard depth properties postulated for Euclidean data are established for the metric halfspace depth...

37791295

2023: Journal of the American Statistical Association

#29

COMMENT

Discussion of "LESA: Longitudinal Elastic Shape Analysis of Brain Subcortical Structures".

Moo K Chung, Jamie L Hanson, Richard J Davidson, Seth D Pollak

No abstract text is available yet for this article.

37781353

2023: Journal of the American Statistical Association

#30

JOURNAL ARTICLE

Sparse Topic Modeling: Computational Efficiency, Near-Optimal Algorithms, and Statistical Inference.

Ruijia Wu, Linjun Zhang, T Tony Cai

Sparse topic modeling under the probabilistic latent semantic indexing (pLSI) model is studied. Novel and computationally fast algorithms for estimation and inference of both the word-topic matrix and the topic-document matrix are proposed and their theoretical properties are investigated. Both minimax upper and lower bounds are established and the results show that the proposed algorithms are rate-optimal, up to a logarithmic factor. Moreover, a refitting algorithm is proposed to establish asymptotic normality and construct valid confidence intervals for the individual entries of the word-topic and topic-document matrices...

37771513

2023: Journal of the American Statistical Association

#31

JOURNAL ARTICLE

Multifile Partitioning for Record Linkage and Duplicate Detection.

Serge Aleshin-Guendel, Mauricio Sadinle

Merging datafiles containing information on overlapping sets of entities is a challenging task in the absence of unique identifiers, and is further complicated when some entities are duplicated in the datafiles. Most approaches to this problem have focused on linking two files assumed to be free of duplicates, or on detecting which records in a single file are duplicates. However, it is common in practice to encounter scenarios that fit somewhere in between or beyond these two settings. We propose a Bayesian approach for the general setting of multifile record linkage and duplicate detection...

37771512

2023: Journal of the American Statistical Association

#32

JOURNAL ARTICLE

Time-to-Event Analysis with Unknown Time Origins via Longitudinal Biomarker Registration.

Tianhao Wang, Sarah J Ratcliffe, Wensheng Guo

In observational studies, the time origin of interest for time-to-event analysis is often unknown, such as the time of disease onset. Existing approaches to estimating the time origins are commonly built on extrapolating a parametric longitudinal model, which rely on rigid assumptions that can lead to biased inferences. In this paper, we introduce a flexible semiparametric curve registration model. It assumes the longitudinal trajectories follow a flexible common shape function with person-specific disease progression pattern characterized by a random curve registration function, which is further used to model the unknown time origin as a random start time...

37771511

2023: Journal of the American Statistical Association

#33

JOURNAL ARTICLE

Real-Time Regression Analysis of Streaming Clustered Data With Possible Abnormal Data Batches.

Lan Luo, Ling Zhou, Peter X-K Song

This paper develops an incremental learning algorithm based on quadratic inference function (QIF) to analyze streaming datasets with correlated outcomes such as longitudinal data and clustered data. We propose a renewable QIF (RenewQIF) method within a paradigm of renewable estimation and incremental inference, in which parameter estimates are recursively renewed with current data and summary statistics of historical data, but with no use of any historical subject-level raw data. We compare our renewable estimation method with both offline QIF and offline generalized estimating equations (GEE) approach that process the entire cumulative subject-level data all together, and show theoretically and numerically that our renewable procedure enjoys statistical and computational efficiency...

37771510

2023: Journal of the American Statistical Association

#34

JOURNAL ARTICLE

Orthogonalized Kernel Debiased Machine Learning for Multimodal Data Analysis.

Xiaowu Dai, Lexin Li

Multimodal imaging has transformed neuroscience research. While it presents unprecedented opportunities, it also imposes serious challenges. Particularly, it is difficult to combine the merits of the interpretability attributed to a simple association model with the flexibility achieved by a highly adaptive nonlinear model. In this article, we propose an orthogonalized kernel debiased machine learning approach, which is built upon the Neyman orthogonality and a form of decomposition orthogonality, for multimodal data analysis...

37771509

2023: Journal of the American Statistical Association

#35

JOURNAL ARTICLE

Multivariate Temporal Point Process Regression.

Xiwei Tang, Lexin Li

Point process modeling is gaining increasing attention, as point process type data are emerging in a large variety of scientific applications. In this article, motivated by a neuronal spike trains study, we propose a novel point process regression model, where both the response and the predictor can be a high-dimensional point process. We model the predictor effects through the conditional intensities using a set of basis transferring functions in a convolutional fashion. We organize the corresponding transferring coefficients in the form of a three-way tensor, then impose the low-rank, sparsity, and subgroup structures on this coefficient tensor...

37519438

2023: Journal of the American Statistical Association

#36

JOURNAL ARTICLE

Feature Screening for Interval-Valued Response with Application to Study Association between Posted Salary and Required Skills.

Wei Zhong, Chen Qian, Wanjun Liu, Liping Zhu, Runze Li

It is important to quantify the differences in returns to skills using the online job advertisements data, which have attracted great interest in both labor economics and statistics fields. In this paper, we study the relationship between the posted salary and the job requirements in online labor markets. There are two challenges to deal with. First, the posted salary is always presented in an interval-valued form, for example, 5k-10k yuan per month. Simply taking the mid-point or the lower bound as the alternative for salary may result in biased estimators...

37448462

2023: Journal of the American Statistical Association

#37

JOURNAL ARTICLE

iProMix: A mixture model for studying the function of ACE2 based on bulk proteogenomic data.

Xiaoyu Song, Jiayi Ji, Pei Wang

Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has caused over six million deaths in the ongoing COVID-19 pandemic. SARS-CoV-2 uses ACE2 protein to enter human cells, raising a pressing need to characterize proteins/pathways interacted with ACE2. Large-scale proteomic profiling technology is not mature at single-cell resolution to examine the protein activities in disease-relevant cell types. We propose iProMix, a novel statistical framework to identify epithelial-cell specific associations between ACE2 and other proteins/pathways with bulk proteomic data...

37409267

2023: Journal of the American Statistical Association

#38

JOURNAL ARTICLE

Statistical Inference for High-Dimensional Generalized Linear Models with Binary Outcomes.

T Tony Cai, Zijian Guo, Rong Ma

This paper develops a unified statistical inference framework for high-dimensional binary generalized linear models (GLMs) with general link functions. Both unknown and known design distribution settings are considered. A two-step weighted bias-correction method is proposed for constructing confidence intervals and simultaneous hypothesis tests for individual components of the regression vector. Minimax lower bound for the expected length is established and the proposed confidence intervals are shown to be rate-optimal up to a logarithmic factor...

37366472

2023: Journal of the American Statistical Association

#39

JOURNAL ARTICLE

Communication-Efficient Accurate Statistical Estimation.

Jianqing Fan, Yongyi Guo, Kaizheng Wang

When the data are stored in a distributed manner, direct applications of traditional statistical inference procedures are often prohibitive due to communication costs and privacy concerns. This paper develops and investigates two Communication-Efficient Accurate Statistical Estimators (CEASE), implemented through iterative algorithms for distributed optimization. In each iteration, node machines carry out computation in parallel and communicate with the central processor, which then broadcasts aggregated information to node machines for new updates...

37347088

2023: Journal of the American Statistical Association

#40

JOURNAL ARTICLE

Matching One Sample According to Two Criteria in Observational Studies.

B Zhang, D S Small, K B Lasater, M McHugh, J H Silber, P R Rosenbaum

Multivariate matching has two goals: (i) to construct treated and control groups that have similar distributions of observed covariates, and (ii) to produce matched pairs or sets that are homogeneous in a few key covariates. When there are only a few binary covariates, both goals may be achieved by matching exactly for these few covariates. Commonly, however, there are many covariates, so goals (i) and (ii) come apart, and must be achieved by different means. As is also true in a randomized experiment, similar distributions can be achieved for a high-dimensional covariate, but close pairs can be achieved for only a few covariates...

37347087

2023: Journal of the American Statistical Association
