
Journal of Machine Learning Research: JMLR

https://read.qxmd.com/read/35891979/spatial-multivariate-trees-for-big-data-bayesian-regression
#21
JOURNAL ARTICLE
Michele Peruzzi, David B Dunson
High resolution geospatial data are challenging because standard geostatistical models based on Gaussian processes are known to not scale to large data sizes. While progress has been made towards methods that can be computed more efficiently, considerably less attention has been devoted to methods for large scale data that allow the description of complex relationships between several outcomes recorded at high resolutions by different sensors. Our Bayesian multivariate regression models based on spatial multivariate trees (SpamTrees) achieve scalability via conditional independence assumptions on latent random effects following a treed directed acyclic graph...
2022: Journal of Machine Learning Research: JMLR
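The scalability mechanism described in entry #21, conditional independence of latent effects along a directed acyclic graph, has the generic factorization recalled below. This is the standard DAG form used by Vecchia-type and meshed Gaussian process approximations, shown for context only; it is not the paper's specific treed construction.

```latex
% Generic DAG factorization of latent spatial effects w_1, ..., w_M over blocks:
\[
p(w_1, \ldots, w_M) \;=\; \prod_{m=1}^{M} p\bigl(w_m \mid w_{\mathrm{pa}(m)}\bigr),
\]
% where pa(m) denotes the parents of block m in the (here, treed) directed acyclic
% graph, so each block depends on the others only through its parents.
```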
https://read.qxmd.com/read/34566520/hoeffding-s-inequality-for-general-markov-chains-with-its-applications-to-statistical-learning
#22
JOURNAL ARTICLE
Jianqing Fan, Bai Jiang, Qiang Sun
This paper establishes Hoeffding's lemma and inequality for bounded functions of general-state-space and not necessarily reversible Markov chains. The sharpness of these results is characterized by the optimality of the ratio between variance proxies in the Markov-dependent and independent settings. The boundedness of functions is shown necessary for such results to hold in general. To showcase the usefulness of the new results, we apply them for non-asymptotic analyses of MCMC estimation, respondent-driven sampling and high-dimensional covariance matrix estimation on time series data with a Markovian nature...
August 2021: Journal of Machine Learning Research: JMLR
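For reference alongside entry #22, the classical Hoeffding inequality for independent bounded random variables, which the paper generalizes to general-state-space Markov chains, is recalled below. Only the standard independent-case statement is shown, not the paper's Markov-dependent result.

```latex
% Classical Hoeffding inequality: X_1, ..., X_n independent with X_i in [a_i, b_i].
\[
\Pr\!\left( \left| \sum_{i=1}^{n} \bigl( X_i - \mathbb{E}[X_i] \bigr) \right| \ge t \right)
\;\le\; 2 \exp\!\left( - \frac{2 t^{2}}{\sum_{i=1}^{n} (b_i - a_i)^{2}} \right),
\qquad t > 0.
\]
% The Markov-chain version replaces the independent-case variance proxy with one
% that accounts for the dependence in the chain.
```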
https://read.qxmd.com/read/35321091/a-flexible-model-free-prediction-based-framework-for-feature-ranking
#23
JOURNAL ARTICLE
Jingyi Jessica Li, Yiling Elaine Chen, Xin Tong
Despite the availability of numerous statistical and machine learning tools for joint feature modeling, many scientists investigate features marginally, i.e., one feature at a time. This is partly due to training and convention, but it is also rooted in scientists' strong interest in simple visualization and interpretability. As such, marginal feature ranking for some predictive tasks, e.g., prediction of cancer driver genes, is widely practiced in the process of scientific discoveries. In this work, we focus on marginal ranking for binary classification, one of the most common predictive tasks...
May 2021: Journal of Machine Learning Research: JMLR
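As a toy illustration of marginal feature ranking for binary classification (entry #23), the sketch below ranks features one at a time by their marginal AUC on synthetic data. The AUC criterion and the simulated data are illustrative assumptions, not the model-free framework proposed in the paper.

```python
# Minimal sketch: rank features one at a time by marginal AUC for a binary label.
# Illustrative only; a plain per-feature AUC stands in for the paper's
# prediction-based ranking criterion.
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
n, p = 500, 10
y = rng.integers(0, 2, size=n)
X = rng.normal(size=(n, p))
X[:, 0] += 1.5 * y          # feature 0 is strongly informative
X[:, 1] += 0.5 * y          # feature 1 is weakly informative

# Marginal score per feature: AUC of the raw feature used as a classifier score
# (flipped if below 0.5 so direction does not matter).
scores = [max(roc_auc_score(y, X[:, j]), 1 - roc_auc_score(y, X[:, j])) for j in range(p)]
ranking = np.argsort(scores)[::-1]
print("features ranked by marginal AUC:", ranking)
```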
https://read.qxmd.com/read/37426040/integrative-high-dimensional-multiple-testing-with-heterogeneity-under-data-sharing-constraints
#24
JOURNAL ARTICLE
Molei Liu, Yin Xia, Kelly Cho, Tianxi Cai
Identifying informative predictors in a high dimensional regression model is a critical step for association analysis and predictive modeling. Signal detection in the high dimensional setting often fails due to the limited sample size. One approach to improving power is through meta-analyzing multiple studies which address the same scientific question. However, integrative analysis of high dimensional data from multiple studies is challenging in the presence of between-study heterogeneity. The challenge is even more pronounced with additional data sharing constraints under which only summary data can be shared across different sites...
April 2021: Journal of Machine Learning Research: JMLR
https://read.qxmd.com/read/34650343/inference-for-multiple-heterogeneous-networks-with-a-common-invariant-subspace
#25
JOURNAL ARTICLE
Jesús Arroyo, Avanti Athreya, Joshua Cape, Guodong Chen, Carey E Priebe, Joshua T Vogelstein
The development of models and methodology for the analysis of data from multiple heterogeneous networks is of importance both in statistical network theory and across a wide spectrum of application domains. Although single-graph analysis is well-studied, multiple graph inference is largely unexplored, in part because of the challenges inherent in appropriately modeling graph differences and yet retaining sufficient model simplicity to render estimation feasible. This paper addresses exactly this gap, by introducing a new model, the common subspace independent-edge multiple random graph model, which describes a heterogeneous collection of networks with a shared latent structure on the vertices but potentially different connectivity patterns for each graph...
March 2021: Journal of Machine Learning Research: JMLR
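A toy simulation in the spirit of entry #25: several heterogeneous graphs share an orthonormal basis for their expected adjacency matrices while each has its own score matrix. The dimensions, the way scores are drawn, and the clipping below are illustrative assumptions; no estimation from the paper is reproduced.

```python
# Sketch: heterogeneous graphs sharing a common invariant subspace.
# Each expected adjacency is V @ R_i @ V.T with a shared orthonormal V and a
# graph-specific symmetric score matrix R_i (illustrative construction only).
import numpy as np

rng = np.random.default_rng(1)
n, d, m = 100, 3, 4                            # nodes, subspace dimension, graphs

V, _ = np.linalg.qr(rng.normal(size=(n, d)))   # shared orthonormal basis (n x d)

adjacencies = []
for _ in range(m):
    S = rng.uniform(0.5, 3.0, size=(d, d))
    R = (S + S.T) / 2                          # graph-specific symmetric scores
    P = np.clip(V @ R @ V.T, 0, 1)             # edge probabilities in [0, 1]
    U = rng.uniform(size=(n, n))
    A = (np.triu(U, 1) < np.triu(P, 1)).astype(int)
    adjacencies.append(A + A.T)                # undirected graph, no self-loops
```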
https://read.qxmd.com/read/38476310/adversarial-monte-carlo-meta-learning-of-optimal-prediction-procedures
#26
JOURNAL ARTICLE
Alex Luedtke, Incheoul Chung, Oleg Sofrygin
We frame the meta-learning of prediction procedures as a search for an optimal strategy in a two-player game. In this game, Nature selects a prior over distributions that generate labeled data consisting of features and an associated outcome, and the Predictor observes data sampled from a distribution drawn from this prior. The Predictor's objective is to learn a function that maps from a new feature to an estimate of the associated outcome. We establish that, under reasonable conditions, the Predictor has an optimal strategy that is equivariant to shifts and rescalings of the outcome and is invariant to permutations of the observations and to shifts, rescalings, and permutations of the features...
2021: Journal of Machine Learning Research: JMLR
https://read.qxmd.com/read/38149302/flexible-signal-denoising-via-flexible-empirical-bayes-shrinkage
#27
JOURNAL ARTICLE
Zhengrong Xing, Peter Carbonetto, Matthew Stephens
Signal denoising, also known as non-parametric regression, is often performed through shrinkage estimation in a transformed (e.g., wavelet) domain; shrinkage in the transformed domain corresponds to smoothing in the original domain. A key question in such applications is how much to shrink, or, equivalently, how much to smooth. Empirical Bayes shrinkage methods provide an attractive solution to this problem; they use the data to estimate a distribution of underlying "effects," and hence automatically select an appropriate amount of shrinkage...
2021: Journal of Machine Learning Research: JMLR
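Entry #27 describes shrinkage in a wavelet domain. The sketch below shows the generic version of that pipeline (transform, shrink, invert) with a fixed universal soft threshold; the data-driven empirical Bayes shrinkage that is the paper's contribution is deliberately not implemented.

```python
# Sketch of wavelet-domain shrinkage for signal denoising.
# Uses a fixed "universal" soft threshold; the paper instead estimates the
# amount of shrinkage from the data via empirical Bayes (not shown here).
import numpy as np
import pywt

rng = np.random.default_rng(0)
t = np.linspace(0, 1, 1024)
signal = np.sin(8 * np.pi * t) * (t > 0.3)           # piecewise-smooth truth
noisy = signal + 0.3 * rng.normal(size=t.size)

coeffs = pywt.wavedec(noisy, "db4", level=5)          # to the wavelet domain
sigma = np.median(np.abs(coeffs[-1])) / 0.6745        # noise scale from finest level
lam = sigma * np.sqrt(2 * np.log(noisy.size))         # universal threshold
shrunk = [coeffs[0]] + [pywt.threshold(c, lam, mode="soft") for c in coeffs[1:]]
denoised = pywt.waverec(shrunk, "db4")                # back to the original domain
```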
https://read.qxmd.com/read/37920532/empirical-bayes-matrix-factorization
#28
JOURNAL ARTICLE
Wei Wang, Matthew Stephens
Matrix factorization methods, which include Factor analysis (FA) and Principal Components Analysis (PCA), are widely used for inferring and summarizing structure in multivariate data. Many such methods use a penalty or prior distribution to achieve sparse representations ("Sparse FA/PCA"), and a key question is how much sparsity to induce. Here we introduce a general Empirical Bayes approach to matrix factorization (EBMF), whose key feature is that it estimates the appropriate amount of sparsity by estimating prior distributions from the observed data...
2021: Journal of Machine Learning Research: JMLR
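As a point of reference for entry #28, the sketch below computes a plain rank-K truncated SVD factorization of a data matrix, the non-sparse baseline that EBMF generalizes. The empirical Bayes step of estimating (possibly sparse) priors for the loadings and factors is not reproduced here.

```python
# Baseline matrix factorization via truncated SVD: X ~ L @ F.T with K factors.
# EBMF additionally estimates prior distributions (and hence the amount of
# sparsity) for L and F from the data; only the baseline is shown.
import numpy as np

rng = np.random.default_rng(0)
n, p, K = 200, 50, 3
L_true = rng.normal(size=(n, K))
F_true = rng.normal(size=(p, K))
X = L_true @ F_true.T + 0.5 * rng.normal(size=(n, p))

U, s, Vt = np.linalg.svd(X, full_matrices=False)
L_hat = U[:, :K] * s[:K]                 # estimated loadings
F_hat = Vt[:K].T                         # estimated factors
X_hat = L_hat @ F_hat.T                  # rank-K reconstruction of X
```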
https://read.qxmd.com/read/35935001/inference-for-the-case-probability-in-high-dimensional-logistic-regression
#29
JOURNAL ARTICLE
Zijian Guo, Prabrisha Rakshit, Daniel S Herman, Jinbo Chen
Labeling patients in electronic health records with respect to their status of having a disease or condition, i.e., case or control status, has increasingly relied on prediction models using high-dimensional variables derived from structured and unstructured electronic health record data. A major hurdle currently is a lack of valid statistical inference methods for the case probability. In this paper, considering high-dimensional sparse logistic regression models for prediction, we propose a novel bias-corrected estimator for the case probability through the development of linearization and variance enhancement techniques...
2021: Journal of Machine Learning Research: JMLR
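A minimal sketch related to entry #29: fit an L1-penalized (sparse) logistic regression and read off plug-in case probabilities. The paper's bias-corrected estimator and its confidence intervals are not implemented; the penalty strength and simulated data are illustrative assumptions.

```python
# Sketch: penalized logistic regression for case probabilities in a sparse,
# high-dimensional setting. The paper's bias correction and inference for the
# case probability are NOT implemented here; this is the naive plug-in estimate.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n, p = 300, 500                          # more predictors than samples
beta = np.zeros(p); beta[:5] = 1.0       # sparse true signal
X = rng.normal(size=(n, p))
prob = 1 / (1 + np.exp(-(X @ beta)))
y = rng.binomial(1, prob)

model = LogisticRegression(penalty="l1", solver="liblinear", C=0.5)
model.fit(X, y)
x_new = rng.normal(size=(1, p))
case_prob = model.predict_proba(x_new)[0, 1]   # naive plug-in case probability
```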
https://read.qxmd.com/read/35873072/ldle-low-distortion-local-eigenmaps
#30
JOURNAL ARTICLE
Dhruv Kohli, Alexander Cloninger, Gal Mishne
We present Low Distortion Local Eigenmaps (LDLE), a manifold learning technique which constructs a set of low distortion local views of a data set in lower dimension and registers them to obtain a global embedding. The local views are constructed using the global eigenvectors of the graph Laplacian and are registered using Procrustes analysis. The choice of these eigenvectors may vary across the regions. In contrast to existing techniques, LDLE can embed closed and non-orientable manifolds into their intrinsic dimension by tearing them apart...
January 2021: Journal of Machine Learning Research: JMLR
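Entry #30 registers low-distortion local views using Procrustes analysis. The sketch below shows the core registration step for two overlapping 2-D views via orthogonal Procrustes on their shared points; the views themselves are synthetic and the eigenvector-based construction of the views is not shown.

```python
# Sketch: register one local embedding onto another using the points they share,
# via orthogonal Procrustes (rotation/reflection plus translation).
import numpy as np
from scipy.linalg import orthogonal_procrustes

rng = np.random.default_rng(0)
shared = rng.normal(size=(30, 2))                 # points present in both views

view_a = shared.copy()                            # reference view
theta = 0.7
R_true = np.array([[np.cos(theta), -np.sin(theta)],
                   [np.sin(theta),  np.cos(theta)]])
view_b = shared @ R_true + np.array([2.0, -1.0])  # rotated and shifted copy

# Center both views, solve for the best rotation, then re-apply the translation.
mu_a, mu_b = view_a.mean(0), view_b.mean(0)
R, _ = orthogonal_procrustes(view_b - mu_b, view_a - mu_a)
view_b_aligned = (view_b - mu_b) @ R + mu_a

print("alignment error:", np.abs(view_b_aligned - view_a).max())
```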
https://read.qxmd.com/read/35782785/bayesian-distance-clustering
#31
JOURNAL ARTICLE
Leo L Duan, David B Dunson
Model-based clustering is widely used in a variety of application areas. However, fundamental concerns remain about robustness. In particular, results can be sensitive to the choice of kernel representing the within-cluster data density. Leveraging properties of pairwise differences between data points, we propose a class of Bayesian distance clustering methods, which rely on modeling the likelihood of the pairwise distances in place of the original data. Although some information in the data is discarded, we gain substantial robustness to modeling assumptions...
January 2021: Journal of Machine Learning Research: JMLR
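Entry #31 works with the likelihood of pairwise distances rather than the raw data. The sketch below computes the pairwise-distance matrix and a simple within-cluster average distance for a candidate partition, the kind of summary such distance-based likelihoods are built on; the scoring rule here is illustrative, not the authors' Bayesian model.

```python
# Sketch: summarize a candidate clustering through pairwise distances only.
# Computes the distance matrix and the average within-cluster distance per
# cluster; the Bayesian distance likelihood itself is not shown.
import numpy as np
from scipy.spatial.distance import pdist, squareform

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, size=(50, 2)),
               rng.normal(5, 1, size=(50, 2))])
labels = np.repeat([0, 1], 50)                    # candidate partition

D = squareform(pdist(X))                          # all pairwise distances
within = []
for k in np.unique(labels):
    idx = np.where(labels == k)[0]
    block = D[np.ix_(idx, idx)]
    within.append(block[np.triu_indices(len(idx), 1)].mean())
print("average within-cluster distances:", within)
```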
https://read.qxmd.com/read/35754924/soft-tensor-regression
#32
JOURNAL ARTICLE
Georgia Papadogeorgou, Zhengwu Zhang, David B Dunson
Statistical methods relating tensor predictors to scalar outcomes in a regression model generally vectorize the tensor predictor and estimate the coefficients of its entries employing some form of regularization, use summaries of the tensor covariate, or use a low dimensional approximation of the coefficient tensor. However, low rank approximations of the coefficient tensor can suffer if the true rank is not small. We propose a tensor regression framework which assumes a soft version of the parallel factors (PARAFAC) approximation...
January 2021: Journal of Machine Learning Research: JMLR
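For context on entry #32, the standard rank-R PARAFAC (CP) form of a three-way coefficient tensor is shown below; the paper's "soft" version relaxes this exact low-rank structure, and that relaxation is not reproduced here.

```latex
% Rank-R PARAFAC / CP representation of a three-way coefficient tensor B:
\[
B \;=\; \sum_{r=1}^{R} \beta^{(1)}_{r} \circ \beta^{(2)}_{r} \circ \beta^{(3)}_{r},
\qquad
B_{ijk} \;=\; \sum_{r=1}^{R} \beta^{(1)}_{r,i}\,\beta^{(2)}_{r,j}\,\beta^{(3)}_{r,k},
\]
% so a scalar outcome can be regressed on a tensor covariate X through the
% inner product <X, B> = sum_{ijk} X_{ijk} B_{ijk}.
```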
https://read.qxmd.com/read/35754923/estimating-uncertainty-intervals-from-collaborating-networks
#33
JOURNAL ARTICLE
Tianhui Zhou, Yitong Li, Yuan Wu, David Carlson
Effective decision making requires understanding the uncertainty inherent in a prediction. In regression, this uncertainty can be estimated by a variety of methods; however, many of these methods are laborious to tune, generate overconfident uncertainty intervals, or lack sharpness (give imprecise intervals). We address these challenges by proposing a novel method to capture predictive distributions in regression by defining two neural networks with two distinct loss functions. Specifically, one network approximates the cumulative distribution function, and the second network approximates its inverse...
January 2021: Journal of Machine Learning Research: JMLR
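Entry #33 trains two networks, one approximating the conditional CDF and one its inverse. The sketch below trains only the inverse-CDF (quantile) half with a generic pinball loss on toy heteroscedastic data; the paired CDF network and the paper's specific losses are not reproduced, and the architecture and training choices are assumptions.

```python
# Sketch: a single network approximating the conditional quantile function
# (the inverse CDF), trained with a generic pinball loss. The paper pairs this
# with a second network for the CDF itself; only one half is shown here.
import torch
import torch.nn as nn

class QuantileNet(nn.Module):
    def __init__(self, x_dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(x_dim + 1, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, x, q):                       # q is a quantile level in (0, 1)
        return self.net(torch.cat([x, q], dim=1))

def pinball_loss(y, y_hat, q):
    diff = y - y_hat
    return torch.mean(torch.maximum(q * diff, (q - 1) * diff))

torch.manual_seed(0)
x = torch.rand(512, 1)
y = 2 * x + (0.1 + 0.5 * x) * torch.randn(512, 1)  # heteroscedastic toy data

model = QuantileNet(x_dim=1)
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
for _ in range(500):
    q = torch.rand(512, 1)                         # random quantile levels
    loss = pinball_loss(y, model(x, q), q)
    opt.zero_grad(); loss.backward(); opt.step()

x0 = torch.tensor([[0.5]])
interval = (model(x0, torch.tensor([[0.1]])).item(),
            model(x0, torch.tensor([[0.9]])).item())   # ~80% predictive interval
```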
https://read.qxmd.com/read/35754922/bayesian-time-aligned-factor-analysis-of-paired-multivariate-time-series
#34
JOURNAL ARTICLE
Arkaprava Roy, Jana Schaich Borg, David B Dunson
Many modern data sets require inference methods that can estimate the shared and individual-specific components of variability in collections of matrices that change over time. Promising methods have been developed to analyze these types of data in static cases, but only a few approaches are available for dynamic settings. To address this gap, we consider novel models and inference methods for pairs of matrices in which the columns correspond to multivariate observations at different time points. In order to characterize common and individual features, we propose a Bayesian dynamic factor modeling framework called Time Aligned Common and Individual Factor Analysis (TACIFA) that includes uncertainty in time alignment through an unknown warping function...
January 2021: Journal of Machine Learning Research: JMLR
https://read.qxmd.com/read/35002545/learning-and-planning-for-time-varying-mdps-using-maximum-likelihood-estimation
#35
JOURNAL ARTICLE
Melkior Ornik, Ufuk Topcu
This paper proposes a formal approach to online learning and planning for agents operating in a priori unknown, time-varying environments. The proposed method computes the maximally likely model of the environment, given the observations about the environment made by an agent earlier in the system run and assuming knowledge of a bound on the maximal rate of change of system dynamics. Such an approach generalizes the estimation method commonly used in learning algorithms for unknown Markov decision processes with time-invariant transition probabilities, but is also able to quickly and correctly identify the system dynamics following a change...
2021: Journal of Machine Learning Research: JMLR
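Entry #35 builds on the usual maximum-likelihood estimate of MDP transition probabilities from observed transitions. The sketch below shows that standard time-invariant estimate via normalized transition counts; the time-varying extension and the rate-of-change bound that are the paper's focus are not implemented.

```python
# Sketch: standard MLE of a (time-invariant) MDP transition model from data,
# i.e., normalized transition counts per state-action pair. The paper extends
# this to time-varying dynamics under a bounded rate of change (not shown).
import numpy as np

n_states, n_actions = 4, 2
rng = np.random.default_rng(0)

# Toy trajectory of (state, action, next_state) triples.
transitions = [(rng.integers(n_states), rng.integers(n_actions), rng.integers(n_states))
               for _ in range(1000)]

counts = np.zeros((n_states, n_actions, n_states))
for s, a, s_next in transitions:
    counts[s, a, s_next] += 1

totals = counts.sum(axis=2, keepdims=True)
P_hat = np.divide(counts, totals, out=np.full_like(counts, 1.0 / n_states),
                  where=totals > 0)               # MLE: counts / row totals
```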
https://read.qxmd.com/read/34744522/integrative-generalized-convex-clustering-optimization-and-feature-selection-for-mixed-multi-view-data
#36
JOURNAL ARTICLE
Minjie Wang, Genevera I Allen
In mixed multi-view data, multiple sets of diverse features are measured on the same set of samples. By integrating all available data sources, we seek to discover common group structure among the samples that may be hidden in individualistic cluster analyses of a single data view. While several techniques for such integrative clustering have been explored, we propose and develop a convex formalization that enjoys strong empirical performance and inherits the mathematical properties of increasingly popular convex clustering methods...
January 2021: Journal of Machine Learning Research: JMLR
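Entry #36 extends convex clustering; the commonly used single-view convex clustering objective is recalled below for reference. The integrative multi-view generalization and feature selection developed in the paper are not reproduced, and the notation is the standard one rather than the paper's.

```latex
% Standard convex clustering objective for observations x_1, ..., x_n with
% cluster-center variables u_1, ..., u_n and fusion weights w_{ij} >= 0:
\[
\min_{u_1, \ldots, u_n} \;
\frac{1}{2} \sum_{i=1}^{n} \lVert x_i - u_i \rVert_2^2
\;+\; \gamma \sum_{i < j} w_{ij} \, \lVert u_i - u_j \rVert_q ,
\]
% where gamma >= 0 controls how many centers fuse, and hence the number of clusters.
```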
https://read.qxmd.com/read/34733120/estimation-and-optimization-of-composite-outcomes
#37
JOURNAL ARTICLE
Daniel J Luckett, Eric B Laber, Siyeon Kim, Michael R Kosorok
There is tremendous interest in precision medicine as a means to improve patient outcomes by tailoring treatment to individual characteristics. An individualized treatment rule formalizes precision medicine as a map from patient information to a recommended treatment. A treatment rule is defined to be optimal if it maximizes the mean of a scalar outcome in a population of interest, e.g., symptom reduction. However, clinical and intervention scientists often seek to balance multiple and possibly competing outcomes, e...
January 2021: Journal of Machine Learning Research: JMLR
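For entry #37, the standard definition of an optimal individualized treatment rule in terms of its value function is recalled below, in the usual potential-outcomes notation; the paper's composite-outcome extension is not shown.

```latex
% An individualized treatment rule d maps patient covariates X to a treatment;
% its value is the mean outcome under that rule, and the optimal rule maximizes it:
\[
V(d) \;=\; \mathbb{E}\bigl[\, Y^{*}\!\bigl(d(X)\bigr) \,\bigr],
\qquad
d^{\mathrm{opt}} \;\in\; \arg\max_{d \in \mathcal{D}} V(d),
\]
% where Y*(a) denotes the potential outcome under treatment a.
```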
https://read.qxmd.com/read/34531706/estimation-and-inference-for-high-dimensional-generalized-linear-models-a-splitting-and-smoothing-approach
#38
JOURNAL ARTICLE
Zhe Fei, Yi Li
The focus of modern biomedical studies has gradually shifted to explanation and estimation of joint effects of high dimensional predictors on disease risks. Quantifying uncertainty in these estimates may provide valuable insight into prevention strategies or treatment decisions for both patients and physicians. High dimensional inference, including confidence intervals and hypothesis testing, has sparked much interest. While much work has been done in the linear regression setting, there is a lack of literature on inference for high dimensional generalized linear models...
2021: Journal of Machine Learning Research: JMLR
https://read.qxmd.com/read/33488299/nonparametric-graphical-model-for-counts
#39
JOURNAL ARTICLE
Arkaprava Roy, David B Dunson
Although multivariate count data are routinely collected in many application areas, there is surprisingly little work developing flexible models for characterizing their dependence structure. This is particularly true when interest focuses on inferring the conditional independence graph. In this article, we propose a new class of pairwise Markov random field-type models for the joint distribution of a multivariate count vector. By employing a novel type of transformation, we avoid restricting to non-negative dependence structures or inducing other restrictions through truncations...
December 2020: Journal of Machine Learning Research: JMLR
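Entry #39 builds a pairwise Markov random field for counts; the generic pairwise MRF form is recalled below for context. This is the textbook factorization only, not the transformation-based specification introduced in the paper.

```latex
% Generic pairwise Markov random field over a count vector y = (y_1, ..., y_p):
\[
p(y) \;\propto\; \exp\!\Bigl( \sum_{j=1}^{p} \psi_j(y_j)
\;+\; \sum_{j < k} \psi_{jk}(y_j, y_k) \Bigr),
\]
% where y_j and y_k are conditionally independent given the remaining coordinates
% whenever the pairwise potential psi_{jk} vanishes.
```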
https://read.qxmd.com/read/34557057/learning-from-binary-multiway-data-probabilistic-tensor-decomposition-and-its-statistical-optimality
#40
JOURNAL ARTICLE
Miaoyan Wang, Lexin Li
We consider the problem of decomposing a higher-order tensor with binary entries. Such data problems arise frequently in applications such as neuroimaging, recommendation systems, topic modeling, and sensor network localization. We propose a multilinear Bernoulli model, develop a rank-constrained likelihood-based estimation method, and obtain theoretical accuracy guarantees. In contrast to continuous-valued problems, the binary tensor problem exhibits an interesting phase transition phenomenon according to the signal-to-noise ratio...
July 2020: Journal of Machine Learning Research: JMLR
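To make the multilinear Bernoulli model of entry #40 concrete, the sketch below simulates a binary tensor whose entrywise success probabilities come from a low-rank (CP) parameter tensor passed through a logistic link. The rank, dimensions, and link choice are illustrative assumptions, and the paper's rank-constrained estimation is not performed.

```python
# Sketch: simulate a binary tensor from a low-rank (rank-R CP) parameter tensor
# with a logistic link, i.e., P(Y_ijk = 1) = sigmoid(Theta_ijk). Estimation of
# the rank-constrained model (the paper's focus) is not shown.
import numpy as np

rng = np.random.default_rng(0)
d1, d2, d3, R = 20, 30, 25, 2

A = rng.normal(size=(d1, R))
B = rng.normal(size=(d2, R))
C = rng.normal(size=(d3, R))
Theta = np.einsum("ir,jr,kr->ijk", A, B, C)              # rank-R CP parameter tensor

prob = 1 / (1 + np.exp(-Theta))                          # logistic link
Y = (rng.uniform(size=Theta.shape) < prob).astype(int)   # observed binary tensor
```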