KDD: Proceedings

https://www.readbyqxmd.com/read/30191079/generalized-score-functions-for-causal-discovery
#1
Biwei Huang, Kun Zhang, Yizhu Lin, Bernhard Schölkopf, Clark Glymour
Discovery of causal relationships from observational data is a fundamental problem. Roughly speaking, there are two types of methods for causal discovery, constraint-based ones and score-based ones. Score-based methods avoid the multiple testing problem and enjoy certain advantages compared to constraint-based ones. However, most of them need strong assumptions on the functional forms of causal mechanisms, as well as on data distributions, which limit their applicability. In practice the precise information of the underlying model class is usually unknown...
August 2018: KDD: Proceedings
https://www.readbyqxmd.com/read/30221026/prep-path-based-relevance-from-a-probabilistic-perspective-in-heterogeneous-information-networks
#2
Yu Shi, Po-Wei Chan, Honglei Zhuang, Huan Gui, Jiawei Han
As a powerful representation paradigm for networked and multi-typed data, the heterogeneous information network (HIN) is ubiquitous. Meanwhile, defining proper relevance measures has always been a fundamental problem and of great pragmatic importance for network mining tasks. Inspired by our probabilistic interpretation of existing path-based relevance measures, we propose to study HIN relevance from a probabilistic perspective. We also identify, from real-world data, and propose to model cross-meta-path synergy, which is a characteristic important for defining path-based HIN relevance and has not been modeled by existing methods...
August 2017: KDD: Proceedings
https://www.readbyqxmd.com/read/29780658/the-selective-labels-problem-evaluating-algorithmic-predictions-in-the-presence-of-unobservables
#3
Himabindu Lakkaraju, Jon Kleinberg, Jure Leskovec, Jens Ludwig, Sendhil Mullainathan
Evaluating whether machines improve on human performance is one of the central questions of machine learning. However, there are many domains where the data is selectively labeled in the sense that the observed outcomes are themselves a consequence of the existing choices of the human decision-makers. For instance, in the context of judicial bail decisions, we observe the outcome of whether a defendant fails to return for their court appearance only if the human judge decides to release the defendant on bail...
August 2017: KDD: Proceedings
https://www.readbyqxmd.com/read/29770258/local-higher-order-graph-clustering
#4
Hao Yin, Austin R Benson, Jure Leskovec, David F Gleich
Local graph clustering methods aim to find a cluster of nodes by exploring a small region of the graph. These methods are attractive because they enable targeted clustering around a given seed node and are faster than traditional global graph clustering methods because their runtime does not depend on the size of the input graph. However, current local graph partitioning methods are not designed to account for the higher-order structures crucial to the network, nor can they effectively handle directed networks...
August 2017: KDD: Proceedings
https://www.readbyqxmd.com/read/29770257/toeplitz-inverse-covariance-based-clustering-of-multivariate-time-series-data
#5
David Hallac, Sagar Vare, Stephen Boyd, Jure Leskovec
Subsequence clustering of multivariate time series is a useful tool for discovering repeated patterns in temporal data. Once these patterns have been discovered, seemingly complicated datasets can be interpreted as a temporal sequence of only a small number of states, or clusters. For example, raw sensor data from a fitness-tracking application can be expressed as a timeline of a select few actions (i.e., walking, sitting, running). However, discovering these patterns is challenging because it requires simultaneous segmentation and clustering of the time series...
August 2017: KDD: Proceedings
https://www.readbyqxmd.com/read/29770256/network-inference-via-the-time-varying-graphical-lasso
#6
David Hallac, Youngsuk Park, Stephen Boyd, Jure Leskovec
Many important problems can be modeled as a system of interconnected entities, where each entity is recording time-dependent observations or measurements. In order to spot trends, detect anomalies, and interpret the temporal dynamics of such data, it is essential to understand the relationships between the different entities and how these relationships evolve over time. In this paper, we introduce the time-varying graphical lasso (TVGL), a method of inferring time-varying networks from raw time series data...
August 2017: KDD: Proceedings
https://www.readbyqxmd.com/read/29755826/pharmacovigilance-via-baseline-regularization-with-large-scale-longitudinal-observational-data
#7
Zhaobin Kuang, Peggy Peissig, Vítor Santos Costa, Richard Maclin, David Page
Several prominent public health hazards [29] that occurred at the beginning of this century due to adverse drug events (ADEs) have raised international awareness of governments and industries about pharmacovigilance (PhV) [6,7], the science and activities to monitor and prevent adverse events caused by pharmaceutical products after they are introduced to the market. A major data source for PhV is large-scale longitudinal observational databases (LODs) [6] such as electronic health records (EHRs) and medical insurance claim databases...
August 2017: KDD: Proceedings
https://www.readbyqxmd.com/read/29430330/moliere-automatic-biomedical-hypothesis-generation-system
#8
Justin Sybrandt, Michael Shtutman, Ilya Safro
Hypothesis generation is becoming a crucial time-saving technique which allows biomedical researchers to quickly discover implicit connections between important concepts. Typically, these systems operate on domain-specific fractions of public medical data. MOLIERE, in contrast, utilizes information from over 24.5 million documents. At the heart of our approach lies a multi-modal and multi-relational network of biomedical objects extracted from several heterogeneous datasets from the National Center for Biotechnology Information (NCBI)...
August 2017: KDD: Proceedings
https://www.readbyqxmd.com/read/29333328/learning-tree-structured-detection-cascades-for-heterogeneous-networks-of-embedded-devices
#9
Hamid Dadkhahi, Benjamin M Marlin
In this paper, we present a new approach to learning cascaded classifiers for use in computing environments that involve networks of heterogeneous and resource-constrained, low-power embedded compute and sensing nodes. We present a generalization of the classical linear detection cascade to the case of tree-structured cascades where different branches of the tree execute on different physical compute nodes in the network. Different nodes have access to different features, as well as access to potentially different computation and energy resources...
August 2017: KDD: Proceedings
https://www.readbyqxmd.com/read/29071165/federated-tensor-factorization-for-computational-phenotyping
#10
Yejin Kim, Jimeng Sun, Hwanjo Yu, Xiaoqian Jiang
Tensor factorization models offer an effective approach to convert massive electronic health records into meaningful clinical concepts (phenotypes) for data analysis. These models need a large amount of diverse samples to avoid population bias. An open challenge is how to derive phenotypes jointly across multiple hospitals, in which direct patient-level data sharing is not possible (e.g., due to institutional policies). In this paper, we developed a novel solution to enable federated tensor factorization for computational phenotyping without sharing patient-level data...
August 2017: KDD: Proceedings
https://www.readbyqxmd.com/read/28713636/ranking-causal-anomalies-via-temporal-and-dynamical-analysis-on-vanishing-correlations
#11
Wei Cheng, Kai Zhang, Haifeng Chen, Guofei Jiang, Zhengzhang Chen, Wei Wang
The modern world has witnessed a dramatic increase in our ability to collect, transmit and distribute real-time monitoring and surveillance data from large-scale information systems and cyber-physical systems. Detecting system anomalies thus attracts a significant amount of interest in many fields such as security, fault management, and industrial optimization. Recently, the invariant network has been shown to be a powerful way of characterizing complex system behaviours. In the invariant network, a node represents a system component and an edge indicates a stable, significant interaction between two components...
August 2016: KDD: Proceedings
https://www.readbyqxmd.com/read/28580192/fast-component-pursuit-for-large-scale-inverse-covariance-estimation
#12
Lei Han, Yu Zhang, Tong Zhang
The maximum likelihood estimation (MLE) for the Gaussian graphical model, which is also known as the inverse covariance estimation problem, has gained increasing interest recently. Most existing works assume that inverse covariance estimators contain sparse structure and then construct models with the ℓ1 regularization. In this paper, different from existing works, we study the inverse covariance estimation problem from another perspective by efficiently modeling the low-rank structure in the inverse covariance, which is assumed to be a combination of a low-rank part and a diagonal matrix...
August 2016: KDD: Proceedings
https://www.readbyqxmd.com/read/28392970/generalized-hierarchical-sparse-model-for-arbitrary-order-interactive-antigenic-sites-identification-in-flu-virus-data
#13
Lei Han, Yu Zhang, Xiu-Feng Wan, Tong Zhang
Recent statistical evidence has shown that a regression model that incorporates the interactions among the original covariates/features can significantly improve interpretability for biological data. One major challenge is the exponentially expanded feature space when adding high-order feature interactions to the model. To tackle the huge dimensionality, hierarchical sparse models (HSM) are developed by enforcing sparsity under heredity structures in the interactions among the covariates. However, existing methods only consider pairwise interactions, making the discovery of important high-order interactions a non-trivial open problem...
August 2016: KDD: Proceedings
https://www.readbyqxmd.com/read/28316874/computational-drug-repositioning-using-continuous-self-controlled-case-series
#14
Zhaobin Kuang, James Thomson, Michael Caldwell, Peggy Peissig, Ron Stewart, David Page
Computational Drug Repositioning (CDR) is the task of discovering potential new indications for existing drugs by mining large-scale heterogeneous drug-related data sources. Leveraging the patient-level temporal ordering information between numeric physiological measurements and various drug prescriptions provided in Electronic Health Records (EHRs), we propose a Continuous Self-controlled Case Series (CSCCS) model for CDR. As an initial evaluation, we look for drugs that can control Fasting Blood Glucose (FBG) level in our experiments...
August 2016: KDD: Proceedings
https://www.readbyqxmd.com/read/28203486/dynamics-of-large-multi-view-social-networks-synergy-cannibalization-and-cross-view-interplay
#15
Yu Shi, Myunghwan Kim, Shaunak Chatterjee, Mitul Tiwari, Souvik Ghosh, Rómer Rosales
Most social networking services support multiple types of relationships between users, such as getting connected, sending messages, and consuming feed updates. These users and relationships can be naturally represented as a dynamic multi-view network, which is a set of weighted graphs with shared common nodes but having their own respective edges. Different network views, representing structural relationship and interaction types, could have very distinctive properties individually and these properties may change due to interplay across views...
August 2016: KDD: Proceedings
https://www.readbyqxmd.com/read/28180028/squish-near-optimal-compression-for-archival-of-relational-datasets
#16
Yihan Gao, Aditya Parameswaran
Relational datasets are being generated at an alarmingly rapid rate across organizations and industries. Compressing these datasets could significantly reduce storage and archival costs. Traditional compression algorithms, e.g., gzip, are suboptimal for compressing relational datasets since they ignore the table structure and relationships between attributes. We study compression algorithms that leverage the relational structure to compress datasets to a much greater extent. We develop Squish, a system that uses a combination of Bayesian Networks and Arithmetic Coding to capture multiple kinds of dependencies among attributes and achieve near-entropy compression rate...
August 2016: KDD: Proceedings
https://www.readbyqxmd.com/read/28163978/gmove-group-level-mobility-modeling-using-geo-tagged-social-media
#17
Chao Zhang, Keyang Zhang, Quan Yuan, Luming Zhang, Tim Hanratty, Jiawei Han
Understanding human mobility is of great importance to various applications, such as urban planning, traffic scheduling, and location prediction. While there has been fruitful research on modeling human mobility using tracking data (e.g., GPS traces), the recent growth of geo-tagged social media (GeoSM) brings new opportunities to this task because of its sheer size and multi-dimensional nature. Nevertheless, how to obtain quality mobility models from the highly sparse and complex GeoSM data remains a challenge that cannot be readily addressed by existing techniques...
August 2016: KDD: Proceedings
https://www.readbyqxmd.com/read/27853627/interpretable-decision-sets-a-joint-framework-for-description-and-prediction
#18
Himabindu Lakkaraju, Stephen H Bach, Jure Leskovec
One of the most important obstacles to deploying predictive models is the fact that humans do not understand and trust them. Knowing which variables are important in a model's prediction and how they are combined can be very powerful in helping people understand and trust automatic decision making systems. Here we propose interpretable decision sets, a framework for building predictive models that are highly accurate, yet also highly interpretable. Decision sets are sets of independent if-then rules. Because each rule can be applied independently, decision sets are simple, concise, and easily interpretable...
August 2016: KDD: Proceedings
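
The idea in the entry above, that a decision set is a collection of independent if-then rules, can be illustrated with a minimal Python sketch. This is not the authors' learning method: the rules, the feature names (`age`, `smoker`), the labels, and the tie-breaking policy (first matching rule wins) are all hypothetical choices made here for illustration.

```python
from typing import Callable, Dict, List, Tuple

Record = Dict[str, object]
# A rule is an (if-condition, then-label) pair; rules do not depend on each other.
Rule = Tuple[Callable[[Record], bool], str]

# Hypothetical hand-written rules; a real decision set would be learned from data.
EXAMPLE_RULES: List[Rule] = [
    (lambda r: r["age"] > 60 and r["smoker"], "high_risk"),
    (lambda r: r["age"] <= 30, "low_risk"),
]


def matching_labels(rules: List[Rule], record: Record) -> List[str]:
    """Evaluate every rule on its own and collect the labels of those that fire."""
    return [label for condition, label in rules if condition(record)]


def predict(rules: List[Rule], record: Record, default: str) -> str:
    """Return a label for the record.

    Assumption: when several rules fire, the first one in the list wins;
    when none fires, the default label is returned.
    """
    labels = matching_labels(rules, record)
    return labels[0] if labels else default


if __name__ == "__main__":
    print(predict(EXAMPLE_RULES, {"age": 70, "smoker": True}, "unknown"))
    print(predict(EXAMPLE_RULES, {"age": 45, "smoker": False}, "unknown"))
```

Because each rule is checked without reference to the others, a reader can inspect any single rule in isolation, which is the interpretability property the abstract highlights.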
https://www.readbyqxmd.com/read/27853626/node2vec-scalable-feature-learning-for-networks
#19
Aditya Grover, Jure Leskovec
Prediction tasks over nodes and edges in networks require careful effort in engineering features used by learning algorithms. Recent research in the broader field of representation learning has led to significant progress in automating prediction by learning the features themselves. However, present feature learning approaches are not expressive enough to capture the diversity of connectivity patterns observed in networks. Here we propose node2vec, an algorithmic framework for learning continuous feature representations for nodes in networks...
August 2016: KDD: Proceedings
https://www.readbyqxmd.com/read/27747132/batch-model-for-batched-timestamps-data-analysis-with-application-to-the-ssa-disability-program
#20
Qingqi Yue, Ao Yuan, Xuan Che, Minh Huynh, Chunxiao Zhou
The Office of Disability Adjudication and Review (ODAR) is responsible for holding hearings, issuing decisions, and reviewing appeals as part of the Social Security Administration's disability determining process. In order to control and process cases, the ODAR has established a Case Processing and Management System (CPMS) to record management information since December 2003. The CPMS provides a detailed case status history for each case. Due to the large number of appeal requests and limited resources, the number of pending claims at ODAR was over one million cases by March 31, 2015...
August 2016: KDD: Proceedings