We have located links that may give you full text access.
Journal Article
Research Support, N.I.H., Extramural
Research Support, Non-U.S. Gov't
An ontology-guided semantic data integration framework to support integrative data analysis of cancer survival.
BMC Medical Informatics and Decision Making 2018 July 24
BACKGROUND: Cancer is the second leading cause of death in the United States, exceeded only by heart disease. Extant cancer survival analyses have primarily focused on individual-level factors due to limited data availability from a single data source. There is a need to integrate data from different sources to simultaneously study as much risk factors as possible. Thus, we proposed an ontology-based approach to integrate heterogeneous datasets addressing key data integration challenges.
METHODS: Following best practices in ontology engineering, we created the Ontology for Cancer Research Variables (OCRV) adapting existing semantic resources such as the National Cancer Institute (NCI) Thesaurus. Using the global-as-view data integration approach, we created mapping axioms to link the data elements in different sources to OCRV. Implemented upon the Ontop platform, we built a data integration pipeline to query, extract, and transform data in relational databases using semantic queries into a pooled dataset according to the downstream multi-level Integrative Data Analysis (IDA) needs.
RESULTS: Based on our use cases in the cancer survival IDA, we created tailored ontological structures in OCRV to facilitate the data integration tasks. Specifically, we created a flexible framework addressing key integration challenges: (1) using a shared, controlled vocabulary to make data understandable to both human and computers, (2) explicitly modeling the semantic relationships makes it possible to compute and reason with the data, (3) linking patients to contextual and environmental factors through geographic variables, (4) being able to document the data manipulation and integration processes clearly in the ontologies.
CONCLUSIONS: Using an ontology-based data integration approach not only standardizes the definitions of data variables through a common, controlled vocabulary, but also makes the semantic relationships among variables from different sources explicit and clear to all users of the same datasets. Such an approach resolves the ambiguity in variable selection, extraction and integration processes and thus improve reproducibility of the IDA.
METHODS: Following best practices in ontology engineering, we created the Ontology for Cancer Research Variables (OCRV) adapting existing semantic resources such as the National Cancer Institute (NCI) Thesaurus. Using the global-as-view data integration approach, we created mapping axioms to link the data elements in different sources to OCRV. Implemented upon the Ontop platform, we built a data integration pipeline to query, extract, and transform data in relational databases using semantic queries into a pooled dataset according to the downstream multi-level Integrative Data Analysis (IDA) needs.
RESULTS: Based on our use cases in the cancer survival IDA, we created tailored ontological structures in OCRV to facilitate the data integration tasks. Specifically, we created a flexible framework addressing key integration challenges: (1) using a shared, controlled vocabulary to make data understandable to both human and computers, (2) explicitly modeling the semantic relationships makes it possible to compute and reason with the data, (3) linking patients to contextual and environmental factors through geographic variables, (4) being able to document the data manipulation and integration processes clearly in the ontologies.
CONCLUSIONS: Using an ontology-based data integration approach not only standardizes the definitions of data variables through a common, controlled vocabulary, but also makes the semantic relationships among variables from different sources explicit and clear to all users of the same datasets. Such an approach resolves the ambiguity in variable selection, extraction and integration processes and thus improve reproducibility of the IDA.
Full text links
Related Resources
Trending Papers
Challenges in Septic Shock: From New Hemodynamics to Blood Purification Therapies.Journal of Personalized Medicine 2024 Februrary 4
Molecular Targets of Novel Therapeutics for Diabetic Kidney Disease: A New Era of Nephroprotection.International Journal of Molecular Sciences 2024 April 4
The 'Ten Commandments' for the 2023 European Society of Cardiology guidelines for the management of endocarditis.European Heart Journal 2024 April 18
A Guide to the Use of Vasopressors and Inotropes for Patients in Shock.Journal of Intensive Care Medicine 2024 April 14
Get seemless 1-tap access through your institution/university
For the best experience, use the Read mobile app
All material on this website is protected by copyright, Copyright © 1994-2024 by WebMD LLC.
This website also contains material copyrighted by 3rd parties.
By using this service, you agree to our terms of use and privacy policy.
Your Privacy Choices
You can now claim free CME credits for this literature searchClaim now
Get seemless 1-tap access through your institution/university
For the best experience, use the Read mobile app