BioData Mining

https://www.readbyqxmd.com/read/28331548/discovering-feature-relevancy-and-dependency-by-kernel-guided-probabilistic-model-building-evolution
#1
Nestor Rodriguez, Sergio Rojas-Galeano
BACKGROUND: Discovering relevant features (biomarkers) that discriminate etiologies of a disease is useful to provide biomedical researchers with candidate targets for further laboratory experimentation while saving costs; dependencies among biomarkers may suggest additional valuable information, for example, to characterize complex epistatic relationships from genetic data. The use of classifiers to guide the search for biomarkers (the so-called wrapper approach) has been widely studied...
2017: BioData Mining
https://www.readbyqxmd.com/read/28293298/rapid-development-of-entity-based-data-models-for-bioinformatics-with-persistence-object-oriented-design-and-structured-interfaces
#2
Elishai Ezra Tsur
Databases are imperative for research in bioinformatics and computational biology. Current challenges in database design include data heterogeneity and context-dependent interconnections between data entities. These challenges drove the development of unified data interfaces and specialized databases. The curation of specialized databases is an ever-growing challenge due to the introduction of new data sources and the emergence of new relational connections between established datasets. Here, an open-source framework for the curation of specialized databases is proposed...
2017: BioData Mining
https://www.readbyqxmd.com/read/28261328/label-free-data-standardization-for-clinical-metabolomics
#3
Petr G Lokhov, Dmitri L Maslov, Oleg N Kharibin, Elena E Balashova, Alexander I Archakov
BACKGROUND: In metabolomics, thousands of substances can be detected in a single assay. This capacity motivates the development of metabolomics testing, which is currently a very promising option for improving laboratory diagnostics. However, the simultaneous measurement of an enormous number of substances leads to metabolomics data often representing concentrations only in conditional units, while laboratory diagnostics generally require actual concentrations. To convert metabolomics data to actual concentrations, calibration curves need to be generated for each substance, and this process represents a significant challenge due to the number of substances that are present in the metabolomics data...
2017: BioData Mining
https://www.readbyqxmd.com/read/28239419/variant-set-enrichment-an-r-package-to-identify-disease-associated-functional-genomic-regions
#4
Musaddeque Ahmed, Richard C Sallari, Haiyang Guo, Jason H Moore, Housheng Hansen He, Mathieu Lupien
BACKGROUND: Genetic predispositions to diseases populate the noncoding regions of the human genome. Delineating their functional basis can inform on the mechanisms contributing to disease development. However, this remains a challenge due to the poor characterization of the noncoding genome. Here, we propose an R package that can pinpoint which genomic features are etiologically important based on the genetic predispositions. RESULTS: Variant Set Enrichment (VSE) is an R package to calculate the enrichment of a set of disease-associated variants across functionally annotated genomic regions, consequently highlighting the mechanisms important in the etiology of the disease studied...
2017: BioData Mining
https://www.readbyqxmd.com/read/28228844/erratum-to-meta-analytic-support-vector-machine-for-integrating-multiple-omics-data
#5
SungHwan Kim, Jae-Hwan Jhong, JungJun Lee, Ja-Yong Koo
[This corrects the article DOI: 10.1186/s13040-017-0126-8.]
2017: BioData Mining
https://www.readbyqxmd.com/read/28203277/semantics-based-plausible-reasoning-to-extend-the-knowledge-coverage-of-medical-knowledge-bases-for-improved-clinical-decision-support
#6
Hossein Mohammadhassanzadeh, William Van Woensel, Samina Raza Abidi, Syed Sibte Raza Abidi
BACKGROUND: Capturing complete medical knowledge is challenging, often due to incomplete patient Electronic Health Records (EHR), but also because of valuable, tacit medical knowledge hidden away in physicians' experiences. To extend the coverage of incomplete medical knowledge-based systems beyond their deductive closure, and thus enhance their decision-support capabilities, we argue that innovative, multi-strategy reasoning approaches should be applied. In particular, plausible reasoning mechanisms apply patterns from human thought processes, such as generalization, similarity and interpolation, based on attributional, hierarchical, and relational knowledge...
2017: BioData Mining
https://www.readbyqxmd.com/read/28191039/elevated-transcriptional-levels-of-aldolase-a-aldoa-associates-with-cell-cycle-related-genes-in-patients-with-nsclc-and-several-solid-tumors
#7
Fan Zhang, Jie-Diao Lin, Xiao-Yu Zuo, Yi-Xuan Zhuang, Chao-Qun Hong, Guo-Jun Zhang, Xiao-Jiang Cui, Yu-Kun Cui
BACKGROUND: Aldolase A (ALDOA) is one of the glycolytic enzymes primarily found in the developing embryo and adult muscle. Recently, a new role of ALDOA in several cancers has been proposed. However, the underlying mechanism remains obscure and inconsistent. In this study, we investigated ALDOA-associated (AA) genes using available microarray datasets to help elucidate the role of ALDOA in cancer. RESULTS: In the dataset of patients with non-small-cell lung cancer (NSCLC, E-GEOD-19188), 3448 differentially expressed genes (DEGs) including ALDOA were identified, in which 710 AA genes were found to be positively associated with ALDOA...
2017: BioData Mining
https://www.readbyqxmd.com/read/28184252/gene-set-analysis-controlling-for-length-bias-in-rna-seq-experiments
#8
Xing Ren, Qiang Hu, Song Liu, Jianmin Wang, Jeffrey C Miecznikowski
BACKGROUND: In gene set analysis, researchers are interested in determining the gene sets that are significantly correlated with an outcome, e.g. disease status or treatment. With the rapid development of high-throughput sequencing technologies, ribonucleic acid sequencing (RNA-seq) has become an important alternative to traditional expression arrays in gene expression studies. Challenges exist in adapting existing algorithms to RNA-seq data given the intrinsic differences between the technologies and data...
2017: BioData Mining
https://www.readbyqxmd.com/read/28184251/a-feature-selection-method-based-on-multiple-kernel-learning-with-expression-profiles-of-different-types
#9
Wei Du, Zhongbo Cao, Tianci Song, Ying Li, Yanchun Liang
BACKGROUND: With the development of high-throughput technology, researchers can acquire large amounts of expression data of different types from several public databases. Because most of these data have a small number of samples and hundreds or thousands of features, extracting informative features from expression data effectively and robustly using feature selection techniques is challenging and crucial. So far, many feature selection approaches have been proposed and applied to analyse expression data of different types...
2017: BioData Mining
https://www.readbyqxmd.com/read/28168005/mining-pathway-associations-for-disease-related-pathway-activity-analysis-based-on-gene-expression-and-methylation-data
#10
Hyeonjeong Lee, Miyoung Shin
BACKGROUND: The problem of discovering genetic markers as disease signatures is of great significance for the successful diagnosis, treatment, and prognosis of complex diseases. Even if many earlier studies worked on identifying disease markers from a variety of biological resources, they mostly focused on the markers of genes or gene-sets (i.e., pathways). However, these markers may not be enough to explain biological interactions between genetic variables that are related to diseases...
2017: BioData Mining
https://www.readbyqxmd.com/read/28149325/meta-analytic-support-vector-machine-for-integrating-multiple-omics-data
#11
SungHwan Kim, Jae-Hwan Jhong, JungJun Lee, Ja-Yong Koo
BACKGROUND: Of late, high-throughput microarray and sequencing data have been extensively used to monitor biomarkers and biological processes related to many diseases. In this setting, the support vector machine (SVM) has been widely and successfully used for gene selection in many applications. Despite the considerable benefits of SVMs, analysis of a single small- or mid-sized dataset inevitably runs into the problem of low reproducibility and statistical power. To address this problem, we propose a meta-analytic support vector machine (Meta-SVM) that can accommodate multiple omics data, making it possible to detect consensus genes associated with diseases across studies...
2017: BioData Mining
https://www.readbyqxmd.com/read/28127402/accurate-prediction-of-protein-relative-solvent-accessibility-using-a-balanced-model
#12
Wei Wu, Zhiheng Wang, Peisheng Cong, Tonghua Li
BACKGROUND: Protein relative solvent accessibility provides insight into understanding protein structure and function. Prediction of protein relative solvent accessibility is often the first stage of predicting other protein properties. Recent predictors of relative solvent accessibility discriminate against exposed regions as compared with buried regions, resulting in higher prediction accuracy associated with buried regions relative to exposed regions. METHODS: Here, we propose a more accurate and balanced predictor of protein relative solvent accessibility...
2017: BioData Mining
https://www.readbyqxmd.com/read/28031747/the-interaction-network-ontology-supported-modeling-and-mining-of-complex-interactions-represented-with-multiple-keywords-in-biomedical-literature
#13
Arzucan Özgür, Junguk Hur, Yongqun He
BACKGROUND: The Interaction Network Ontology (INO) logically represents biological interactions, pathways, and networks. INO has been demonstrated to be valuable in providing a set of structured ontological terms and associated keywords to support literature mining of gene-gene interactions from biomedical literature. However, previous work using INO focused on single keyword matching, while many interactions are represented with two or more interaction keywords used in combination. METHODS: This paper reports our extension of INO to include combinatory patterns of two or more literature mining keywords co-existing in one sentence to represent specific INO interaction classes...
2016: BioData Mining
https://www.readbyqxmd.com/read/27999618/complex-systems-analysis-of-bladder-cancer-susceptibility-reveals-a-role-for-decarboxylase-activity-in-two-genome-wide-association-studies
#14
Samantha Cheng, Angeline S Andrew, Peter C Andrews, Jason H Moore
BACKGROUND: Bladder cancer is a common disease with a complex etiology that is likely due to many different genetic and environmental factors. The goal of this study was to embrace this complexity using a bioinformatics analysis pipeline designed to use machine learning to measure synergistic interactions between single nucleotide polymorphisms (SNPs) in two genome-wide association studies (GWAS) and then to assess their enrichment within functional groups defined by Gene Ontology. The significance of the results was evaluated using permutation testing, and those results that replicated between the two GWAS data sets were reported...
2016: BioData Mining
https://www.readbyqxmd.com/read/27990177/matk-qr-classifier-a-patterns-based-approach-for-plant-species-identification
#15
Ravi Prabhakar More, Rupali Chandrashekhar Mane, Hemant J Purohit
BACKGROUND: DNA barcoding is a widely used and highly efficient approach that facilitates rapid and accurate identification of plant species based on a short standardized segment of the genome. The nucleotide sequences of the maturaseK (matK) and ribulose-1,5-bisphosphate carboxylase (rbcL) marker loci are commonly used in plant species identification. Here, we present a new and highly efficient approach for identifying a unique set of discriminating nucleotide patterns to generate a signature (i...
2016: BioData Mining
https://www.readbyqxmd.com/read/27980679/missel-a-method-to-identify-a-large-number-of-small-species-specific-genomic-subsequences-and-its-application-to-viruses-classification
#16
Giulia Fiscon, Emanuel Weitschek, Eleonora Cella, Alessandra Lo Presti, Marta Giovanetti, Muhammed Babakir-Mina, Marco Ciotti, Massimo Ciccozzi, Alessandra Pierangeli, Paola Bertolazzi, Giovanni Felici
BACKGROUND: Continuous improvements in next generation sequencing technologies led to ever-increasing collections of genomic sequences, which have not been easily characterized by biologists, and whose analysis requires huge computational effort. The classification of species emerged as one of the main applications of DNA analysis and has been addressed with several approaches, e.g., multiple alignments-, phylogenetic trees-, statistical- and character-based methods. RESULTS: We propose a supervised method based on a genetic algorithm to identify small genomic subsequences that discriminate among different species...
2016: BioData Mining
https://www.readbyqxmd.com/read/27980678/adaptive-swarm-cluster-based-dynamic-multi-objective-synthetic-minority-oversampling-technique-algorithm-for-tackling-binary-imbalanced-datasets-in-biomedical-data-classification
#17
Jinyan Li, Simon Fong, Yunsick Sung, Kyungeun Cho, Raymond Wong, Kelvin K L Wong
BACKGROUND: An imbalanced dataset is defined as a training dataset that has imbalanced proportions of data in both interesting and uninteresting classes. Often in biomedical applications, samples from the stimulating class are rare in a population, such as medical anomalies, positive clinical tests, and particular diseases. Although the target samples in the primitive dataset are small in number, the induction of a classification model over such training data leads to poor prediction performance due to insufficient training from the minority class...
2016: BioData Mining
https://www.readbyqxmd.com/read/27891179/compensation-of-feature-selection-biases-accompanied-with-improved-predictive-performance-for-binary-classification-by-using-a-novel-ensemble-feature-selection-approach
#18
Ursula Neumann, Mona Riemenschneider, Jan-Peter Sowa, Theodor Baars, Julia Kälsch, Ali Canbay, Dominik Heider
MOTIVATION: Biomarker discovery methods are essential to identify a minimal subset of features (e.g., serum markers in predictive medicine) that are relevant to develop prediction models with high accuracy. By now, there exist diverse feature selection methods, which either are embedded, combined, or independent of predictive learning algorithms. Many preceding studies showed the defectiveness of single feature selection results, which cause difficulties for professionals in a variety of fields (e...
2016: BioData Mining
https://www.readbyqxmd.com/read/27833658/considerations-for-higher-efficiency-and-productivity-in-research-activities
#19
EDITORIAL
Diego A Forero, Jason H Moore
There are several factors that are known to affect research productivity; some of them imply the need for large financial investments and others are related to work styles. There are some articles that provide suggestions for early career scientists (PhD students and postdocs) but few publications are oriented to professors about scientific leadership. As academic mentoring might be useful at all levels of experience, in this note we suggest several key considerations for higher efficiency and productivity in academic and research activities...
2016: BioData Mining
https://www.readbyqxmd.com/read/27822312/on-the-evaluation-of-the-fidelity-of-supervised-classifiers-in-the-prediction-of-chimeric-rnas
#20
Sacha Beaumeunier, Jérôme Audoux, Anthony Boureux, Florence Ruffle, Thérèse Commes, Nicolas Philippe, Ronnie Alves
BACKGROUND: High-throughput sequencing technology and bioinformatics have identified chimeric RNAs (chRNAs), raising the possibility that chRNAs expressed particularly in diseases can be used as potential biomarkers in both diagnosis and prognosis. RESULTS: The task of discriminating true chRNAs from false ones poses an interesting Machine Learning (ML) challenge. First of all, the sequencing data may contain false reads due to technical artifacts, and during the analysis process, bioinformatics tools may generate false positives due to methodological biases...
2016: BioData Mining