BioData Mining

https://www.readbyqxmd.com/read/28031747/the-interaction-network-ontology-supported-modeling-and-mining-of-complex-interactions-represented-with-multiple-keywords-in-biomedical-literature
#1
Arzucan Özgür, Junguk Hur, Yongqun He
BACKGROUND: The Interaction Network Ontology (INO) logically represents biological interactions, pathways, and networks. INO has been demonstrated to be valuable in providing a set of structured ontological terms and associated keywords to support literature mining of gene-gene interactions from biomedical literature. However, previous work using INO focused on single keyword matching, while many interactions are represented with two or more interaction keywords used in combination. METHODS: This paper reports our extension of INO to include combinatory patterns of two or more literature mining keywords co-existing in one sentence to represent specific INO interaction classes...
2016: BioData Mining
https://www.readbyqxmd.com/read/27999618/complex-systems-analysis-of-bladder-cancer-susceptibility-reveals-a-role-for-decarboxylase-activity-in-two-genome-wide-association-studies
#2
Samantha Cheng, Angeline S Andrew, Peter C Andrews, Jason H Moore
BACKGROUND: Bladder cancer is a common disease with a complex etiology that is likely due to many different genetic and environmental factors. The goal of this study was to embrace this complexity using a bioinformatics analysis pipeline designed to use machine learning to measure synergistic interactions between single nucleotide polymorphisms (SNPs) in two genome-wide association studies (GWAS) and then to assess their enrichment within functional groups defined by Gene Ontology. The significance of the results was evaluated using permutation testing and those results that replicated between the two GWAS data sets were reported...
2016: BioData Mining
https://www.readbyqxmd.com/read/27990177/matk-qr-classifier-a-patterns-based-approach-for-plant-species-identification
#3
Ravi Prabhakar More, Rupali Chandrashekhar Mane, Hemant J Purohit
BACKGROUND: DNA barcoding is a widely used and highly efficient approach that facilitates rapid and accurate identification of plant species based on a short standardized segment of the genome. The nucleotide sequences of the maturaseK (matK) and ribulose-1,5-bisphosphate carboxylase (rbcL) marker loci are commonly used in plant species identification. Here, we present a new and highly efficient approach for identifying a unique set of discriminating nucleotide patterns to generate a signature (i...
2016: BioData Mining
https://www.readbyqxmd.com/read/27980679/missel-a-method-to-identify-a-large-number-of-small-species-specific-genomic-subsequences-and-its-application-to-viruses-classification
#4
Giulia Fiscon, Emanuel Weitschek, Eleonora Cella, Alessandra Lo Presti, Marta Giovanetti, Muhammed Babakir-Mina, Marco Ciotti, Massimo Ciccozzi, Alessandra Pierangeli, Paola Bertolazzi, Giovanni Felici
BACKGROUND: Continuous improvements in next generation sequencing technologies led to ever-increasing collections of genomic sequences, which have not been easily characterized by biologists, and whose analysis requires huge computational effort. The classification of species emerged as one of the main applications of DNA analysis and has been addressed with several approaches, e.g., multiple alignments-, phylogenetic trees-, statistical- and character-based methods. RESULTS: We propose a supervised method based on a genetic algorithm to identify small genomic subsequences that discriminate among different species...
2016: BioData Mining
https://www.readbyqxmd.com/read/27980678/adaptive-swarm-cluster-based-dynamic-multi-objective-synthetic-minority-oversampling-technique-algorithm-for-tackling-binary-imbalanced-datasets-in-biomedical-data-classification
#5
Jinyan Li, Simon Fong, Yunsick Sung, Kyungeun Cho, Raymond Wong, Kelvin K L Wong
BACKGROUND: An imbalanced dataset is defined as a training dataset that has imbalanced proportions of data in both interesting and uninteresting classes. Often in biomedical applications, samples from the stimulating class are rare in a population, such as medical anomalies, positive clinical tests, and particular diseases. Because the target samples in the primitive dataset are small in number, the induction of a classification model over such training data leads to poor prediction performance due to insufficient training from the minority class...
2016: BioData Mining
https://www.readbyqxmd.com/read/27891179/compensation-of-feature-selection-biases-accompanied-with-improved-predictive-performance-for-binary-classification-by-using-a-novel-ensemble-feature-selection-approach
#6
Ursula Neumann, Mona Riemenschneider, Jan-Peter Sowa, Theodor Baars, Julia Kälsch, Ali Canbay, Dominik Heider
MOTIVATION: Biomarker discovery methods are essential to identify a minimal subset of features (e.g., serum markers in predictive medicine) that are relevant to develop prediction models with high accuracy. By now, there exist diverse feature selection methods, which are embedded in, combined with, or independent of predictive learning algorithms. Many preceding studies showed the defectiveness of single feature selection results, which cause difficulties for professionals in a variety of fields (e...
2016: BioData Mining
https://www.readbyqxmd.com/read/27833658/considerations-for-higher-efficiency-and-productivity-in-research-activities
#7
EDITORIAL
Diego A Forero, Jason H Moore
There are several factors that are known to affect research productivity; some of them imply the need for large financial investments and others are related to work styles. There are some articles that provide suggestions for early career scientists (PhD students and postdocs) but few publications are oriented to professors about scientific leadership. As academic mentoring might be useful at all levels of experience, in this note we suggest several key considerations for higher efficiency and productivity in academic and research activities...
2016: BioData Mining
https://www.readbyqxmd.com/read/27822312/on-the-evaluation-of-the-fidelity-of-supervised-classifiers-in-the-prediction-of-chimeric-rnas
#8
Sacha Beaumeunier, Jérôme Audoux, Anthony Boureux, Florence Ruffle, Thérèse Commes, Nicolas Philippe, Ronnie Alves
BACKGROUND: High-throughput sequencing technology and bioinformatics have identified chimeric RNAs (chRNAs), raising the possibility that chRNAs expressed particularly in diseases can be used as potential biomarkers in both diagnosis and prognosis. RESULTS: The task of discriminating true chRNAs from the false ones poses an interesting Machine Learning (ML) challenge. First of all, the sequencing data may contain false reads due to technical artifacts, and during the analysis process, bioinformatics tools may generate false positives due to methodological biases...
2016: BioData Mining
https://www.readbyqxmd.com/read/27785153/developing-a-modular-architecture-for-creation-of-rule-based-clinical-diagnostic-criteria
#9
Na Hong, Jyotishman Pathak, Christopher G Chute, Guoqian Jiang
BACKGROUND: With recent advances in computerized patient record systems, there is an urgent need for producing computable and standards-based clinical diagnostic criteria. Notably, constructing rule-based clinical diagnostic criteria has become one of the goals in the International Classification of Diseases (ICD)-11 revision. However, few studies have been done in building a unified architecture to support the need for diagnostic criteria computerization. In this study, we present a modular architecture for enabling the creation of rule-based clinical diagnostic criteria leveraging Semantic Web technologies...
2016: BioData Mining
https://www.readbyqxmd.com/read/27777627/fedrr-fast-exhaustive-detection-of-redundant-hierarchical-relations-for-quality-improvement-of-large-biomedical-ontologies
#10
Guangming Xing, Guo-Qiang Zhang, Licong Cui
BACKGROUND: Redundant hierarchical relations refer to such patterns as two paths from one concept to another, one with length one (direct) and the other with length greater than one (indirect). Each redundant relation represents a possibly unintended defect that needs to be corrected in the ontology quality assurance process. Detecting and eliminating redundant relations would help improve the results of all methods relying on the relevant ontological systems as knowledge source, such as the computation of semantic distance between concepts and for ontology matching and alignment...
2016: BioData Mining
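The definition of a redundant hierarchical relation given in entry #10 can be illustrated with a small sketch. This is not the FEDRR algorithm from the paper; it is a naive check written only to make the definition concrete. It assumes the hierarchy is acyclic and is supplied as a list of directed `(source, target)` pairs, and the function name `find_redundant_relations` is hypothetical.

```python
from collections import defaultdict


def find_redundant_relations(edges):
    """Return direct edges (a, b) for which an indirect path a -> ... -> b also exists.

    Assumes the hierarchy is a directed acyclic graph given as (source, target) pairs.
    Naive illustration only, not the FEDRR method.
    """
    successors = defaultdict(set)
    for source, target in edges:
        successors[source].add(target)

    def reachable_from(start):
        # Depth-first search collecting every node reachable from `start`.
        seen = set()
        stack = [start]
        while stack:
            node = stack.pop()
            for nxt in successors.get(node, ()):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen

    redundant = set()
    for a in list(successors):
        for b in successors[a]:
            # The direct edge a -> b is redundant if b can also be reached
            # through some other direct successor c of a (path length > 1).
            if any(c != b and b in reachable_from(c) for c in successors[a]):
                redundant.add((a, b))
    return redundant


example_edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]
redundant_edges = find_redundant_relations(example_edges)
```

In the example, the direct relation from `A` to `C` is flagged because the indirect path through `B` also connects the two concepts. The repeated searches make this sketch far too slow for large ontologies, which is the scalability problem the paper's title refers to.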
https://www.readbyqxmd.com/read/27752286/low-mass-ion-discriminant-equation-lome-for-ovarian-cancer-screening
#11
Jun Hwa Lee, Byong Chul Yoo, Yun Hwan Kim, Sun-A Ahn, Seung-Gu Yeo, Jae Youl Cho, Kyung-Hee Kim, Seung Cheol Kim
BACKGROUND: A low-mass-ion discriminant equation (LOME) was constructed to investigate whether systematic low-mass-ion (LMI) profiling could be applied to ovarian cancer (OVC) screening. RESULTS: Matrix-assisted laser desorption/ionization-time of flight (MALDI-TOF) mass spectrometry was performed to obtain mass spectral data on metabolites detected as LMIs up to a mass-to-charge ratio (m/z) of 2500 for 1184 serum samples collected from healthy individuals and patients with OVC, other types of cancer, or several types of benign tumor...
2016: BioData Mining
https://www.readbyqxmd.com/read/27688811/protnn-fast-and-accurate-protein-3d-structure-classification-in-structural-and-topological-space
#12
Wajdi Dhifli, Abdoulaye Baniré Diallo
BACKGROUND: Studying the functions and structures of proteins is important for understanding the molecular mechanisms of life. The number of publicly available protein structures has grown extremely large. Still, the classification of a protein structure remains a difficult, costly, and time-consuming task. The difficulties are often due to the essential role of spatial and topological structures in the classification of protein structures. RESULTS: We propose ProtNN, a novel classification approach for protein 3D-structures...
2016: BioData Mining
https://www.readbyqxmd.com/read/27688810/the-tip-of-the-iceberg-challenges-of-accessing-hospital-electronic-health-record-data-for-biological-data-mining
#13
EDITORIAL
Spiros C Denaxas, Folkert W Asselbergs, Jason H Moore
Modern cohort studies include self-reported measures on disease, behavior and lifestyle, sensor-based observations from mobile phones and wearables, and rich -omics data. Follow-up is often achieved through electronic health record (EHR) linkages across primary and secondary healthcare providers. Historically however, researchers typically only get to see the tip of the iceberg: coded administrative data relating to healthcare claims which mainly record billable diagnoses and procedures. The rich data generated during the clinical pathway remain submerged and inaccessible...
2016: BioData Mining
https://www.readbyqxmd.com/read/27597880/functional-networks-inference-from-rule-based-machine-learning-models
#14
Nicola Lazzarini, Paweł Widera, Stuart Williamson, Rakesh Heer, Natalio Krasnogor, Jaume Bacardit
BACKGROUND: Functional networks play an important role in the analysis of biological processes and systems. The inference of these networks from high-throughput (-omics) data is an area of intense research. So far, the similarity-based inference paradigm (e.g. gene co-expression) has been the most popular approach. It assumes a functional relationship between genes which are expressed at similar levels across different samples. An alternative to this paradigm is the inference of relationships from the structure of machine learning models...
2016: BioData Mining
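The similarity-based inference paradigm mentioned in entry #14 can be sketched as follows. The abstract only says that genes expressed at similar levels across samples are assumed to be functionally related; the choice of Pearson correlation, the threshold value, and the names `pearson` and `coexpression_edges` are assumptions made for this illustration, and the sketch does not reproduce the rule-based method the paper proposes.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length lists of expression values."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # A constant profile has no defined correlation; treat it as unrelated.
        return 0.0
    return cov / sqrt(var_x * var_y)


def coexpression_edges(expression, threshold=0.9):
    """Link every pair of genes whose profiles correlate at or above `threshold`.

    `expression` maps a gene name to its expression values across samples.
    """
    edges = []
    for gene_a, gene_b in combinations(sorted(expression), 2):
        r = pearson(expression[gene_a], expression[gene_b])
        if r >= threshold:
            edges.append((gene_a, gene_b, r))
    return edges


# Toy data: g1 and g2 rise together across four samples, g3 does not.
toy_expression = {
    "g1": [1.0, 2.0, 3.0, 4.0],
    "g2": [2.0, 4.0, 6.0, 8.0],
    "g3": [4.0, 1.0, 3.0, 2.0],
}
toy_edges = coexpression_edges(toy_expression)
```

With the toy data, only the pair `g1` and `g2` is linked. Real analyses typically use many more samples and a correction for the number of gene pairs tested.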
https://www.readbyqxmd.com/read/27582876/a-biologically-informed-method-for-detecting-rare-variant-associations
#15
Carrie Colleen Buchanan Moore, Anna Okula Basile, John Robert Wallace, Alex Thomas Frase, Marylyn DeRiggi Ritchie
BACKGROUND: BioBin is a bioinformatics software package developed to automate the process of binning rare variants into groups for statistical association analysis using a biological knowledge-driven framework. BioBin collapses variants into biological features such as genes, pathways, evolutionary conserved regions (ECRs), protein families, regulatory regions, and others based on user-designated parameters. BioBin provides the infrastructure to create complex and interesting hypotheses in an automated fashion thereby circumventing the necessity for advanced and time consuming scripting...
2016: BioData Mining
https://www.readbyqxmd.com/read/27547241/msbiodat-analysis-tool-big-data-analysis-for-high-throughput-experiments
#16
Pau M Muñoz-Torres, Filip Rokć, Robert Belužic, Ivana Grbeša, Oliver Vugrek
BACKGROUND: Mass spectrometry (MS) comprises a group of high-throughput techniques used to increase knowledge about biomolecules. They produce a large amount of data which is presented as a list of hundreds or thousands of proteins. Filtering those data efficiently is the first step for extracting biologically relevant information. The filtering may increase interest by merging previous data with the data obtained from public databases, resulting in an accurate list of proteins which meet the predetermined conditions...
2016: BioData Mining
https://www.readbyqxmd.com/read/27489569/mango-combining-and-analyzing-heterogeneous-biological-networks
#17
Jennifer Chang, Hyejin Cho, Hui-Hsien Chou
BACKGROUND: Heterogeneous biological data such as sequence matches, gene expression correlations, protein-protein interactions, and biochemical pathways can be merged and analyzed via graphs, or networks. Existing software for network analysis has limited scalability to large data sets or is only accessible to software developers as libraries. In addition, the polymorphic nature of the data sets requires a more standardized method for integration and exploration. RESULTS: Mango facilitates large network analyses with its Graph Exploration Language, automatic graph attribute handling, and real-time 3-dimensional visualization...
2016: BioData Mining
https://www.readbyqxmd.com/read/27478503/joint-analysis-of-multiple-high-dimensional-data-types-using-sparse-matrix-approximations-of-rank-1-with-applications-to-ovarian-and-liver-cancer
#18
Gordon Okimoto, Ashkan Zeinalzadeh, Tom Wenska, Michael Loomis, James B Nation, Tiphaine Fabre, Maarit Tiirikainen, Brenda Hernandez, Owen Chan, Linda Wong, Sandi Kwee
BACKGROUND: Technological advances enable the cost-effective acquisition of Multi-Modal Data Sets (MMDS) composed of measurements for multiple, high-dimensional data types obtained from a common set of bio-samples. The joint analysis of the data matrices associated with the different data types of an MMDS should provide a more focused view of the biology underlying complex diseases such as cancer that would not be apparent from the analysis of a single data type alone. As multi-modal data rapidly accumulate in research laboratories and public databases such as The Cancer Genome Atlas (TCGA), the translation of such data into clinically actionable knowledge has been slowed by the lack of computational tools capable of analyzing MMDSs...
2016: BioData Mining
https://www.readbyqxmd.com/read/27462371/representing-and-querying-disease-networks-using-graph-databases
#19
REVIEW
Artem Lysenko, Irina A Roznovăţ, Mansoor Saqi, Alexander Mazein, Christopher J Rawlings, Charles Auffray
BACKGROUND: Systems biology experiments generate large volumes of data of multiple modalities and this information presents a challenge for integration due to a mix of complexity together with rich semantics. Here, we describe how graph databases provide a powerful framework for storage, querying and envisioning of biological data. RESULTS: We show how graph databases are well suited for the representation of biological information, which is typically highly connected, semi-structured and unpredictable...
2016: BioData Mining
https://www.readbyqxmd.com/read/27366210/principal-component-analysis-based-unsupervised-feature-extraction-applied-to-budding-yeast-temporally-periodic-gene-expression
#20
Y-H Taguchi
BACKGROUND: The recently proposed principal component analysis (PCA) based unsupervised feature extraction (FE) has successfully been applied to various bioinformatics problems ranging from biomarker identification to the screening of disease-causing genes using gene expression/epigenetic profiles. However, the conditions required for its successful use and the mechanisms involved in how it outperforms other supervised methods are unknown, because PCA based unsupervised FE has only been applied to challenging (i...
2016: BioData Mining