Read by QxMD icon Read

BioData Mining

Jun Hwa Lee, Byong Chul Yoo, Yun Hwan Kim, Sun-A Ahn, Seung-Gu Yeo, Jae Youl Cho, Kyung-Hee Kim, Seung Cheol Kim
BACKGROUND: A low-mass-ion discriminant equation (LOME) was constructed to investigate whether systematic low-mass-ion (LMI) profiling could be applied to ovarian cancer (OVC) screening. RESULTS: Matrix-assisted laser desorption/ionization-time of flight (MALDI-TOF) mass spectrometry was performed to obtain mass spectral data on metabolites detected as LMIs up to a mass-to-charge ratio (m/z) of 2500 for 1184 serum samples collected from healthy individuals and patients with OVC, other types of cancer, or several types of benign tumor...
2016: BioData Mining
Wajdi Dhifli, Abdoulaye Baniré Diallo
BACKGROUND: Studying the functions and structures of proteins is important for understanding the molecular mechanisms of life. The number of publicly available protein structures has become extremely large. Still, the classification of a protein structure remains a difficult, costly, and time-consuming task. The difficulties are often due to the essential role of spatial and topological structures in the classification of protein structures. RESULTS: We propose ProtNN, a novel classification approach for protein 3D-structures...
2016: BioData Mining
Spiros C Denaxas, Folkert W Asselbergs, Jason H Moore
Modern cohort studies include self-reported measures on disease, behavior and lifestyle, sensor-based observations from mobile phones and wearables, and rich -omics data. Follow-up is often achieved through electronic health record (EHR) linkages across primary and secondary healthcare providers. Historically however, researchers typically only get to see the tip of the iceberg: coded administrative data relating to healthcare claims which mainly record billable diagnoses and procedures. The rich data generated during the clinical pathway remain submerged and inaccessible...
2016: BioData Mining
Nicola Lazzarini, Paweł Widera, Stuart Williamson, Rakesh Heer, Natalio Krasnogor, Jaume Bacardit
BACKGROUND: Functional networks play an important role in the analysis of biological processes and systems. The inference of these networks from high-throughput (-omics) data is an area of intense research. So far, the similarity-based inference paradigm (e.g. gene co-expression) has been the most popular approach. It assumes a functional relationship between genes which are expressed at similar levels across different samples. An alternative to this paradigm is the inference of relationships from the structure of machine learning models...
2016: BioData Mining
Carrie Colleen Buchanan Moore, Anna Okula Basile, John Robert Wallace, Alex Thomas Frase, Marylyn DeRiggi Ritchie
BACKGROUND: BioBin is a bioinformatics software package developed to automate the process of binning rare variants into groups for statistical association analysis using a biological knowledge-driven framework. BioBin collapses variants into biological features such as genes, pathways, evolutionary conserved regions (ECRs), protein families, regulatory regions, and others based on user-designated parameters. BioBin provides the infrastructure to create complex and interesting hypotheses in an automated fashion thereby circumventing the necessity for advanced and time consuming scripting...
2016: BioData Mining
Pau M Muñoz-Torres, Filip Rokć, Robert Belužic, Ivana Grbeša, Oliver Vugrek
BACKGROUND: Mass spectrometry (MS) comprises a group of high-throughput techniques used to increase knowledge about biomolecules. These techniques produce a large amount of data, which is presented as a list of hundreds or thousands of proteins. Filtering those data efficiently is the first step for extracting biologically relevant information. The filtering may increase interest by merging previous data with the data obtained from public databases, resulting in an accurate list of proteins which meet the predetermined conditions...
2016: BioData Mining
Jennifer Chang, Hyejin Cho, Hui-Hsien Chou
BACKGROUND: Heterogeneous biological data such as sequence matches, gene expression correlations, protein-protein interactions, and biochemical pathways can be merged and analyzed via graphs, or networks. Existing software for network analysis has limited scalability to large data sets or is only accessible to software developers as libraries. In addition, the polymorphic nature of the data sets requires a more standardized method for integration and exploration. RESULTS: Mango facilitates large network analyses with its Graph Exploration Language, automatic graph attribute handling, and real-time 3-dimensional visualization...
2016: BioData Mining
Gordon Okimoto, Ashkan Zeinalzadeh, Tom Wenska, Michael Loomis, James B Nation, Tiphaine Fabre, Maarit Tiirikainen, Brenda Hernandez, Owen Chan, Linda Wong, Sandi Kwee
BACKGROUND: Technological advances enable the cost-effective acquisition of Multi-Modal Data Sets (MMDS) composed of measurements for multiple, high-dimensional data types obtained from a common set of bio-samples. The joint analysis of the data matrices associated with the different data types of a MMDS should provide a more focused view of the biology underlying complex diseases such as cancer that would not be apparent from the analysis of a single data type alone. As multi-modal data rapidly accumulate in research laboratories and public databases such as The Cancer Genome Atlas (TCGA), the translation of such data into clinically actionable knowledge has been slowed by the lack of computational tools capable of analyzing MMDSs...
2016: BioData Mining
Artem Lysenko, Irina A Roznovăţ, Mansoor Saqi, Alexander Mazein, Christopher J Rawlings, Charles Auffray
BACKGROUND: Systems biology experiments generate large volumes of data of multiple modalities and this information presents a challenge for integration due to a mix of complexity together with rich semantics. Here, we describe how graph databases provide a powerful framework for storage, querying and envisioning of biological data. RESULTS: We show how graph databases are well suited for the representation of biological information, which is typically highly connected, semi-structured and unpredictable...
2016: BioData Mining
Y-H Taguchi
BACKGROUND: The recently proposed principal component analysis (PCA) based unsupervised feature extraction (FE) has successfully been applied to various bioinformatics problems ranging from biomarker identification to the screening of disease causing genes using gene expression/epigenetic profiles. However, the conditions required for its successful use and the mechanisms involved in how it outperforms other supervised methods are unknown, because PCA based unsupervised FE has only been applied to challenging (i...
2016: BioData Mining
Riku Louhimo, Marko Laakso, Denis Belitskin, Juha Klefström, Rainer Lehtonen, Sampsa Hautaniemi
BACKGROUND: Genomic alterations affecting drug target proteins occur in several tumor types and are prime candidates for patient-specific tailored treatments. Increasingly, patients likely to benefit from targeted cancer therapy are selected based on molecular alterations. The selection of a precision therapy benefiting most patients is challenging but can be enhanced with integration of multiple types of molecular data. Data integration approaches for drug prioritization have successfully integrated diverse molecular data but do not take full advantage of existing data and literature...
2016: BioData Mining
Katherine Icay, Ping Chen, Alejandra Cervera, Ville Rantanen, Rainer Lehtonen, Sampsa Hautaniemi
BACKGROUND: Large-scale sequencing experiments are complex and require a wide spectrum of computational tools to extract and interpret relevant biological information. This is especially true in projects where individual processing and integrated analysis of both small RNA and complementary RNA data is needed. Such studies would benefit from a computational workflow that is easy to implement and standardizes the processing and analysis of both sequenced data types. RESULTS: We developed SePIA (Sequence Processing, Integration, and Analysis), a comprehensive small RNA and RNA workflow...
2016: BioData Mining
Yile Zhang, Yau Shu Wong, Jian Deng, Cristina Anton, Stephan Gabos, Weiping Zhang, Dorothy Yu Huang, Can Jin
BACKGROUND: Real Time Cell Analysis (RTCA) technology is used to monitor cellular changes continuously over the entire exposure period. Combined with different testing concentrations, the profiles have potential in probing the mode of action (MOA) of the testing substances. RESULTS: In this paper, we present machine learning approaches for MOA assessment. Computational tools based on artificial neural network (ANN) and support vector machine (SVM) are developed to analyze the time-concentration response curves (TCRCs) of human cell lines responding to tested chemicals...
2016: BioData Mining
Ruowang Li, Scott M Dudek, Dokyoon Kim, Molly A Hall, Yuki Bradford, Peggy L Peissig, Murray H Brilliant, James G Linneman, Catherine A McCarty, Le Bao, Marylyn D Ritchie
BACKGROUND: The future of medicine is moving towards the phase of precision medicine, with the goal to prevent and treat diseases by taking inter-individual variability into account. A large part of the variability lies in our genetic makeup. With the fast paced improvement of high-throughput methods for genome sequencing, a tremendous amount of genetics data have already been generated. The next hurdle for precision medicine is to have sufficient computational tools for analyzing large sets of data...
2016: BioData Mining
Maha Soliman, Olfa Nasraoui, Nigel G F Cooper
BACKGROUND: The volume of biomedical literature and its underlying knowledge base is rapidly expanding, making it beyond the ability of a single human being to read through all the literature. Several automated methods have been developed to help make sense of this dilemma. The present study reports on the results of a text mining approach to extract gene interactions from the data warehouse of published experimental results which are then used to benchmark an interaction network associated with glaucoma...
2016: BioData Mining
Franco Milicchio, Rebecca Rose, Jiang Bian, Jae Min, Mattia Prosperi
BACKGROUND: High-throughput or next-generation sequencing (NGS) technologies have become an established and affordable experimental framework in biological and medical sciences for all basic and translational research. Processing and analyzing NGS data is challenging. NGS data are big, heterogeneous, sparse, and error prone. Although a plethora of tools for NGS data analysis has emerged in the past decade, (i) software development is still lagging behind data generation capabilities, and (ii) there is a 'cultural' gap between the end user and the developer...
2016: BioData Mining
Jason H Moore, John H Holmes
Biomedical informatics has become a central focus for many academic medical centers and universities as biomedical research becomes increasingly reliant on the processing, analysis, and interpretation of large volumes of data, information, and knowledge. We posit here that this is the beginning of the golden era of biomedical informatics with opportunity for this maturing discipline to have a substantial impact on the biggest questions and challenges facing efforts to improve human health and the healthcare system...
2016: BioData Mining
Jing Li, James D Malley, Angeline S Andrew, Margaret R Karagas, Jason H Moore
BACKGROUND: Identifying gene-gene interactions is essential to understand disease susceptibility and to detect genetic architectures underlying complex diseases. Here, we aimed at developing a permutation-based methodology relying on a machine learning method, random forest (RF), to detect gene-gene interactions. Our approach, called permuted random forest (pRF), identified the top interacting single nucleotide polymorphism (SNP) pairs by estimating how much the power of a random forest classification model is influenced by removing pairwise interactions...
2016: BioData Mining
Ma Liang, Castle Raley, Xin Zheng, Geetha Kutty, Emile Gogineni, Brad T Sherman, Qiang Sun, Xiongfong Chen, Thomas Skelly, Kristine Jones, Robert Stephens, Bin Zhou, William Lau, Calvin Johnson, Tomozumi Imamichi, Minkang Jiang, Robin Dewar, Richard A Lempicki, Bao Tran, Joseph A Kovacs, Da Wei Huang
BACKGROUND: Gene isoforms are commonly found in both prokaryotes and eukaryotes. Since each isoform may perform a specific function in response to changing environmental conditions, studying the dynamics of gene isoforms is important in understanding biological processes and disease conditions. However, genome-wide identification of gene isoforms is technically challenging due to the high degree of sequence identity among isoforms. The traditional targeted sequencing approach, involving Sanger sequencing of plasmid-cloned PCR products, has low throughput and is very tedious and time-consuming...
2016: BioData Mining
Minjun Huang, Britney E Graham, Ge Zhang, Reed Harder, Nuri Kodaman, Jason H Moore, Louis Muglia, Scott M Williams
Genetic studies of human diseases have identified many variants associated with pathogenesis and severity. However, most studies have used only statistical association to assess putative relationships to disease, and ignored other factors for evaluation. For example, evolution is a factor that has shaped disease risk, changing allele frequencies as human populations migrated into and inhabited new environments. Since many common variants differ among populations in frequency, as does disease prevalence, we hypothesized that patterns of disease and population structure, taken together, will inform association studies...
2016: BioData Mining
