Read by QxMD


Bin Liu, Li Yao, Dapeng Han
Classification is an important part of resident space object (RSO) identification, which is a main focus of space situational awareness. Because some features are absent owing to limited and uncertain observations, RSO classification remains a difficult task. In this paper, an ontology for RSO classification named OntoStar is built upon domain knowledge and machine learning rules. Data describing RSOs are then represented using OntoStar. A demo shows how an RSO is classified based on OntoStar. The demo also shows that traceable and comprehensible reasons for the classification can be given, so the classification can be checked and validated...
2016: SpringerPlus
L Sun, J-C Xu, W Wang, Y Yin
Cancer subtype recognition and feature selection are important problems in the diagnosis and treatment of tumors. Here, we propose a novel gene selection approach applied to gene expression data classification. First, two classical feature reduction methods including locally linear embedding (LLE) and rough set (RS) are summarized. The advantages and disadvantages of these algorithms were analyzed and an optimized model for tumor gene selection was developed based on LLE and neighborhood RS (NRS). Bhattacharyya distance was introduced to delete irrelevant genes, pair-wise redundant analysis was performed to remove strongly correlated genes, and the wavelet soft threshold was determined to eliminate noise in the gene datasets...
August 30, 2016: Genetics and Molecular Research: GMR
Nivethitha Somu, M R Gauthama Raman, Kannan Kirthivasan, V S Shankar Sriram
The impact of the internet and information systems across various domains has resulted in the substantial generation of multidimensional datasets. The use of data mining and knowledge discovery techniques to extract the original information contained in multidimensional datasets plays a significant role in exploiting the full benefit they provide. The presence of a large number of features in high-dimensional datasets incurs a high computational cost in terms of computing power and time. Hence, feature selection techniques have been commonly used to build robust machine learning models by selecting a subset of relevant features that preserves the maximal information content of the original dataset...
November 2016: Journal of Medical Systems
Lina Zhang, Chengjin Zhang, Rui Gao, Runtao Yang, Qing Song
Antioxidant proteins perform significant functions in maintaining oxidation/antioxidation balance and have therapeutic potential for some diseases. Accurate identification of antioxidant proteins could contribute to revealing the physiological processes of oxidation/antioxidation balance and to developing novel antioxidation-based drugs. In this study, an ensemble method is presented to predict antioxidant proteins with hybrid features, incorporating SSI (Secondary Structure Information), PSSM (Position Specific Scoring Matrix), RSA (Relative Solvent Accessibility), and CTD (Composition, Transition, Distribution)...
2016: PloS One
Andrzej W Przybyszewski, Mark Kon, Stanislaw Szlufik, Artur Szymanski, Piotr Habela, Dariusz M Koziorowski
We still do not know how the brain and its computations are affected by nerve cell deaths and their compensatory learning processes, as these develop in neurodegenerative diseases (ND). Compensatory learning processes are ND symptoms usually observed at a point when the disease has already affected large parts of the brain. We can register symptoms of ND such as motor and/or mental disorders (dementias) and even provide symptomatic relief, though the structural effects of these are in most cases not yet understood...
2016: Sensors
Daniel Firnkorn, Sebastian Merker, Matthias Ganzinger, Thomas Muley, Petra Knaup
In medical science, modern IT concepts are increasingly important to gather new findings out of complex diseases. Data Warehouses (DWH) as central data repository systems play a key role by providing standardized, high-quality and secure medical data for effective analyses. However, DWHs in medicine must fulfil various requirements concerning data privacy and the ability to describe the complexity of (rare) disease phenomena. Here, i2b2 and tranSMART are free alternatives representing DWH solutions especially developed for medical informatics purposes...
2016: Studies in Health Technology and Informatics
Patrícia Fátima Souza Novais, Thabata Koester Weber, Ney Lemke, Rozangela Verlengia, Alex Harley Crisp, Irineu Rasera-Junior, Maria Rita Marques de Oliveira
This study aimed to investigate the association between twelve gene polymorphisms and body weight loss, 12 months after Roux-en-Y gastric bypass (RYGB) surgery. Three hundred and fifty-one obese women participated in this study. The statistical software WEKA was used to identify which gene polymorphisms were potential predictors of postoperative percentage of excess weight loss (%EWL). Our results indicate that the only gene polymorphism that predicted %EWL was rs3813929, which is related to the serotonin receptor gene (5-HT2C)...
August 20, 2016: Obesity Research & Clinical Practice
Daniel F Polan, Samuel L Brady, Robert A Kaufman
There is a need for robust, fully automated whole body organ segmentation for diagnostic CT. This study investigates and optimizes a Random Forest algorithm for automated organ segmentation; explores the limitations of a Random Forest algorithm applied to the CT environment; and demonstrates segmentation accuracy in a feasibility study of pediatric and adult patients. To the best of our knowledge, this is the first study to investigate a trainable Weka segmentation (TWS) implementation using Random Forest machine-learning as a means to develop a fully automated tissue segmentation tool developed specifically for pediatric and adult examinations in a diagnostic CT environment...
September 7, 2016: Physics in Medicine and Biology
Joana Diz, Goreti Marreiros, Alberto Freitas
In breast cancer research, more than ever, new computer-aided diagnosis systems are being developed with the aim of reducing false positives in diagnostic tests. In this work, we present a data mining based approach that might support oncologists in the process of breast cancer classification and diagnosis. The present study aims to compare two breast cancer datasets and find the best methods for predicting benign/malignant lesions, for breast density classification, and even for finding identification (mass / microcalcification distinction)...
September 2016: Journal of Medical Systems
Lilly Aswathy, Radhakrishnan S Jisha, Vijay H Masand, Jayant M Gajbhiye, Indira G Shibi
In this work, an attempt was made to propose new leads based on the natural scaffold Thiaplakortone-A active against malaria. The 2D QSAR studies suggested that three descriptors correlate with the anti-malarial activity with an R(2) value of 0.814. Robustness, reliability and predictive power of the model were tested by internal validation, external validation, Y-scrambling and Applicability domain analysis. HQSAR studies were carried out as an additional tool to find the sub-structural fingerprints. The CoMFA and CoMSIA models gave Q(2) values of 0...
August 5, 2016: Journal of Biomolecular Structure & Dynamics
Ferhat Demirci, Pinar Akan, Tuncay Kume, Ali Riza Sisman, Zubeyde Erbayraktar, Suleyman Sevinc
OBJECTIVES: In the field of laboratory medicine, minimizing errors and establishing standardization is only possible by predefined processes. The aim of this study was to build an experimental decision algorithm model open to improvement that would efficiently and rapidly evaluate the results of biochemical tests with critical values by evaluating multiple factors concurrently. METHODS: The experimental model was built by Weka software (Weka, Waikato, New Zealand) based on the artificial neural network method...
August 2016: American Journal of Clinical Pathology
Gloria Guerra-Jiménez, Ángel Ramos De Miguel, Juan Carlos Falcón González, Silvia Andrea Borkoski Barreiro, Daniel Pérez Plasencia, Ángel Ramos Macías
OBJECTIVE: Prediction of speech recognition (SR) and quality of life (QoL) outcomes after cochlear implantation is one of the most important challenges for otologists. By sifting through very large amounts of data, data mining reveals trends, patterns, and relationships that might otherwise have remained undetected. There are identifiable pre-implantational factors that condition the cochlear implantation outcome. Our objective is to design a data mining system to predict and classify cochlear implant (CI) predictable benefits in terms of SR and QoL in each patient...
April 2016: Journal of International Advanced Otology
Manuel Galli, Italo Zoppis, Andrew Smith, Fulvio Magni, Giancarlo Mauri
INTRODUCTION: Despite the unquestionable advantages of Matrix-Assisted Laser Desorption/Ionization Mass Spectrometry Imaging in visualizing the spatial distribution and the relative abundance of biomolecules directly on-tissue, the yielded data is complex and high dimensional. Therefore, analysis and interpretation of this huge amount of information is mathematically, statistically and computationally challenging. AREAS COVERED: This article reviews some of the challenges in data elaboration with particular emphasis on machine learning techniques employed in clinical applications, and can be useful in general as an entry point for those who want to study the computational aspects...
July 2016: Expert Review of Proteomics
Kanupriya Tiwari, Salma Jamal, Sonam Grover, Sukriti Goyal, Aditi Singh, Abhinav Grover
BACKGROUND: Tuberculosis is the second leading cause of death from an infectious disease worldwide after HIV, which motivates the ongoing efforts in anti-tuberculosis research. The rising number of cases of infection by resistant forms of M. tuberculosis has given impetus to the development of novel drugs that have different targets and mechanisms of action against the bacterium. METHODS: In this study, we have used machine learning algorithms on the available high throughput screening data of inhibitors of fructose bisphosphate aldolase, an enzyme central to the glycolysis pathway in M...
June 9, 2016: Combinatorial Chemistry & High Throughput Screening
Diana Hodyna, Vasyl Kovalishyn, Sergiy Rogalsky, Volodymyr Blagodatnyi, Larisa Metelytsia
Quantitative structure-activity relationships (QSAR) of imidazolium ionic liquids (ILs) as inhibitors of C. albicans collection strains (IOA-109, KCTC 1940, ATCC 10231) have been studied. Predictive QSAR models were built using different descriptor sets for a set of 88 ionic liquids with known minimum inhibitory concentrations (MIC) against C. albicans. We applied the state-of-the-art QSAR methodologies such as WEKA Random Forest (RF) as a binary classifier, Associative Neural Networks (ASNN) and k-Nearest Neighbors (k-NN) to build continuum non-linear regression models...
2016: Current Drug Discovery Technologies
Indira G Shibi, Lilly Aswathy, Radhakrishnan S Jisha, Vijay H Masand, Jayant M Gajbhiye
Malaria parasites show resistance to most of the antimalarial drugs and hence developing antimalarials which can act on multitargets rather than a single target will be a promising strategy of drug design. Here we report a new approach by which virtual screening of 292 unique phytochemicals present in 72 traditionally important herbs is used for finding out inhibitors of plasmepsin-2 and falcipain-2 for antimalarial activity against P. falciparum. Initial screenings of the selected molecules by Random Forest algorithm model of Weka using the bioassay datasets AID 504850 and AID 2302 screened 120 out of the total 292 phytochemicals to be active against the targets...
2016: Combinatorial Chemistry & High Throughput Screening
Diana Hodyna, Vasyl Kovalishyn, Sergiy Rogalsky, Volodymyr Blagodatnyi, Kirill Petko, Larisa Metelytsia
Predictive QSAR models for the inhibitors of B. subtilis and Ps. aeruginosa among imidazolium-based ionic liquids were developed using literature data. The regression QSAR models were created through Artificial Neural Network and k-nearest neighbor procedures. The classification QSAR models were constructed using the WEKA-RF (random forest) method. The predictive ability of the models was tested by fivefold cross-validation, giving q(2) = 0.77-0.92 for regression models and accuracy of 83-88% for classification models...
September 2016: Chemical Biology & Drug Design
Georgios Drakakis, Saadiq Moledina, Charalampos Chomenidis, Philip Doganis, Haralambos Sarimveis
Decision trees are renowned in the computational chemistry and machine learning communities for their interpretability. Their capacity and usage are somewhat limited by the fact that they normally work on categorical data. Improvements to known decision tree algorithms are usually carried out by increasing and tweaking parameters, as well as the post-processing of the class assignment. In this work we attempted to tackle both these issues. Firstly, conditional mutual information was used as the criterion for selecting the attribute on which to split instances...
2016: Combinatorial Chemistry & High Throughput Screening
Irene Ruiz Hidalgo, Pablo Rodriguez, Jos J Rozema, Sorcha Ní Dhubhghaill, Nadia Zakaria, Marie-José Tassignon, Carina Koppen
PURPOSE: To evaluate the performance of a support vector machine algorithm that automatically and objectively identifies corneal patterns based on a combination of 22 parameters obtained from Pentacam measurements and to compare this method with other known keratoconus (KC) classification methods. METHODS: Pentacam data from 860 eyes were included in the study and divided into 5 groups: 454 KC, 67 forme fruste (FF), 28 astigmatic, 117 after refractive surgery (PR), and 194 normal eyes (N)...
June 2016: Cornea
Tony C Smith, Eibe Frank
This chapter presents an introduction to data mining with machine learning. It gives an overview of various types of machine learning, along with some examples. It explains how to download, install, and run the WEKA data mining toolkit on a simple data set, then proceeds to explain how one might approach a bioinformatics problem. Finally, it includes a brief summary of machine learning algorithms for other types of data mining problems, and provides suggestions about where to find additional information.
2016: Methods in Molecular Biology
