Read by QxMD

Text mining

Hakan Bulu, Dorothy A Sippo, Janie M Lee, Elizabeth S Burnside, Daniel L Rubin
After years of development, the RadLex terminology contains a large set of controlled terms for the radiology domain, but gaps still exist. We developed a data-driven approach to discover new terms for RadLex by mining a large corpus of radiology reports using natural language processing (NLP) methods. Our system, developed for mammography, discovers new candidate terms by analyzing noun phrases in free-text reports to extend the mammography part of RadLex. Our NLP system extracts noun phrases from free-text mammography reports and classifies these noun phrases as "Has Candidate RadLex Term" or "Does Not Have Candidate RadLex Term...
March 20, 2018: Journal of Digital Imaging: the Official Journal of the Society for Computer Applications in Radiology
Ibrahim O Alanazi, Sami A AlYahya, Esmaeil Ebrahimie, Manijeh Mohammadi-Dehcheshmeh
Exponentially growing scientific knowledge in scientific publications has resulted in the emergence of a new interdisciplinary science of literature mining. In text mining, the machine reads the published literature and transfers the discovered knowledge to mathematical-like formulas. In an integrative approach in this study, we used text mining in combination with network discovery, pathway analysis, and enrichment analysis of genomic regions for better understanding of biomarkers in lung cancer. Particular attention was paid to non-coding biomarkers...
March 16, 2018: Gene
Libing Shen, Qili Shi, Wenyuan Wang
The role of genetic components in cancer development is an area of interest for cancer biologists in general. Intriguingly, some genes have both oncogenic and tumor-suppressor functions. In this study, we systematically identified these genes through database search and text mining. We find that most of them are transcription factors or kinases and exhibit dual biological functions, e.g., that they both positively and negatively regulate transcription in cells. Some cancer types such as leukemia are over-represented by them, whereas some common cancer types such as lung cancer are under-represented by them...
March 13, 2018: Oncogenesis
Igor Mozetič, Luis Torgo, Vitor Cerqueira, Jasmina Smailović
Social media are becoming an increasingly important source of information about the public mood regarding issues such as elections, Brexit, stock market, etc. In this paper we focus on sentiment classification of Twitter data. Construction of sentiment classifiers is a standard text mining task, but here we address the question of how to properly evaluate them as there is no settled way to do so. Sentiment classes are ordered and unbalanced, and Twitter produces a stream of time-ordered data. The problem we address concerns the procedures used to obtain reliable estimates of performance measures, and whether the temporal ordering of the training and test data matters...
2018: PloS One
Mohammed Alsuhaibani, Danushka Bollegala, Takanori Maehara, Ken-Ichi Kawarabayashi
Methods for representing the meaning of words in vector spaces purely using the information distributed in text corpora have proved to be very valuable in various text mining and natural language processing (NLP) tasks. However, these methods still disregard the valuable semantic relational structure between words in co-occurring contexts. These beneficial semantic relational structures are contained in manually-created knowledge bases (KBs) such as ontologies and semantic lexicons, where the meanings of words are represented by defining the various relationships that exist among those words...
2018: PloS One
Juan Ruano, Francisco Gómez-García, Jesús Gay-Mimbrera, Macarena Aguilar-Luque, José Luis Fernández-Rueda, Jesús Fernández-Chaichio, Patricia Alcalde-Mellado, Pedro J Carmona-Fernandez, Juan Luis Sanz-Cabanillas, Isabel Viguera-Guerra, Francisco Franco-García, Manuel Cárdenas-Aranzana, José Luis Hernández Romero, Marcelino Gonzalez-Padilla, Beatriz Isla-Tejera, Antonio Velez Garcia-Nieto
BACKGROUND: Epidemiology and the reporting characteristics of systematic reviews (SRs) and meta-analyses (MAs) are well known. However, no study has analyzed the influence of protocol features on the probability that a study's results will be finally reported, thereby indirectly assessing the reporting bias of International Prospective Register of Systematic Reviews (PROSPERO) registration records. OBJECTIVE: The objective of this study is to explore which factors are associated with a higher probability that results derived from a non-Cochrane PROSPERO registration record for a systematic review will be finally reported as an original article in a scientific journal...
March 9, 2018: Systematic Reviews
H-M Müller, K M Van Auken, Y Li, P W Sternberg
BACKGROUND: The biomedical literature continues to grow at a rapid pace, making the challenge of knowledge retrieval and extraction ever greater. Tools that provide a means to search and mine the full text of literature thus represent an important way by which the efficiency of these processes can be improved. RESULTS: We describe the next generation of the Textpresso information retrieval system, Textpresso Central (TPC). TPC builds on the strengths of the original system by expanding the full text corpus to include the PubMed Central Open Access Subset (PMC OA), as well as the WormBase C...
March 9, 2018: BMC Bioinformatics
Núria Duran Adroher, Birgit Prodinger, Carolina Saskia Fellinghauer, Alan Tennant
OBJECTIVE: To examine the use of the term 'metric' in health and social sciences' literature, focusing on the interval scale implication of the term in Modern Test Theory (MTT). MATERIALS AND METHODS: A systematic search and review on MTT studies including 'metric' or 'interval scale' was performed in the health and social sciences literature. The search was restricted to 2001-2005 and 2011-2015. A Text Mining algorithm was employed to operationalize the eligibility criteria and to explore the uses of 'metric'...
2018: PloS One
Varsha D Badal, Petras J Kundrotas, Ilya A Vakser
BACKGROUND: Structural modeling of protein-protein interactions produces a large number of putative configurations of the protein complexes. Identification of the near-native models among them is a serious challenge. Publicly available results of biomedical research may provide constraints on the binding mode, which can be essential for the docking. Our text-mining (TM) tool, which extracts binding site residues from the PubMed abstracts, was successfully applied to protein docking (Badal et al...
March 5, 2018: BMC Bioinformatics
Fernando Aparicio, María Luz Morales-Botello, Margarita Rubio, Asunción Hernando, Rafael Muñoz, Hugo López-Fernández, Daniel Glez-Peña, Florentino Fdez-Riverola, Manuel de la Villa, Manuel Maña, Diego Gachet, Manuel de Buenaga
BACKGROUND: Student participation and the use of active methodologies in classroom learning are being increasingly emphasized. The use of intelligent systems can be of great help when designing and developing these types of activities. Recently, emerging disciplines such as 'educational data mining' and 'learning analytics and knowledge' have provided clear examples of the importance of the use of artificial intelligence techniques in education. OBJECTIVE: The main objective of this study was to gather expert opinions regarding the benefits of using complementary methods that are supported by intelligent systems, specifically, by intelligent information access systems, when processing texts written in natural language and the benefits of using these methods as companion tools to the learning activities that are employed by biomedical and health sciences teachers...
April 2018: International Journal of Medical Informatics
Kun Hwang, XiaJing Wu
The aim of this study was to determine how many papers have been retracted or withdrawn, and for what reason, in journals relating to plastic surgery. PubMed and SCOPUS were used, with the search terms (retracted OR withdrawn) AND (article OR publication OR paper) AND {(plastic surgery) OR (cosmetic surgery) OR (maxillofacial surgery) OR (craniofacial surgery)}. The papers were analyzed and classified according to the reason for retraction or withdrawal, journal name, publication year, and author. In PubMed and SCOPUS, 227 and 114 titles were found, respectively, from which 34 duplicate titles were removed...
February 23, 2018: Journal of Craniofacial Surgery
Xingyun Hu, Yuyan Yue, Xianjia Peng
As part of a broader study of the environmental geochemistry behavior of vanadium (V), the release kinetics of V from the dissolution of natural vanadium titano-magnetite under environmentally relevant conditions was investigated. In both the acidic and basic domains, the V release rate was found to be proportional to fractional powers of hydrogen ion and dissolved oxygen activities. The dependence of the rate on dissolved oxygen can also be described in terms of the Langmuir adsorption model. The empirical rate equation is given by: r [Formula: see text] where, α=0...
February 2018: Journal of Environmental Sciences (China)
Jon Kirk, Nirav Shah, Braxton Noll, Craig B Stevens, Marshall Lawler, Farah B Mougeot, Jean-Luc C Mougeot
INTRODUCTION: Oral mucositis (OM) is a major dose-limiting side effect of chemotherapy and radiation used in cancer treatment. Due to the complex nature of OM, currently available drug-based treatments are of limited efficacy. OBJECTIVES: Our objectives were (i) to determine genes and molecular pathways associated with OM and wound healing using computational tools and publicly available data and (ii) to identify drugs formulated for topical use targeting the relevant OM molecular pathways...
February 23, 2018: Supportive Care in Cancer: Official Journal of the Multinational Association of Supportive Care in Cancer
Lei Zheng, Li Li, Yun Lu, Fangfang Jiang, Xiu-An Yang
This study investigates transcription factors involved in cisplatin resistance in ovarian cancer cells. The transcriptome of cisplatin-resistant and cisplatin-sensitive A2780 epithelial ovarian cancer cells was obtained from GSE15372. Ovarian transcriptome data GSE62944 was downloaded from TCGA and applied for transcription regulatory network analysis. The analysis results were confirmed using quantitative polymerase chain reaction. The roles of SREBP2 in cisplatin-resistant cells were investigated by RNA interference and cell viability analysis...
January 1, 2018: Experimental Biology and Medicine
Jake R Saklatvala, Nick Dand, Michael A Simpson
The genetic diagnosis of rare monogenic diseases using exome/genome sequencing requires the true causal variant(s) to be identified from tens of thousands of observed variants. Typically a virtual gene panel approach is taken whereby only variants in genes known to cause phenotypes resembling the patient under investigation are considered. With the number of known monogenic gene-disease pairs exceeding 5000, manual curation of personalised virtual panels using exhaustive knowledge of the genetic basis of the human monogenic phenotypic spectrum is challenging...
February 20, 2018: Human Mutation
Albert Park, Mike Conway, Annie T Chen
Objectives: Social media, including online health communities, have become popular platforms for individuals to discuss health challenges and exchange social support with others. These platforms can provide support for individuals who are concerned about social stigma and discrimination associated with their illness. Although mental health conditions can share similar symptoms and even co-occur, the extent to which discussion topics in online mental health communities are similar, different, or overlapping is unknown...
January 2018: Computers in Human Behavior
David Westergaard, Hans-Henrik Stærfeldt, Christian Tønsberg, Lars Juhl Jensen, Søren Brunak
Across academia and industry, text mining has become a popular strategy for keeping up with the rapid growth of the scientific literature. Text mining of the scientific literature has mostly been carried out on collections of abstracts, due to their availability. Here we present an analysis of 15 million English scientific full-text articles published during the period 1823-2016. We describe the development in article length and publication sub-topics during these nearly 250 years. We showcase the potential of text mining by extracting published protein-protein, disease-gene, and protein subcellular associations using a named entity recognition system, and quantitatively report on their accuracy using gold standard benchmark data sets...
February 15, 2018: PLoS Computational Biology
Sarah R Hoffman, Anissa I Vines, Jacqueline R Halladay, Emily Pfaff, Lauren Schiff, Daniel Westreich, Aditi Sundaresan, La-Shell Johnson, Wanda K Nicholson
BACKGROUND: Symptomatic uterine fibroids, due to menorrhagia, pelvic pain, bulk symptoms or infertility, are a source of substantial morbidity for reproductive-age women. Comparing Treatment Options for Uterine Fibroids (COMPARE-UF) is a multi-site registry study to compare the effectiveness of hormonal or surgical fibroid treatments on women's perceptions of their quality of life. Electronic health record (EHR)-based algorithms are able to identify large numbers of women with fibroids, but additional work is needed to develop EHR algorithms that can identify women with symptomatic fibroids to optimize fibroid research...
February 9, 2018: American Journal of Obstetrics and Gynecology
Hang J Kim, Zhenning Yu, Andrew Lawson, Hongyu Zhao, Dongjun Chung, Oliver Stegle
Availability: graph-GPA is implemented as an R package 'GGPA', which is publicly available at [URL omitted]. DDNet, a web interface to query diseases of interest and download a prior disease graph obtained from text mining of biomedical literature, is publicly available at [URL omitted]. Contact: [omitted]. Supplementary information: Supplementary data are available at Bioinformatics online.
February 8, 2018: Bioinformatics
Stuart McTaggart, Clifford Nangle, Jacqueline Caldwell, Samantha Alvarez-Madrazo, Helen Colhoun, Marion Bennie
Background: Efficient generation of structured dose instructions that enable researchers to calculate drug exposure is central to pharmacoepidemiology studies. Our aim was to design and test an algorithm to codify dose instructions, applied to the NHS Scotland Prescribing Information System (PIS) that records about 100 million prescriptions per annum. Methods: A natural language processing (NLP) algorithm was developed that enabled free-text dose instructions to be represented by three attributes - quantity, frequency and qualifier - specified by three, three and two variables, respectively...
February 6, 2018: International Journal of Epidemiology

Search Tips

Use Boolean operators: AND/OR

diabetic AND foot
diabetes OR diabetic

Exclude a word using the 'minus' sign

Virchow -triad

Use Parentheses

water AND (cup OR glass)

Add an asterisk (*) at the end of a word to include word stems

Neuro* will search for Neurology, Neuroscientist, Neurological, and so on

Use quotes to search for an exact phrase

"primary prevention of cancer"
(heart or cardiac or cardio*) AND arrest -"American Heart Association"
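
The tips above can be read as rules for matching a piece of text. The sketch below is only an illustration of those rules in Python, not how the site's search is implemented: the helper names `tokenize`, `has_term`, and `example_query` are hypothetical, and the last example query is written out by hand with Python's own `and`, `or`, and `not` rather than parsed from a query string.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def has_term(text, term):
    """Return True if `text` contains `term`.

    A trailing asterisk matches any word that starts with the stem.
    A term made of several words is treated as an exact phrase.
    """
    words = tokenize(text)
    if term.endswith("*"):
        stem = term[:-1].lower()
        return any(word.startswith(stem) for word in words)
    phrase = tokenize(term)
    if not phrase:
        return False
    n = len(phrase)
    # Slide a window of the phrase length over the text's words.
    return any(words[i:i + n] == phrase for i in range(len(words) - n + 1))


def example_query(text):
    # Hand-written equivalent of:
    # (heart or cardiac or cardio*) AND arrest -"American Heart Association"
    return (
        (has_term(text, "heart")
         or has_term(text, "cardiac")
         or has_term(text, "cardio*"))
        and has_term(text, "arrest")
        and not has_term(text, "American Heart Association")
    )
```

Under these assumptions, a title such as "Outcomes after cardiopulmonary arrest in adults" satisfies the example query through the `cardio*` stem, while a title containing the phrase "American Heart Association" is excluded by the minus sign.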