Journal of Cheminformatics

Wenyi Wang, Xiliang Yan, Linlin Zhao, Daniel P Russo, Shenqing Wang, Yin Liu, Alexander Sedykh, Xiaoli Zhao, Bing Yan, Hao Zhu
To facilitate the development of new nanomaterials, especially nanomedicines, a novel computational approach was developed to precisely predict the hydrophobicity of gold nanoparticles (GNPs). The core of this study was to develop a large virtual gold nanoparticle (vGNP) library with computational nanostructure simulations. Based on the vGNP library, a nanohydrophobicity model was developed and then validated against externally synthesized and tested GNPs. This approach and the resulting model are an efficient and effective universal tool to visualize and predict critical physicochemical properties of new nanomaterials before synthesis, guiding nanomaterial design...
January 18, 2019: Journal of Cheminformatics
Shuangjia Zheng, Xin Yan, Qiong Gu, Yuedong Yang, Yunfei Du, Yutong Lu, Jun Xu
Biogenic compounds are important materials for drug discovery and chemical biology. In this work, we report a quasi-biogenic molecule generator (QBMG) to compose virtual quasi-biogenic compound libraries by means of gated recurrent unit recurrent neural networks. The library includes stereo-chemical properties, which are crucial features of natural products. QBMG can reproduce the property distribution of the underlying training set, while being able to generate realistic, novel molecules outside of the training set...
January 17, 2019: Journal of Cheminformatics
Nicolas Bosc, Francis Atkinson, Eloy Felix, Anna Gaulton, Anne Hersey, Andrew R Leach
Structure-activity relationship modelling is frequently used in the early stage of drug discovery to assess the activity of a compound on one or several targets, and can also be used to assess the interaction of compounds with liability targets. QSAR models have been used for these and related applications over many years, with good success. Conformal prediction is a relatively new QSAR approach that provides information on the certainty of a prediction, and so helps in decision-making. However, it is not always clear how best to make use of this additional information...
January 10, 2019: Journal of Cheminformatics
Wahed Hemati, Alexander Mehler
BACKGROUND: Chemical and biomedical named entity recognition (NER) is an essential preprocessing task in natural language processing. The identification and extraction of named entities from scientific articles is also attracting increasing interest in many scientific disciplines. Locating chemical named entities in the literature is an essential step in chemical text mining pipelines for identifying chemical mentions, their properties, and relations as discussed in the literature. In this work, we describe an approach to the BioCreative V...
January 10, 2019: Journal of Cheminformatics
Yannick Djoumbou-Feunang, Jarlei Fiamoncini, Alberto Gil-de-la-Fuente, Russell Greiner, Claudine Manach, David S Wishart
BACKGROUND: A number of computational tools for metabolism prediction have been developed over the last 20 years to predict the structures of small molecules undergoing biological transformation or environmental degradation. These tools were largely developed to facilitate absorption, distribution, metabolism, excretion, and toxicity (ADMET) studies, although there is now a growing interest in using such tools to facilitate metabolomics and exposomics studies. However, their use and widespread adoption are still hampered by several factors, including their limited scope, breadth of coverage, availability, and performance...
January 5, 2019: Journal of Cheminformatics
Ian A Watson, Jibo Wang, Christos A Nicolaou
The need for synthetic route design arises frequently in discovery-oriented chemistry organizations. While traditionally finding solutions to this problem has been the domain of human experts, several computational approaches, aided by the algorithmic advances and the availability of large reaction collections, have recently been reported. Herein we present our own implementation of a retrosynthetic analysis method and demonstrate its capabilities in an attempt to identify synthetic routes for a collection of approved drugs...
January 3, 2019: Journal of Cheminformatics
Sérgio Matos
The need to efficiently find and extract information from the continuously growing biomedical literature has led to the development of various annotation tools aimed at identifying mentions of entities and relations. Many of these tools have been integrated in user-friendly applications facilitating their use by non-expert text miners and database curators. In this paper we describe the latest version of Neji, a web-services ready text processing and annotation framework. The modular and flexible architecture facilitates adaptation to different annotation requirements, while the built-in web services allow its integration in external tools and text mining pipelines...
December 21, 2018: Journal of Cheminformatics
Daniel Probst, Jean-Louis Reymond
BACKGROUND: Among the various molecular fingerprints available to describe small organic molecules, the extended connectivity fingerprint up to four bonds (ECFP4) performs best in benchmarking drug analog recovery studies, as it encodes substructures with a high level of detail. Unfortunately, ECFP4 requires high dimensional representations (≥ 1024D) to perform well, so ECFP4 nearest neighbor searches in very large databases such as GDB, PubChem or ZINC run very slowly due to the curse of dimensionality...
December 18, 2018: Journal of Cheminformatics
Frédéric Wieber, Alejandro Pisanty, Alexandre Hocquet
The Computational Chemistry List is a mailing list, portal, and community which brings together people interested in computational chemistry, mostly practitioners. It was formed in 1991 and continues to exist as a vibrant discussion space, highly valued by its members, and serving both its original and new functions. Its duration has been unusual for online communities. We analyze some of its characteristics, the reasons for its duration, value, and resilience, the ways it embodies and preceded the affordances of online communities recognized elsewhere long after its foundations, and project some aspects into the future...
December 18, 2018: Journal of Cheminformatics
Ling Luo, Zhihao Yang, Pei Yang, Yin Zhang, Lei Wang, Jian Wang, Hongfei Lin
In biomedical research, patents contain a significant amount of information, and biomedical text mining of patents has recently received much attention. To accelerate the development of biomedical text mining for patents, the BioCreative V.5 challenge organized three tracks, i.e., chemical entity mention recognition (CEMP), gene and protein related object recognition (GPRO) and technical interoperability and performance of annotation servers, to focus on biomedical entity recognition in patents. This paper describes our neural network approach for the CEMP and GPRO tracks...
December 18, 2018: Journal of Cheminformatics
Po-Ting Lai, Ming-Siang Huang, Ting-Hao Yang, Wen-Lian Hsu, Richard Tzong-Han Tsai
The large number of chemical and pharmaceutical patents has attracted researchers doing biomedical text mining to extract valuable information such as chemicals, genes and gene products. To facilitate gene and gene product annotations in patents, BioCreative V.5 organized a gene- and protein-related object (GPRO) recognition task, in which participants were assigned to identify GPRO mentions and determine whether they could be linked to their unique biological database records. In this paper, we describe the system constructed for this task...
December 17, 2018: Journal of Cheminformatics
Jeffrey Plante, Stephane Werner
The partition coefficient between octanol and water (logP) has been an important descriptor in QSAR predictions for many years, and therefore the prediction of logP has been examined countless times. One of the best-performing approaches is to predict logP using multiple methods and average the results. We have used those averaged predictions to develop a training set that distils the information present across the disparate logP methods into one single model. Our model was built using extendable atom-types, where each atom is distilled down into a 6-digit number, and each individual atom is assumed to have a small additive effect on the overall logP of the molecule...
December 14, 2018: Journal of Cheminformatics
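The additive atom-type idea in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' model: the 6-digit codes, the contribution values, and the function names `consensus_logp` and `additive_logp` are hypothetical, chosen only to show averaging several predictions and summing per-atom contributions.

```python
from statistics import mean
from typing import Dict, Iterable, Sequence


def consensus_logp(predictions: Sequence[float]) -> float:
    """Average the logP values predicted by several methods."""
    return mean(predictions)


def additive_logp(atom_types: Iterable[int],
                  contributions: Dict[int, float],
                  default: float = 0.0) -> float:
    """Sum per-atom contributions, one lookup per atom-type code.

    Atom types missing from the table fall back to `default`.
    """
    return sum(contributions.get(code, default) for code in atom_types)


# Hypothetical 6-digit atom-type codes with made-up contributions.
contributions = {106100: 0.25, 108200: -0.5, 101000: 0.125}

# A molecule represented as the list of its atoms' type codes (assumed encoding).
molecule = [106100, 106100, 108200, 101000]

print(additive_logp(molecule, contributions))
print(consensus_logp([1.0, 2.0, 3.0]))
```

In a real setting the contribution table would be fitted against the averaged predictions rather than written by hand.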
Johannes Kirschnick, Philippe Thomas, Roland Roller, Leonhard Hennig
Recent years have seen strong growth in the biomedical sciences and an accompanying increase in publication volume. Extraction of specific information from these sources requires highly sophisticated text mining and information extraction tools. However, the integration of freely available tools into customized workflows is often cumbersome and difficult. We describe SIA (Scalable Interoperable Annotation Server), our contribution to the BeCalm-Technical interoperability and performance of annotation servers (BeCalm-TIPS) task, a scalable, extensible, and robust annotation service...
December 14, 2018: Journal of Cheminformatics
Hio Kuan Tai, Siti Azma Jusoh, Shirley W I Siu
BACKGROUND: Protein-ligand docking programs are routinely used in structure-based drug design to find the optimal binding pose of a ligand in the protein's active site. These programs are also used to identify potential drug candidates by ranking large sets of compounds. As more accurate and efficient docking programs are always desirable, constant efforts focus on developing better docking algorithms or improving the scoring function. Recently, chaotic maps have emerged as a promising approach to improve the search behavior of optimization algorithms in terms of search diversity and convergence speed...
December 14, 2018: Journal of Cheminformatics
Domenico Gadaleta, Anna Lombardo, Cosimo Toma, Emilio Benfenati
The quality of data used for QSAR model derivation is extremely important as it strongly affects the final robustness and predictive power of the model. Ambiguous or wrong structures need to be carefully checked, because they lead to errors in the calculation of descriptors and hence to meaningless results. The increasing amounts of data, however, have often made it hard to check very large databases manually. In the light of this, we designed and implemented a semi-automated workflow integrating structural data retrieval from several web-based databases, automated comparison of these data, chemical structure cleaning, selection and standardization of data into a consistent, ready-to-use format that can be employed for modeling...
December 10, 2018: Journal of Cheminformatics
Peter Corbett, John Boyle
Chemical named entity recognition (NER) has traditionally been dominated by conditional random fields (CRF)-based approaches, but given the success of the artificial neural network techniques known as "deep learning", we decided to examine them as an alternative to CRFs. We present here several chemical named entity recognition systems. The first system translates the traditional CRF-based idioms into a deep learning framework, using rich per-token features and neural word embeddings, and producing a sequence of tags using bidirectional long short term memory (LSTM) networks, a type of recurrent neural net...
December 6, 2018: Journal of Cheminformatics
Francisco M Couto, Andre Lamurias
Named-entity recognition aims at identifying the fragments of text that mention entities of interest, which could afterwards be linked to a knowledge base where those entities are described. This manuscript presents our minimal named-entity recognition and linking tool (MER), designed with flexibility, autonomy and efficiency in mind. To annotate a given text, MER only requires: (1) a lexicon (text file) with the list of terms representing the entities of interest; (2) optionally a tab-separated values file with a link for each term; and (3) a Unix shell...
December 5, 2018: Journal of Cheminformatics
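The lexicon-plus-links inputs listed in the abstract above suggest a simple dictionary lookup. The Python sketch below illustrates that general idea only; it is not MER's implementation (which the abstract says runs in a Unix shell), and the function name `annotate`, the sample terms, and the example URL are assumptions made for illustration.

```python
import re
from typing import Dict, List, Optional, Tuple

Annotation = Tuple[int, int, str, Optional[str]]


def annotate(text: str,
             lexicon: List[str],
             links: Optional[Dict[str, str]] = None) -> List[Annotation]:
    """Find case-insensitive whole-word mentions of lexicon terms.

    Returns (start, end, matched text, link or None), ordered by position.
    """
    hits: List[Annotation] = []
    for term in lexicon:
        pattern = r"\b" + re.escape(term) + r"\b"
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            link = links.get(term) if links else None
            hits.append((match.start(), match.end(), match.group(0), link))
    hits.sort(key=lambda hit: (hit[0], hit[1]))
    return hits


# Hypothetical lexicon and link table; in practice these would be read
# from a text file (one term per line) and a tab-separated values file.
lexicon = ["aspirin", "cox-1"]
links = {"aspirin": "https://example.org/aspirin"}

for hit in annotate("Aspirin inhibits COX-1; aspirin is cheap.", lexicon, links):
    print(hit)
```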
Jeremy R Ash, Jacqueline M Hughes-Oliver
The goal of chemmodlab is to streamline the fitting and assessment pipeline for many machine learning models in R, making it easy for researchers to compare the utility of these models. While focused on implementing methods for model fitting and assessment that have been accepted by experts in the cheminformatics field, all of the methods in chemmodlab have broad utility for the machine learning community. chemmodlab contains several assessment utilities, including a plotting function that constructs accumulation curves and a function that computes many performance measures...
November 28, 2018: Journal of Cheminformatics
Norberto Sánchez-Cruz, José L Medina-Franco
BACKGROUND: Simplified representation of compound databases has several applications in cheminformatics. Herein, we introduce an alternative and general method to build single fingerprint representations of compound databases. The approach is inspired by the previously published modal fingerprints, which aim to capture the most significant bits of a fingerprint representation for a compound data set. The novelty of the herein proposed statistical-based database fingerprint (SB-DFP) is that it is generated based on comparisons of binomial proportions, taking as reference the distribution of "1" bits on a large representative set of the chemical space...
November 22, 2018: Journal of Cheminformatics
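One way to picture a database fingerprint built from binomial proportion comparisons is sketched below in Python. The abstract does not state which test or threshold is used, so the one-sided z-test, the critical value 1.645, and the function name `database_fingerprint` are assumptions for illustration and not the published SB-DFP procedure.

```python
import math
from typing import List, Sequence


def database_fingerprint(fingerprints: Sequence[Sequence[int]],
                         reference_freq: Sequence[float],
                         z_crit: float = 1.645) -> List[int]:
    """Set a bit to 1 when it is "on" significantly more often in the
    database than in the reference set (one-sided z-test, assumed here)."""
    n = len(fingerprints)
    result: List[int] = []
    for j, p0 in enumerate(reference_freq):
        p_hat = sum(fp[j] for fp in fingerprints) / n
        se = math.sqrt(p0 * (1.0 - p0) / n)
        if se == 0.0:
            # Reference frequency is exactly 0 or 1: fall back to a direct comparison.
            result.append(1 if p_hat > p0 else 0)
        else:
            result.append(1 if (p_hat - p0) / se > z_crit else 0)
    return result


# Toy data: 20 three-bit fingerprints and made-up reference bit frequencies.
database = [[1, 0, 1]] * 15 + [[1, 0, 0]] * 5
reference = [0.5, 0.5, 0.7]

print(database_fingerprint(database, reference))
```

Only the first bit is set in the toy output: it is always on in the database but on half the time in the reference, while the third bit's 0.75 frequency is not significantly above the reference 0.7 at this sample size.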
Raghuram Srinivas, Pavel V Klimovich, Eric C Larson
Current ligand-based machine learning methods in virtual screening rely heavily on molecular fingerprinting for preprocessing, i.e., explicit description of ligands' structural and physicochemical properties in a vectorized form. Of particular importance to current methods are the extent to which molecular fingerprints describe a particular ligand and what metric sufficiently captures similarity among ligands. In this work, we propose and evaluate methods that do not require explicit feature vectorization through fingerprinting, but, instead, provide implicit descriptors based only on other known assays...
November 22, 2018: Journal of Cheminformatics