Read by QxMD icon Read


Morteza Hosseini, Diogo Pratas, Armando J Pinho
Summary: The ever-increasing growth of high-throughput sequencing technologies has led to a great acceleration of medical and biological research and discovery. As these platforms advance, the amount of information for diverse genomes increases at unprecedented rates. Confidentiality, integrity and authenticity of such genomic information should be ensured due to its extremely sensitive nature. In this paper, we propose Cryfa, a fast secure encryption tool for genomic data, namely in Fasta, Fastq, VCF, SAM and BAM formats, which is also capable of reducing the storage size of Fasta and Fastq files...
July 18, 2018: Bioinformatics
Justina Jankauskaite, Brian Jiménez-García, Justas Dapkunas, Juan Fernández-Recio, Iain H Moal
Motivation: Understanding the relationship between the sequence, structure, binding energy, binding kinetics and binding thermodynamics of protein-protein interactions is crucial to understanding cellular signaling, the assembly and regulation of molecular complexes, the mechanisms through which mutations lead to disease, and protein engineering. Results: We present SKEMPI 2.0, a major update to our database of binding free energy changes upon mutation for structurally resolved protein-protein interactions...
July 18, 2018: Bioinformatics
Shiquan Sun, Jiaqiang Zhu, Sahar Mozaffari, Carole Ober, Mengjie Chen, Xiang Zhou
Motivation: Genomic sequencing studies, including RNA sequencing and bisulfite sequencing studies, are becoming increasingly common and increasingly large. Large genomic sequencing studies open doors for accurate molecular trait heritability estimation and powerful differential analysis. Heritability estimation and differential analysis in sequencing studies require the development of statistical methods that can properly account for the count nature of the sequencing data and that are computationally efficient for large data sets...
July 18, 2018: Bioinformatics
Alberto Valdeolivas, Laurent Tichit, Claire Navarro, Sophie Perrin, Gaëlle Odelin, Nicolas Levy, Pierre Cau, Elisabeth Remy, Anaïs Baudot
Motivation: Recent years have witnessed an exponential growth in the number of identified interactions between biological molecules. These interactions are usually represented as large and complex networks, calling for the development of appropriate tools to exploit the functional information they contain. Random walk with restart is the state-of-the-art guilt-by-association approach. It explores the network vicinity of gene/protein seeds to study their functions, based on the premise that nodes related to similar functions tend to lie close to each other in the networks...
July 18, 2018: Bioinformatics
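As an illustration of the guilt-by-association idea in the abstract above, here is a minimal sketch of the classical random walk with restart on a single unweighted, undirected network. It is not the authors' implementation: the function name, the dictionary-based graph format, the restart probability of 0.7 and the convergence settings are all assumptions chosen for the example.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.7, tol=1e-10, max_iter=1000):
    """Score every node by its proximity to the seed nodes.

    adjacency: dict mapping each node to a list of its neighbours
               (hypothetical input format, assumed symmetric).
    seeds:     iterable of seed nodes, all present in `adjacency`.
    restart:   probability of jumping back to a seed at each step (assumed value).
    """
    seeds = set(seeds)
    nodes = list(adjacency)
    # Restart distribution: uniform over the seeds, zero elsewhere.
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        # Every node first receives its share of the restart mass.
        nxt = {n: restart * p0[n] for n in nodes}
        for n in nodes:
            # An isolated node keeps its own mass so that the total stays 1.
            targets = adjacency[n] or [n]
            share = (1.0 - restart) * p[n] / len(targets)
            for m in targets:
                nxt[m] += share
        delta = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if delta < tol:  # stop once the distribution no longer changes
            break
    return p


# Toy path network a - b - c - d with a single seed "a".
toy = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
scores = random_walk_with_restart(toy, seeds=["a"])
ranking = sorted(scores, key=scores.get, reverse=True)
```

On this toy network the scores fall off with distance from the seed, which is the premise the approach relies on: nodes close to the seeds in the network are ranked as more likely to share their function.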
Ben Langmead, Christopher Wilks, Valentin Antonescu, Rone Charles
Motivation: General-purpose processors can now contain many dozens of processor cores and support hundreds of simultaneous threads of execution. To make best use of these threads, genomics software must contend with new and subtle computer architecture issues. We discuss some of these and propose methods for improving thread scaling in tools that analyze each read independently, such as read aligners. Results: We implement these methods in new versions of Bowtie, Bowtie 2 and HISAT...
July 18, 2018: Bioinformatics
Ruben Sanchez-Garcia, C O S Sorzano, J M Carazo, Joan Segura
Motivation: Protein-Protein Interactions (PPI) are essential for most cellular processes and thus, unveiling how proteins interact is a crucial question that can be better understood by identifying which residues are responsible for the interaction. Computational approaches are orders of magnitude cheaper and faster than experimental ones, leading to a proliferation of methods aimed at predicting which residues belong to the interface of an interaction. Results: We present BIPSPI, a new machine learning-based method for the prediction of partner-specific protein-protein interaction sites...
July 18, 2018: Bioinformatics
Harun Mustafa, Ingo Schilken, Mikhail Karasikov, Carsten Eickhoff, Gunnar Rätsch, André Kahles
Motivation: Technological advancements in high-throughput DNA sequencing have led to an exponential growth of sequencing data being produced and stored as a byproduct of biomedical research. Despite its public availability, a majority of this data remains hard to query for the research community due to a lack of efficient data representation and indexing solutions. One of the available techniques to represent read data is a condensed form as an assembly graph. Such a representation contains all sequence information but does not store contextual information and metadata...
July 18, 2018: Bioinformatics
Jean-Michel Claverie, Thi Ngan Ta
Motivation: More than 20 years ago, our laboratory published an original statistical test (referred to as the Audic-Claverie (AC) test in the literature) to identify differentially expressed genes from the pairwise comparison of counts of "expressed sequence tags" determined in different conditions. Despite its antiquity and the publications of more sophisticated packages, this original publication continued to gather more than 200 citations per year, indicating the persistent usefulness of the simple AC test for the community...
July 18, 2018: Bioinformatics
Luca Parca, Bruno Ariano, Andrea Cabibbo, Marco Paoletti, Annalaura Tamburrini, Antonio Palmeri, Gabriele Ausiello, Manuela Helmer-Citterich
Motivation: Signaling and metabolic pathways are finely regulated by a network of protein phosphorylation events. Unraveling the nature of this intricate network, composed of kinases, target proteins and their interactions, is therefore of crucial importance. Although thousands of kinase-specific phosphorylations have been annotated in model organisms, their kinase-target network is far from complete, with less-studied organisms lagging behind. Results: In this work we achieved an automated and accurate identification of kinase domains, inferring the residues that most likely contribute to peptide specificity...
July 17, 2018: Bioinformatics
Caralyn Reisle, Karen L Mungall, Caleb Choo, Daniel Paulino, Dustin W Bleile, Amir Muhammadzadeh, Andrew J Mungall, Richard A Moore, Inna Shlafman, Robin Coope, Stephen Pleasance, Yussanne Ma, Steven J M Jones
Summary: Reliably identifying genomic rearrangements and interpreting their impact is a key step in understanding their role in human cancers and inherited genetic diseases. Many short read algorithmic approaches exist but all have appreciable false negative rates. A common approach is to evaluate the union of multiple tools increasing sensitivity, followed by filtering to retain specificity. Here we describe an application framework for the rapid generation of structural variant consensus, unique in its ability to visualize the genetic impact and context as well as process both genome and transcriptome data...
July 17, 2018: Bioinformatics
Hadrien Gourlé, Oskar Karlsson-Lindsjö, Juliette Hayer, Erik Bongcam-Rudloff
Motivation: The accurate in-silico simulation of metagenomic datasets is of great importance for benchmarking bioinformatics tools as well as for experimental design. Users depend on large-scale simulation not only to design experiments and new projects but also to estimate computational needs accurately within a project. Unfortunately, most current read simulators are either not suited for metagenomics, out of date or relatively poorly documented. In this article, we describe InSilicoSeq, a software package to simulate metagenomic Illumina sequencing data...
July 17, 2018: Bioinformatics
Patrick K Kimes, Alejandro Reyes
Summary: Benchmark studies are widely used to compare and evaluate tools developed for answering various biological questions. Despite the popularity of these comparisons, the implementation is often ad hoc, with little consistency across studies. To address this problem, we developed SummarizedBenchmark, an R package and framework for organizing and structuring benchmark comparisons. SummarizedBenchmark defines a general grammar for benchmarking and allows for easier setup and execution of benchmark comparisons, while improving the reproducibility and replicability of such comparisons...
July 17, 2018: Bioinformatics
Marcin Kowiel, Dariusz Brzezinski, Przemyslaw J Porebski, Ivan G Shabalin, Mariusz Jaskolski, Wladek Minor
Motivation: The correct identification of ligands in crystal structures of protein complexes is the cornerstone of structure-guided drug design. However, cognitive bias can sometimes mislead investigators into modeling fictitious compounds without solid support from the electron density maps. Ligand identification can be aided by automatic methods, but existing approaches are based on time-consuming iterative fitting. Results: Here we report a new machine learning algorithm called CheckMyBlob that identifies ligands from experimental electron density maps...
July 17, 2018: Bioinformatics
Emmanuel Paradis, Klaus Schliep
Summary: After more than fifteen years of existence, the R package ape has continuously grown in content and has been used by a growing community of users. The release of version 5.0 has marked a leap towards modern software for evolutionary analyses. Efforts have been made to improve efficiency, flexibility, support for 'big data' (R's long vectors), ease of use, and quality checks before a new release. These changes will hopefully make ape a useful piece of software for the study of biodiversity and evolution in a context of increasing data quantity...
July 17, 2018: Bioinformatics
Shijia Zhu, Tongqi Qian, Yujin Hoshida, Yuan Shen, Jing Yu, Ke Hao
GIGSEA is implemented in R, and freely available at
July 13, 2018: Bioinformatics
Jamil Najafov, Ayaz Najafov
Motivation: Large-scale gene expression analysis is a valuable asset for data-driven hypothesis generation. However, the convoluted nature of large expression datasets often hinders extraction of meaningful biological information. Results: To this end, we developed GECO, a gene expression correlation analysis software that uses a genetic algorithm-driven approach to deconvolute complex expression datasets into two subpopulations that display positive and negative correlations between a pair of queried genes...
July 13, 2018: Bioinformatics
Rostam M Razban, Amy I Gilson, Niamh Durfee, Hendrik Strobelt, Kasper Dinkla, Jeong-Mo Choi, Hanspeter Pfister, Eugene I Shakhnovich
No abstract text is available yet for this article.
July 13, 2018: Bioinformatics
Renmin Han, Xiaohua Wan, Lun Li, Albert Lawrence, Peng Yang, Yu Li, Sheng Wang, Fei Sun, Zhiyong Liu, Xin Gao, Fa Zhang
Motivation: Dual-axis electron tomography is an important 3D macro-molecular structure reconstruction technology, which can reduce artifacts and suppress the effect of missing wedge. However, the fully automatic data process for dual-axis electron tomography still remains a challenge due to three difficulties: (i) how to track the mass of fiducial markers automatically; (ii) how to integrate the information from the two different tilt series; and (iii) how to cope with the inconsistency between the two different tilt series...
July 13, 2018: Bioinformatics
Rasmus Henningsson, Magnus Fontes
Motivation: High-throughput biomedical measurements normally capture multiple overlaid biologically relevant signals and often also signals representing different types of technical artefacts, such as batch effects. Signal identification and decomposition are accordingly main objectives in statistical biomedical modeling and data analysis. Existing methods, aimed at signal reconstruction and deconvolution, in general, are either supervised, contain parameters that need to be estimated or present other types of ad hoc features...
July 13, 2018: Bioinformatics
Yunan Luo, Yun William Yu, Jianyang Zeng, Bonnie Berger, Jian Peng
Motivation: Vastly greater quantities of microbial genome data are being generated where environmental samples mix together the DNA from many different species. Here, we present Opal for metagenomic binning, the task of identifying the origin species of DNA sequencing reads. We introduce 'low-density' locality sensitive hashing to bioinformatics, with the addition of Gallager codes for even coverage, enabling quick and accurate metagenomic binning. Results: On public benchmarks, Opal halves the error on precision/recall (F1-score) as compared to both alignment-based and alignment-free methods for species classification...
July 13, 2018: Bioinformatics