Bioinformatics python

https://www.readbyqxmd.com/read/29329398/lenup-learning-nucleosome-positioning-from-dna-sequences-with-improved-convolutional-neural-networks
#1
Juhua Zhang, Wenbo Peng, Lei Wang
Motivation: Nucleosome positioning plays significant roles in proper genome packing and its accessibility to execute transcription regulation. Despite a multitude of nucleosome positioning resources available online, including experimental datasets of genome-wide nucleosome occupancy profiles and computational tools for the analysis of these data, the complex language of eukaryotic nucleosome positioning remains incompletely understood. Results: Here, we address this challenge using an approach based on a state-of-the-art machine learning method...
January 10, 2018: Bioinformatics
https://www.readbyqxmd.com/read/29300846/chopstitch-exon-annotation-and-splice-graph-construction-using-transcriptome-assembly-and-whole-genome-sequencing-data
#2
Hamza Khan, Hamid Mohamadi, Benjamin P Vandervalk, Rene L Warren, Justin Chu, Inanc Birol
Motivation: Sequencing studies on non-model organisms often interrogate both genomes and transcriptomes with massive amounts of short sequences. Such studies require de novo analysis tools and techniques, when the species and closely related species lack high quality reference resources. For certain applications such as de novo annotation, information on putative exons and alternative splicing may be desirable. Results: Here we present ChopStitch, a new method for finding putative exons de novo and constructing splice graphs using an assembled transcriptome and whole genome shotgun sequencing (WGSS) data...
December 29, 2017: Bioinformatics
https://www.readbyqxmd.com/read/29228182/peith-%C3%AE-perfecting-experiments-with-information-theory-in-python-with-gpu-support
#3
Leander Dony, Jonas Mackerodt, Scott Ward, Sarah Filippi, Michael P H Stumpf, Juliane Liepe
Motivation: Different experiments provide differing levels of information about a biological system. This makes it difficult, a priori, to select one of them beyond mere speculation and/or belief, especially when resources are limited. With the increasing diversity of experimental approaches and general advances in quantitative systems biology, methods that inform us about the information content that a given experiment carries about the question we want to answer become crucial. Results: PEITH(Θ) is a general-purpose Python framework for experimental design in systems biology...
December 7, 2017: Bioinformatics
https://www.readbyqxmd.com/read/29212440/constax-a-tool-for-improved-taxonomic-resolution-of-environmental-fungal-its-sequences
#4
Kristi Gdanetz, Gian Maria Niccolò Benucci, Natalie Vande Pol, Gregory Bonito
BACKGROUND: One of the most crucial steps in high-throughput sequence-based microbiome studies is the taxonomic assignment of sequences belonging to operational taxonomic units (OTUs). Without taxonomic classification, functional and biological information of microbial communities cannot be inferred or interpreted. The internal transcribed spacer (ITS) region of the ribosomal DNA is the conventional marker region for fungal community studies. While bioinformatics pipelines that cluster reads into OTUs have received much attention in the literature, less attention has been given to the taxonomic classification of these sequences, upon which biological inference is dependent...
December 6, 2017: BMC Bioinformatics
https://www.readbyqxmd.com/read/29187143/dmrfinder-efficiently-identifying-differentially-methylated-regions-from-methylc-seq-data
#5
John M Gaspar, Ronald P Hart
BACKGROUND: DNA methylation is an epigenetic modification that is studied at a single-base resolution with bisulfite treatment followed by high-throughput sequencing. After alignment of the sequence reads to a reference genome, methylation counts are analyzed to determine genomic regions that are differentially methylated between two or more biological conditions. Even though a variety of software packages is available for different aspects of the bioinformatics analysis, they often produce results that are biased or have excessive computational requirements...
November 29, 2017: BMC Bioinformatics
https://www.readbyqxmd.com/read/29186333/fun-a-framework-for-interactive-visualizations-of-large-high-dimensional-datasets-on-the-web
#6
Daniel Probst, Jean-Louis Reymond
Motivation: During the past decade, big data has become a major tool in scientific endeavors. While statistical methods and algorithms are well-suited for analyzing and summarizing enormous amounts of data, the results do not allow for a visual inspection of the entire data. Current scientific software, including R packages and Python libraries such as ggplot2, matplotlib, and plot.ly, does not support interactive visualizations of datasets exceeding 100,000 data points on the web. Other solutions enable the web-based visualization of big data only through data reduction or statistical representations...
November 24, 2017: Bioinformatics
https://www.readbyqxmd.com/read/29158538/pinapl-py-a-comprehensive-web-application-for-the-analysis-of-crispr-cas9-screens
#7
Philipp N Spahn, Tyler Bath, Ryan J Weiss, Jihoon Kim, Jeffrey D Esko, Nathan E Lewis, Olivier Harismendy
Large-scale genetic screens using CRISPR/Cas9 technology have emerged as a major tool for functional genomics. With its increased popularity, experimental biologists frequently acquire large sequencing datasets for which they often do not have an easy analysis option. While a few bioinformatic tools have been developed for this purpose, their utility is still hindered either by limited functionality or by the requirement of bioinformatic expertise. To make sequencing data analysis of CRISPR/Cas9 screens more accessible to a wide range of scientists, we developed a Platform-independent Analysis of Pooled Screens using Python (PinAPL-Py), which is operated as an intuitive web-service...
November 20, 2017: Scientific Reports
https://www.readbyqxmd.com/read/29155928/modeling-positional-effects-of-regulatory-sequences-with-spline-transformations-increases-prediction-accuracy-of-deep-neural-networks
#8
Žiga Avsec, Mohammadamin Barekatain, Jun Cheng, Julien Gagneur
Motivation: Regulatory sequences are not solely defined by their nucleic acid sequence but also by their relative distances to genomic landmarks such as transcription start site, exon boundaries, or polyadenylation site. Deep learning has become the approach of choice for modeling regulatory sequences because of its strength to learn complex sequence features. However, modeling relative distances to genomic landmarks in deep neural networks has not been addressed. Results: Here we developed spline transformation, a neural network module based on splines to flexibly and robustly model distances...
November 16, 2017: Bioinformatics
https://www.readbyqxmd.com/read/29126132/cost-effective-and-accurate-method-of-measuring-fetal-fraction-using-snp-imputation
#9
Minjeong Kim, Jai-Hoon Kim, Kangseok Kim, Sunshin Kim
Motivation: With the discovery of cell-free fetal DNA in maternal blood, the demand for non-invasive prenatal testing (NIPT) has been increasing. To obtain reliable NIPT results, it is important to accurately estimate the fetal fraction. In this study, we propose an accurate and cost-effective method for measuring fetal fractions using single-nucleotide polymorphisms (SNPs). Results: A total of 84 samples were sequenced via semiconductor sequencing using a 0.3x sequencing coverage...
November 8, 2017: Bioinformatics
https://www.readbyqxmd.com/read/29121165/perf-an-exhaustive-algorithm-for-ultra-fast-and-efficient-identification-of-microsatellites-from-large-dna-sequences
#10
Akshay Kumar Avvaru, Divya Tej Sowpati, Rakesh Kumar Mishra
Motivation: Microsatellites or Simple Sequence Repeats (SSRs) are short tandem repeats of DNA motifs present in all genomes. They have long been used for a variety of purposes in the areas of population genetics, genotyping, marker-assisted selection, and forensics. Numerous studies have highlighted their functional roles in genome organization and gene regulation. Though several tools are currently available to identify SSRs from genomic sequences, they have significant limitations. Results: We present a novel algorithm called PERF for extremely fast and comprehensive identification of microsatellites from DNA sequences of any size...
November 7, 2017: Bioinformatics
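As a rough illustration of what identifying perfect microsatellites involves, here is a minimal Python sketch based on a regular expression with a backreference. It is not the PERF algorithm; the function name `find_ssrs`, its parameters, and the default thresholds are assumptions made for this example only.

```python
import re


def find_ssrs(seq, min_motif=1, max_motif=6, min_repeats=3):
    """Return (start, end, motif, copies) for perfect tandem repeats.

    Naive illustrative approach: a lazy motif group followed by a
    backreference repeated at least (min_repeats - 1) more times.
    Coordinates are 0-based, end-exclusive.
    """
    seq = seq.upper()
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_motif, max_motif, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq):
        motif = match.group(1)
        copies = len(match.group(0)) // len(motif)
        hits.append((match.start(), match.end(), motif, copies))
    return hits


if __name__ == "__main__":
    # Toy sequence containing four copies of the dinucleotide motif "AC".
    print(find_ssrs("TTGACACACACGG"))
```

The lazy quantifier makes the regex try the shortest motif first, so a run such as `ACACACAC` is reported with motif `AC` rather than `ACAC`. A scan like this is only practical for short sequences and does not handle imperfect or compound repeats.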
https://www.readbyqxmd.com/read/29106469/dfast-a-flexible-prokaryotic-genome-annotation-pipeline-for-faster-genome-publication
#11
Yasuhiro Tanizawa, Takatomo Fujisawa, Yasukazu Nakamura
Summary: We developed a prokaryotic genome annotation pipeline, DFAST, that also supports genome submission to public sequence databases. DFAST was originally started as an on-line annotation server, and to date, over 7,000 jobs have been processed since its first launch in 2016. Here, we present a newly implemented background annotation engine for DFAST, which is also available as a standalone command-line program. The new engine can annotate a typical-sized bacterial genome within 10 minutes, with rich information such as pseudogenes, translation exceptions, and orthologous gene assignment between given reference genomes...
November 2, 2017: Bioinformatics
https://www.readbyqxmd.com/read/29106441/orchid-a-novel-management-annotation-and-machine-learning-framework-for-analyzing-cancer-mutations
#12
Clinton L Cario, John S Witte
Motivation: As whole-genome tumor sequence and biological annotation datasets grow in size, number, and content, there is an increasing basic science and clinical need for efficient and accurate data management and analysis software. With the emergence of increasingly sophisticated data stores, execution environments, and machine learning algorithms, there is also a need for the integration of functionality across frameworks. Results: We present orchid, a Python-based software package for the management, annotation, and machine learning of cancer mutations...
November 2, 2017: Bioinformatics
https://www.readbyqxmd.com/read/29077810/isescan-automated-identification-of-insertion-sequence-elements-in-prokaryotic-genomes
#13
Zhiqun Xie, Haixu Tang
Motivation: The insertion sequence (IS) elements are the smallest but most abundant autonomous transposable elements in prokaryotic genomes, which play a key role in prokaryotic genome organization and evolution. With the fast-growing volume of genomic data, it is becoming increasingly critical for biology researchers to be able to accurately and automatically annotate ISs in prokaryotic genome sequences. The available automatic IS annotation systems either provide only incomplete IS annotation or rely on the availability of existing genome annotations...
November 1, 2017: Bioinformatics
https://www.readbyqxmd.com/read/29048524/krait-an-ultrafast-tool-for-genome-wide-survey-of-microsatellites-and-primer-design
#14
Lianming Du, Chi Zhang, Qin Liu, Xiuyue Zhang, Bisong Yue
Summary: Microsatellites are associated with various diseases and are widely used in population genetics as genetic markers. However, it remains a challenge to identify microsatellites in large genomes and to screen microsatellites for primer design from a huge result dataset. Here, we present Krait, a robust and flexible tool for fast investigation of microsatellites in DNA sequences. Krait is designed to identify all types of perfect or imperfect microsatellites on a whole genomic sequence, and is also applicable to identification of compound microsatellites...
October 18, 2017: Bioinformatics
https://www.readbyqxmd.com/read/29048466/pybel-a-computational-framework-for-biological-expression-language
#15
Charles Tapley Hoyt, Andrej Konotopez, Christian Ebeling
Summary: Biological Expression Language (BEL) assembles knowledge networks from biological relations across multiple modes and scales. Here, we present PyBEL, a software package for parsing, validating, converting, storing, querying, and visualizing networks encoded in BEL. Availability: PyBEL is implemented in platform-independent, universal Python code. Its source is distributed under the Apache 2.0 License at https://github.com/pybel. Contact: charles...
October 18, 2017: Bioinformatics
https://www.readbyqxmd.com/read/29036584/cgheliparm-analysis-of-dsdna-helical-parameters-for-coarse-grained-martini-molecular-dynamics-simulations
#16
Ignacio Faustino, S J Marrink
Summary: We introduce cgHeliParm, a Python program that provides the conformational analysis of Martini-based coarse-grained double-stranded DNA molecules. The software calculates helical parameters such as base, base pair and base pair step parameters. cgHeliParm can be used for the analysis of coarse-grained Martini molecular dynamics trajectories without transformation into atomistic models. Availability and implementation: This package works with Python 2.7 on MacOS and Linux...
July 13, 2017: Bioinformatics
https://www.readbyqxmd.com/read/29036558/methrafo-medip-seq-methylation-estimate-using-a-random-forest-regressor
#17
Jun Ding, Ziv Bar-Joseph
Motivation: Profiling of genome-wide DNA methylation is now routinely performed when studying development, cancer and several other biological processes. Although whole-genome bisulfite sequencing provides high-quality methylation measurements at the resolution of nucleotides, it is relatively costly and so several studies have used alternative methods for such profiling. One of the most widely used low-cost alternatives is MeDIP-Seq. However, MeDIP-Seq is biased for CpG-enriched regions and thus its results need to be corrected in order to determine accurate methylation levels...
November 1, 2017: Bioinformatics
https://www.readbyqxmd.com/read/29036557/standardizing-biomass-reactions-and-ensuring-complete-mass-balance-in-genome-scale-metabolic-models
#18
Siu H J Chan, Jingyi Cai, Lin Wang, Margaret N Simons-Senftle, Costas D Maranas
Motivation: In a genome-scale metabolic model, the biomass produced is defined to have a molecular weight (MW) of 1 g mmol⁻¹. This is critical for correctly predicting growth yields, contrasting multiple models and more importantly modeling microbial communities. However, the standard is rarely verified in the current practice and the chemical formulae of biomass components such as proteins, nucleic acids and lipids are often represented by undefined side groups (e.g. X, R). Results: We introduced a systematic procedure for checking the biomass weight and ensuring complete mass balance of a model...
November 15, 2017: Bioinformatics
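The 1 g per mmol convention mentioned in this abstract can be checked with simple arithmetic once every biomass component has a fully defined formula. The Python sketch below is a simplified illustration, not the authors' procedure: the function name `biomass_weight`, the sign convention (negative coefficients for consumed components, positive for released by-products), and the toy numbers are all assumptions.

```python
import math


def biomass_weight(coefficients, molecular_weights):
    """Return the implied biomass molecular weight in g/mmol.

    coefficients: mmol of each metabolite per mmol of biomass formed
                  (negative = consumed, positive = released).
    molecular_weights: g/mol for each metabolite.
    """
    net_mg = sum(coefficients[m] * molecular_weights[m] for m in coefficients)
    # Mass consumed minus mass released ends up in biomass; mg -> g.
    return -net_mg / 1000.0


if __name__ == "__main__":
    # Hypothetical components and values, for illustration only.
    coeffs = {"protein_like": -5.0, "lipid_like": -0.5, "byproduct": 2.0}
    weights = {"protein_like": 100.0, "lipid_like": 800.0, "byproduct": 50.0}
    mw = biomass_weight(coeffs, weights)
    print(mw, "g/mmol; meets 1 g/mmol standard:", math.isclose(mw, 1.0))
```

In this toy case the reaction implies 0.8 g per mmol, so it would not meet the standard and the coefficients would need rescaling.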
https://www.readbyqxmd.com/read/29036452/a-nonparametric-significance-test-for-sampled-networks
#19
Andrew Elliott, Elizabeth Leicht, Alan Whitmore, Gesine Reinert, Felix Reed-Tsochas
Motivation: Our work is motivated by an interest in constructing a protein-protein interaction network that captures key features associated with Parkinson's disease. While there is an abundance of subnetwork construction methods available, it is often far from obvious which subnetwork is the most suitable starting point for further investigation. Results: We provide a method to assess whether a subnetwork constructed from a seed list (a list of nodes known to be important in the area of interest) differs significantly from a randomly generated subnetwork...
July 7, 2017: Bioinformatics
https://www.readbyqxmd.com/read/29036291/phylotyper-in-silico-predictor-of-gene-subtypes
#20
Matthew D Whiteside, Victor P J Gannon, Chad R Laing
Summary: Whole genome sequencing (WGS) is being adopted in public health for improved surveillance and outbreak analysis. In public health, subtyping has been used to infer phenotypes and distinguish bacterial strain groups. In silico tools that predict subtypes from sequence data are needed to transition historical data to WGS-based protocols. Phylotyper is a novel solution for in silico subtype prediction from gene sequences. Designed for incorporation into WGS pipelines, it is a general prediction tool that can be applied to different subtype schemes...
November 15, 2017: Bioinformatics