Read by QxMD icon Read

Current Protocols in Bioinformatics

Jianguo Xia, David S Wishart
MetaboAnalyst is a comprehensive Web application for metabolomic data analysis and interpretation. MetaboAnalyst handles most of the common metabolomic data types from most kinds of metabolomics platforms (MS and NMR) for most kinds of metabolomics experiments (targeted, untargeted, quantitative). In addition to providing a variety of data processing and normalization procedures, MetaboAnalyst also supports a number of data analysis and data visualization tasks using a range of univariate and multivariate methods such as PCA (principal component analysis), PLS-DA (partial least squares discriminant analysis), heatmap clustering and machine learning methods...
2016: Current Protocols in Bioinformatics
Mark E Adamo, Scott A Gerber
MS/MS database search algorithms derive a set of candidate peptide sequences from in silico digest of a protein sequence database, and compute theoretical fragmentation patterns to match these candidates against observed MS/MS spectra. The original Tempest publication described these operations mapped to a CPU-GPU model, in which the CPU (central processing unit) generates peptide candidates that are asynchronously sent to a discrete GPU (graphics processing unit) to be scored against experimental spectra in parallel...
2016: Current Protocols in Bioinformatics
Alisha Parveen, Norbert Gretz, Harsh Dweep
miRWalk2.0 is a freely accessible, regularly updated comprehensive archive supplying the largest available collection of predicted and experimentally verified miRNA-target interactions, with various novel and unique features to assist the scientific community. Approximately 949 million interactions between 11,748 miRNAs, 308,700 genes, and 68,460 lncRNAs are documented in miRWalk2.0 with 5,146,217 different kinds of identifiers to offer a one-stop site to collect an abundance of information...
2016: Current Protocols in Bioinformatics
Maria D Paraskevopoulou, Ioannis S Vlachos, Artemis G Hatzigeorgiou
microRNAs (miRNAs) are short non-coding RNAs (∼22 nts) present in animals, plants, and viruses. They are considered central post-transcriptional regulators of gene expression and are key components in a great number of physiological and pathological conditions. The accurate characterization of their targets is considered essential to a series of applications and basic or applied research settings. DIANA-TarBase was initially launched in 2006. It is a reference repository indexing experimentally derived miRNA-gene interactions in different cell types, tissues, and conditions across numerous species...
2016: Current Protocols in Bioinformatics
Luigi Di Costanzo, Sutapa Ghosh, Christine Zardecki, Stephen K Burley
The Protein Data Bank (PDB) archive is the worldwide repository of experimentally determined three-dimensional structures of large biological molecules found in all three kingdoms of life. Atomic-level structures of these proteins, nucleic acids, and complex assemblies thereof are central to research and education in molecular, cellular, and organismal biology, biochemistry, biophysics, materials science, bioengineering, ecology, and medicine. Several types of information are associated with each PDB archival entry, including atomic coordinates, primary experimental data, polymer sequence(s), and summary metadata...
2016: Current Protocols in Bioinformatics
Joseph P Bielawski, Jennifer L Baker, Joseph Mingrone
This unit provides protocols for using the CODEML program from the PAML package to make inferences about episodic natural selection in protein-coding sequences. The protocols cover inference tasks such as maximum likelihood estimation of selection intensity, testing the hypothesis of episodic positive selection, and identifying sites with a history of episodic evolution. We provide protocols for using the rich set of models implemented in CODEML to assess robustness, and for using bootstrapping to assess if the requirements for reliable statistical inference have been met...
2016: Current Protocols in Bioinformatics
Benjamin Webb, Andrej Sali
Comparative protein structure modeling predicts the three-dimensional structure of a given protein sequence (target) based primarily on its alignment to one or more proteins of known structure (templates). The prediction process consists of fold assignment, target-template alignment, model building, and model evaluation. This unit describes how to calculate comparative models using the program MODELLER and how to use the ModBase database of such models, and discusses all four steps of comparative modeling, frequently observed errors, and some applications...
2016: Current Protocols in Bioinformatics
David S Wishart, Anthony Wu
DrugBank is a fully curated drug and drug target database that contains 8174 drug entries including 1944 FDA-approved small-molecule drugs, 198 FDA-approved biotech (protein/peptide) drugs, 93 nutraceuticals, and over 6000 experimental drugs. Additionally, 4300 non-redundant protein (i.e., drug target/enzyme/transporter/carrier) sequences are linked to these drug entries. DrugBank is primarily focused on providing both the query/search tools and biophysical data needed to facilitate drug discovery and drug development...
2016: Current Protocols in Bioinformatics
Lars Barquist, Sarah W Burge, Paul P Gardner
Emerging high-throughput technologies have led to a deluge of putative non-coding RNA (ncRNA) sequences identified in a wide variety of organisms. Systematic characterization of these transcripts will be a tremendous challenge. Homology detection is critical to making maximal use of functional information gathered about ncRNAs: identifying homologous sequence allows us to transfer information gathered in one organism to another quickly and with a high degree of confidence. ncRNA presents a challenge for homology detection, as the primary sequence is often poorly conserved and de novo secondary structure prediction and search remain difficult...
2016: Current Protocols in Bioinformatics
Gil Stelzer, Naomi Rosen, Inbar Plaschkes, Shahar Zimmerman, Michal Twik, Simon Fishilevich, Tsippi Iny Stein, Ron Nudel, Iris Lieder, Yaron Mazor, Sergey Kaplan, Dvir Dahary, David Warshawsky, Yaron Guan-Golan, Asher Kohn, Noa Rappaport, Marilyn Safran, Doron Lancet
GeneCards, the human gene compendium, enables researchers to effectively navigate and inter-relate the wide universe of human genes, diseases, variants, proteins, cells, and biological pathways. Our recently launched Version 4 has a revamped infrastructure facilitating faster data updates, better-targeted data queries, and friendlier user experience. It also provides a stronger foundation for the GeneCards suite of companion databases and analysis tools. Improved data unification includes gene-disease links via MalaCards and merged biological pathways via PathCards, as well as drug information and proteome expression...
2016: Current Protocols in Bioinformatics
William R Pearson
The FASTA programs provide a comprehensive set of rapid similarity searching tools (fasta36, fastx36, tfastx36, fasty36, tfasty36), similar to those provided by the BLAST package, as well as programs for slower, optimal, local, and global similarity searches (ssearch36, ggsearch36), and for searching with short peptides and oligonucleotides (fasts36, fastm36). The FASTA programs use an empirical strategy for estimating statistical significance that accommodates a range of similarity scoring matrices and gap penalties, improving alignment boundary accuracy and search sensitivity...
2016: Current Protocols in Bioinformatics
Namrata S Kale, Kenneth Haug, Pablo Conesa, Kalaivani Jayseelan, Pablo Moreno, Philippe Rocca-Serra, Venkata Chandrasekhar Nainala, Rachel A Spicer, Mark Williams, Xuefei Li, Reza M Salek, Julian L Griffin, Christoph Steinbeck
MetaboLights is the first general purpose, open-access database repository for cross-platform and cross-species metabolomics research at the European Bioinformatics Institute (EMBL-EBI). Based upon the open-source ISA framework, MetaboLights provides Metabolomics Standard Initiative (MSI) compliant metadata and raw experimental data associated with metabolomics experiments. Users can upload their study datasets into the MetaboLights Repository. These studies are then automatically assigned a stable and unique identifier (e...
2016: Current Protocols in Bioinformatics
David S Wishart
Cheminformatics is a field of information technology that focuses on the collection, storage, analysis, and manipulation of chemical data. The chemical data of interest typically include information on small molecule formulas, structures, properties, spectra, and activities (biological or industrial). Cheminformatics originally emerged as a vehicle to help the drug discovery and development process; however, it now plays an increasingly important role in many areas of biology, chemistry, and biochemistry...
2016: Current Protocols in Bioinformatics
Mathieu Lavallée-Adam, John R Yates
PSEA-Quant analyzes quantitative mass spectrometry-based proteomics datasets to identify enrichments of annotations contained in repositories such as the Gene Ontology and Molecular Signature databases. It allows users to identify the annotations that are significantly enriched for reproducibly quantified high abundance proteins. PSEA-Quant is available on the Web and as a command-line tool. It is compatible with all label-free and isotopic labeling-based quantitative proteomics methods. This protocol describes how to use PSEA-Quant and interpret its output...
2016: Current Protocols in Bioinformatics
Sangya Pundir, Maria J Martin, Claire O'Donovan
The Universal Protein Resource (UniProt) is a comprehensive resource for protein sequence and annotation data (UniProt Consortium, 2015). The UniProt Web site receives ∼400,000 unique visitors per month and is the primary means to access UniProt. Along with various datasets that you can search, UniProt provides three main tools. These are the 'BLAST' tool for sequence similarity searching, the 'Align' tool for multiple sequence alignment, and the 'Retrieve/ID Mapping' tool for using a list of identifiers to retrieve UniProtKB proteins and to convert database identifiers from UniProt to external databases or vice versa...
2016: Current Protocols in Bioinformatics
Jianyi Yang, Yang Zhang
I-TASSER is a hierarchical protocol for automated protein structure prediction and structure-based function annotation. Starting from the amino acid sequence of target proteins, I-TASSER first generates full-length atomic structural models from multiple threading alignments and iterative structural assembly simulations followed by atomic-level structure refinement. The biological functions of the protein, including ligand-binding sites, enzyme commission number, and gene ontology terms, are then inferred from known protein function databases based on sequence and structure profile comparisons...
2015: Current Protocols in Bioinformatics
Michael S Campbell, Mark Yandell
Genome projects have evolved from large international undertakings to tractable endeavors for a single lab. Accurate genome annotation is critical for successful genomic, genetic, and molecular biology experiments. These annotations can be generated using a number of approaches and available software tools. This unit describes methods for genome annotation and a number of software tools commonly used in gene annotation.
2015: Current Protocols in Bioinformatics
Annelien Verfaillie, Hana Imrichova, Rekins Janky, Stein Aerts
Gene expression profiling is often used to identify genes that are co-expressed in a biological process or disease. Downstream analyses of co-expressed gene sets using bioinformatics methods can reveal candidate transcription factors (TF) that co-regulate these genes, based on the presence of shared TF binding sites. Drawing gene regulatory networks that connect TFs to their predicted target genes can uncover gene modules that implement a particular function. Here, we describe several protocols to analyze any set of co-expressed genes using iRegulon and i-cisTarget...
2015: Current Protocols in Bioinformatics
Andy Menzies, Jon W Teague, Adam P Butler, Helen Davies, Patrick Tarpey, Serena Nik-Zainal, Peter J Campbell
VAGrENT is a tool that provides biological context and effect prediction for genomic sequence variants. It annotates single base substitutions and small insertions and deletions by comparing them to reference information within or close to genes or other transcribed elements. This information provides the critical insight required to inform the biological or clinical significance of variant data generated from sequencing studies. The software has been optimized to run efficiently against the large numbers and diverse classes of variants that are typically generated from next generation sequencing technologies...
2015: Current Protocols in Bioinformatics
Keiran M Raine, Jonathan Hinton, Adam P Butler, Jon W Teague, Helen Davies, Patrick Tarpey, Serena Nik-Zainal, Peter J Campbell
cgpPindel is a modified version of Pindel that is optimized for detecting somatic insertions and deletions (indels) in cancer genomes and other samples compared to a reference control. Post-hoc filters remove false positive calls, resulting in a high-quality dataset for downstream analysis. This unit provides concise instructions for both a simple 'one-shot' execution of cgpPindel and a more detailed approach suitable for large-scale compute farms.
2015: Current Protocols in Bioinformatics