Read by QxMD icon Read

Whole genome alignment

Peter A Larsen, R Alan Harris, Yue Liu, Shwetha C Murali, C Ryan Campbell, Adam D Brown, Beth A Sullivan, Jennifer Shelton, Susan J Brown, Muthuswamy Raveendran, Olga Dudchenko, Ido Machol, Neva C Durand, Muhammad S Shamim, Erez Lieberman Aiden, Donna M Muzny, Richard A Gibbs, Anne D Yoder, Jeffrey Rogers, Kim C Worley
BACKGROUND: The de novo assembly of repeat-rich mammalian genomes using only high-throughput short read sequencing data typically results in highly fragmented genome assemblies that limit downstream applications. Here, we present an iterative approach to hybrid de novo genome assembly that incorporates datasets stemming from multiple genomic technologies and methods. We used this approach to improve the gray mouse lemur (Microcebus murinus) genome from early draft status to a near chromosome-scale assembly...
November 16, 2017: BMC Biology
Christopher A Saski, Brian E Scheffler, Amanda M Hulse-Kemp, Bo Liu, Qingxin Song, Atsumi Ando, David M Stelly, Jodi A Scheffler, Jane Grimwood, Don C Jones, Daniel G Peterson, Jeremy Schmutz, Z Jeffery Chen
Like those of many agricultural crops, the cultivated cotton is an allotetraploid and has a large genome (~2.5 gigabase pairs). The two sub genomes, A and D, are highly similar but unequally sized and repeat-rich, which pose significant challenges for accurate genome reconstruction using standard approaches. Here we report the development of BAC libraries, sub genome specific physical maps, and a new-generation sequencing approach that will lead to a reference-grade genome assembly for Upland cotton. Three BAC libraries were constructed, fingerprinted, and integrated with BAC-end sequences (BES) to produce a de novo whole-genome physical map...
November 10, 2017: Scientific Reports
Daria A Andreyushkova, Alexey I Makunin, Violetta R Beklemisheva, Svetlana A Romanenko, Anna S Druzhkova, Larisa B Biltueva, Natalya A Serdyukova, Alexander S Graphodatsky, Vladimir A Trifonov
Several whole genome duplication (WGD) events followed by rediploidization took place in the evolutionary history of vertebrates. Acipenserids represent a convenient model group for investigation of the consequences of WGD as their representatives underwent additional WGD events in different lineages resulting in ploidy level variation between species, and these processes are still ongoing. Earlier, we obtained a set of sterlet (Acipenser ruthenus) chromosome-specific libraries by microdissection and revealed that they painted two or four pairs of whole sterlet chromosomes, as well as additional chromosomal regions, depending on rediploidization status and chromosomal rearrangements after genome duplication...
November 10, 2017: Genes
Bing Feng, Yu Lin, Lingxi Zhou, Yan Guo, Robert Friedman, Ruofan Xia, Fei Hu, Chao Liu, Jijun Tang
Phylogenetic studies aim to discover evolutionary relationships and histories. These studies are based on similarities of morphological characters and molecular sequences. Currently, widely accepted phylogenetic approaches are based on multiple sequence alignments, which analyze shared gene datasets and concatenate/coalesce these results to a final phylogeny with maximum support. However, these approaches still have limitations, and often have conflicting results with each other. Reconstructing ancestral genomes helps us understand mechanisms and corresponding consequences of evolution...
November 9, 2017: Scientific Reports
Isabelle Jupin, Maya Ayach, Lucile Jomat, Sonia Fieulaine, Stéphane Bressanelli
The positive-strand RNA virus Turnip yellow mosaic virus (TYMV) encodes an ovarian tumor (OTU)-like protease/deubiquitinase (PRO/DUB) protein domain involved both in proteolytic processing of the viral polyprotein through its PRO activity, and in removal of ubiquitin chains from ubiquitylated substrates through its DUB activity. Here, the crystal structures of TYMV PRO/DUB mutants and molecular dynamics simulations reveal that an idiosyncratic mobile loop participates in reversibly constricting its unusual catalytic site by adopting "open", "intermediate" or "closed" conformations...
November 8, 2017: PLoS Pathogens
Kazukuni Hayashi, Hans-Jürgen Busse, Jan Golke, James Anderson, Xuehua Wan, Shaobin Hou, Patrick S G Chain, Rebecca D Prescott, Stuart P Donachie
A Gram-negative, rod-shaped bacterium, designated KH87(T), was isolated from a fishing hook that had been baited and suspended in seawater off O'ahu, Hawai'i. Based on a comparison of 1524 nt of the 16S rRNA gene sequence of strain KH87(T), its nearest neighbours were the Gammaproteobacteria Rheinheimera nanhaiensis E407-8(T) (96.2 % identity), Rheinheimera chironomi K19414(T) (96.0 %), Rheinheimera pacifica KMM 1406(T) (95.8 %), Rheinheimera muenzenbergensis E49(T) (95.7 %), Alishewanella solinquinati KMK6(T) (94...
November 7, 2017: International Journal of Systematic and Evolutionary Microbiology
Altti Ilari Maarala, Zurab Bzhalava, Joakim Dillner, Keijo Heljanko, Davit Bzhalava
Motivation: Next Generation Sequencing (NGS) technology enables identification of microbial genomes from a massive number of human microbiomes more rapidly and cheaply than ever before. However, the traditional sequential genome analysis algorithms, tools, and platforms are inefficient for performing large-scale metagenomic studies on ever-growing sample data volumes. Currently, there is an urgent need for scalable analysis pipelines that enable harnessing all the power of parallel computation in computing clusters and in cloud computing environments...
November 2, 2017: Bioinformatics
Haibao Tang, Ewen F Kirkness, Christoph Lippert, William H Biggs, Martin Fabani, Ernesto Guzman, Smriti Ramakrishnan, Victor Lavrenko, Boyko Kakaradov, Claire Hou, Barry Hicks, David Heckerman, Franz J Och, C Thomas Caskey, J Craig Venter, Amalio Telenti
Short tandem repeats (STRs) are hyper-mutable sequences in the human genome. They are often used in forensics and population genetics and are also the underlying cause of many genetic diseases. There are challenges associated with accurately determining the length polymorphism of STR loci in the genome by next-generation sequencing (NGS). In particular, accurate detection of pathological STR expansion is limited by the sequence read length during whole-genome analysis. We developed TREDPARSE, a software package that incorporates various cues from read alignment and paired-end distance distribution, as well as a sequence stutter model, in a probabilistic framework to infer repeat sizes for genetic loci, and we used this software to infer repeat sizes for 30 known disease loci...
November 2, 2017: American Journal of Human Genetics
Dei M Elurbe, Sarita S Paranjpe, Georgios Georgiou, Ila van Kruijsbergen, Ozren Bogdanovic, Romain Gibeaux, Rebecca Heald, Ryan Lister, Martijn A Huynen, Simon J van Heeringen, Gert Jan C Veenstra
BACKGROUND: Genome duplication has played a pivotal role in the evolution of many eukaryotic lineages, including the vertebrates. A relatively recent vertebrate genome duplication is that in Xenopus laevis, which resulted from the hybridization of two closely related species about 17 million years ago. However, little is known about the consequences of this duplication at the level of the genome, the epigenome, and gene expression. RESULTS: The X. laevis genome consists of two subgenomes, referred to as L (long chromosomes) and S (short chromosomes), that originated from distinct diploid progenitors...
October 24, 2017: Genome Biology
Minako Hijikata, Naoto Keicho, Le Van Duc, Shinji Maeda, Nguyen Thi Le Hang, Ikumi Matsushita, Seiya Kato
BACKGROUND: Spacer oligonucleotide typing (spoligotyping), a widely used, classical genotyping method for Mycobacterium tuberculosis complex (MTBC), is a PCR-based dot-blot hybridization technique to detect the genetic diversity of the direct repeat (DR) region. Of the seven major MTBC lineages in the world, lineage 1 (Indo-Oceanic) mostly corresponds to the East African-Indian (EAI) spoligotype family in East Africa and Southeast Asia. OBJECTIVES: We investigated the genomic features of Vietnamese lineage 1 strains, comparing spoligotype patterns using whole-genome sequencing (WGS) data...
2017: PloS One
Brent S Pedersen, Ryan L Collins, Michael E Talkowski, Aaron R Quinlan
The BAM and CRAM formats provide a supplementary linear index that facilitates rapid access to sequence alignments in arbitrary genomic regions. Comparing consecutive entries in a BAM or CRAM index allows one to infer the number of alignment records per genomic region for use as an effective proxy of sequence depth in each genomic region. Based on these properties, we have developed indexcov, an efficient estimator of whole-genome sequencing coverage to rapidly identify samples with aberrant coverage profiles, reveal large scale chromosomal anomalies, recognize potential batch effects, and infer the sex of a sample...
September 18, 2017: GigaScience
Sean D Smith, Joseph K Kawash, Andrey Grigoriev
Current human whole genome sequencing projects produce massive amounts of data, often creating significant computational challenges. Different approaches have been developed for each type of genome variant and method of its detection, necessitating users to run multiple algorithms to find variants. We present Genome Rearrangement OmniMapper (GROM), a novel comprehensive variant detection algorithm accepting aligned read files as input and finding SNVs, indels, structural variants (SVs), and copy number variants (CNVs)...
October 1, 2017: GigaScience
Yufei Xu, Shouqing Sun, Niu Li, Tingting Yu, Xiumin Wang, Jian Wang, Nan Bao
Syndromic craniosynostosis is a group of multiple conditions with high heterogeneity, and many rare syndromes still remain to be characterized. To identify and analyze causative genetic variants in nine unrelated probands who mainly presented with syndromic craniosynostosis, we reviewed the relevant medical information of the patients and performed whole exome sequencing, with findings further verified by Sanger sequencing and parental background. Bioinformatics analysis was used to evaluate the potential deleterious or benign effect of each genetic variant through evolutionary conservation alignment, multiple lines of computational prediction and the allele frequency in population datasets (control and patient)...
October 13, 2017: Gene
Vikas Bansal
Motivation: The short read lengths of current high-throughput sequencing technologies limit the ability to recover long-range haplotype information. Dilution pool methods for preparing DNA sequencing libraries from high molecular weight DNA fragments enable the recovery of long DNA fragments from short sequence reads. These approaches require computational methods for identifying the DNA fragments using aligned sequence reads and assembling the fragments into long haplotypes. Although a number of computational methods have been developed for haplotype assembly, the problem of identifying DNA fragments from dilution pool sequence data has not received much attention...
July 11, 2017: Bioinformatics
Erin K Molloy, Tandy Warnow
With the increasing availability of whole genome data, many species trees are being constructed from hundreds to thousands of loci. Although concatenation analysis using maximum likelihood is a standard approach for estimating species trees, it does not account for gene tree heterogeneity, which can occur due to many biological processes, such as incomplete lineage sorting. Coalescent species tree estimation methods, many of which are statistically consistent in the presence of incomplete lineage sorting, include Bayesian methods that co-estimate the gene trees and the species tree, summary methods that compute the species tree by combining estimated gene trees, and site-based methods that infer the species tree from site patterns in the alignments of different loci...
September 15, 2017: Systematic Biology
Sergii Ivakhno, Eric Roller, Camilla Colombo, Philip Tedder, Anthony J Cox
Motivation: Whole genome sequencing is becoming a diagnostics of choice for the identification of rare inherited and de novo copy number variants in families with various pediatric and late-onset genetic diseases. However, joint variant calling in pedigrees is hampered by the complexity of consensus breakpoint alignment across samples within an arbitrary pedigree structure. Results: We have developed a new tool, Canvas SPW, for the identification of inherited and de novo copy number variants from pedigree sequencing data...
September 27, 2017: Bioinformatics
Zhengrong Zhang, Li Yuan, Xin Liu, Xuesen Chen, Xiaoyun Wang
As a family of transcription factors, DNA binding with one finger (Dof) proteins play important roles in various biological processes in plants. Here, a total of 60 putative apple (Malus domestica) Dof genes (MdDof) were identified and mapped to different chromosomes. Chromosomal distribution and synteny analysis indicated that the expansion of the MdDof genes came primarily from segmental and duplication events, and from whole genome duplication, which led to more Dof members in apples than in other plants...
October 3, 2017: Gene
Marvin N Wright, Damian Gola, Andreas Ziegler
The advancement of high-throughput sequencing technologies enables sequencing of human genomes at steadily decreasing costs and increasing quality. Before variants can be analyzed, e.g., in association studies, the raw data obtained from the sequencer need to be preprocessed. These preprocessing steps include the removal of adapters, duplicates, and contaminations, alignment to a reference genome and the postprocessing of the alignment. All later steps, such as variant discovery, rely on high data quality and proper preprocessing, emphasizing the great importance of quality control...
2017: Methods in Molecular Biology
Andrzej Zielezinski, Susana Vinga, Jonas Almeida, Wojciech M Karlowski
Alignment-free sequence analyses have been applied to problems ranging from whole-genome phylogeny to the classification of protein families, identification of horizontally transferred genes, and detection of recombined sequences. The strength of these methods makes them particularly useful for next-generation sequencing data processing and analysis. However, many researchers are unclear about how these methods work, how they compare to alignment-based methods, and what their potential is for use for their research...
October 3, 2017: Genome Biology
Kai Zhang, Zhiqiang Ruan, Jia Li, Chao Bian, Xinxin You, Steven L Coon, Qiong Shi
Melatonin is a multifunctional bioactive molecule that plays comprehensive physiological roles in all living organisms. N-acetylserotonin methyltransferase (ASMT, also known as hydroxyindole O-methyltransferase or HIOMT) is the final enzyme for biosynthesis of melatonin. Here, we performed a comparative genomic and transcriptomic survey to explore the ASMT family in fish. Two ASMT isotypes (ASMT1 and ASMT2) and a new ASMT-like (ASMTL) are all extracted from teleost genomes on the basis of phylogenetic and synteny analyses...
October 2, 2017: Molecules: a Journal of Synthetic Chemistry and Natural Product Chemistry
