Roy A Hall, Helle Bielefeldt-Ohmann, Breeanna J McLean, Caitlin A O'Brien, Agathe M G Colmant, Thisun B H Piyasena, Jessica J Harrison, Natalee D Newton, Ross T Barnard, Natalie A Prow, Joshua M Deerain, Marcus G K Y Mah, Jody Hobson-Peters
Recent advances in virus detection strategies and deep sequencing technologies have enabled the identification of a multitude of new viruses that persistently infect mosquitoes but do not infect vertebrates. These are usually referred to as insect-specific viruses (ISVs). These novel viruses have generated considerable interest in their modes of transmission, persistence in mosquito populations, the mechanisms that restrict their host range to mosquitoes, and their interactions with pathogens transmissible by the same mosquito...
Germán Retamosa, Luis de Pedro, Ivan González, Javier Tamames
Homology detection has evolved over time from heavy algorithms based on dynamic programming to lightweight alternatives based on different heuristic models. However, the main problem with these algorithms is that they use complex statistical models, which makes it difficult to achieve a relevant speedup and find exact matches with the original results. Thus, their acceleration is essential. The aim of this article was to prefilter a sequence database. To make this work, we have implemented a groundbreaking heuristic model based on NVIDIA's graphics processing units (GPUs) and multicore processors...
Yong Wang, Xia-Fang Tao, Zhi-Xi Su, A-Ke Liu, Tian-Lei Liu, Ling Sun, Qin Yao, Ke-Ping Chen, Xun Gu
Since the introns-early hypothesis was proposed, many studies have shown that most eukaryotic ancestors possessed intron-rich genomes, but evidence for the existence of introns in the genomes of ancestral bacteria is still lacking. While not a single intron has been found in any protein-coding gene of current bacteria, analyses of bacterial genes horizontally transferred into eukaryotes in ancient times may provide evidence of intron existence in bacterial ancestors. In this study, a bacterial gene encoding capsule biosynthesis protein CapI was found in the genome of the sea anemone Nematostella vectensis...
Nancy Arana-Daniel, Alberto A Gallegos, Carlos López-Franco, Alma Y Alanís, Jacob Morales, Adriana López-Franco
With the increasing power of computers, the amount of data that can be processed in short periods of time has grown exponentially, as has the importance of classifying large-scale data efficiently. Support vector machines have shown good results in classifying large amounts of high-dimensional data, such as data generated by protein structure prediction, spam recognition, medical diagnosis, optical character recognition, and text classification. Most state-of-the-art approaches for large-scale learning use traditional optimization methods, such as quadratic programming or gradient descent, which makes the use of evolutionary algorithms for training support vector machines an area to be explored...
Alvaro Chiner-Oms, Fernando González-Candelas
We present EvalMSA, a software tool for evaluating and detecting outliers in multiple sequence alignments (MSAs). This tool allows the identification of divergent sequences in MSAs by scoring the contribution of each row in the alignment to its quality using a sum-of-pairs-based method and additional analyses. Our main goal is to provide users with objective data so that they can make informed decisions about the relevance and/or pertinence of including/retaining a particular sequence in an MSA. EvalMSA is written in standard Perl and also uses some routines from the statistical language R...
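As a rough illustration of what a sum-of-pairs row contribution can look like, the following minimal Python sketch scores each row of a toy alignment against every other row. It is not EvalMSA's implementation (the tool itself is written in Perl and R); the function names `pair_score` and `row_contributions` and the scoring values (match +1, mismatch -1, gap -2, gap-gap 0) are assumptions chosen only for this example.

```python
from itertools import combinations


def pair_score(a, b, match=1, mismatch=-1, gap=-2):
    """Score one aligned column for two rows (hypothetical scoring values)."""
    if a == "-" and b == "-":
        return 0  # two gaps carry no information in this sketch
    if a == "-" or b == "-":
        return gap
    return match if a == b else mismatch


def row_contributions(msa):
    """Return, for each row, the sum of its pairwise scores against all other rows."""
    if not msa:
        return []
    length = len(msa[0])
    if any(len(row) != length for row in msa):
        raise ValueError("all rows of an alignment must have the same length")
    totals = [0] * len(msa)
    for i, j in combinations(range(len(msa)), 2):
        score = sum(pair_score(a, b) for a, b in zip(msa[i], msa[j]))
        totals[i] += score  # each pair counts toward both rows
        totals[j] += score
    return totals


# Toy alignment: the last row is deliberately divergent.
msa = ["ACGT", "ACGT", "ACGA", "T--T"]
contributions = row_contributions(msa)
lowest = min(range(len(msa)), key=contributions.__getitem__)
print(contributions, "lowest-scoring row:", lowest)
```

In this sketch, a row whose contribution is far below the others is a candidate outlier; how such a threshold is chosen in practice is outside what the abstract states.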
Ingrid Thaís Beltrame-Botelho, Carlos Talavera-López, Björn Andersson, Edmundo Carlos Grisard, Patricia Hermes Stoco
Kinetoplastids are an ancestral group of protists that contains free-living species and parasites with distinct mechanisms in response to stress. Here, we compared genes involved in antioxidant defense (AD), proposing an evolution model among trypanosomatids. All genes were identified in Bodo saltans, suggesting that AD mechanisms have evolved prior to adaptation for parasitic lifestyles. While most of the monoxenous and dixenous parasites revealed minor differences from B. saltans, the endosymbiont-bearing species have an increased number of genes...
Young-Joo Seol, So Youn Won, Younhee Shin, Jong-Yeol Lee, Jong-Sik Chun, Yong-Kab Kim, Chang-Kug Kim
We developed a multilayered screening method that integrates both genome and transcriptome data to effectively identify regulatory genes in rice (Oryza sativa). We tested our method using eight rice accessions that differed in three important nutritional and agricultural traits: anthocyanin biosynthesis, amylose content, and heading date. In the genome resequencing of eight rice accessions with 24 RNA sequencing experiments, 98% of the preprocessed reads could be uniquely mapped to the reference genome, resulting in the identification of 42,699 unique transcripts...
José R Romero, Jessica A Carballido, Ingrid Garbus, Viviana C Echenique, Ignacio Ponzoni
The identification of nested motifs in genomic sequences is a complex computational problem. The detection of these patterns is important to allow the discovery of transposable element (TE) insertions, incomplete reverse transcripts, deletions, and/or mutations. In this study, a de novo strategy for detecting patterns that represent nested motifs was designed based on exhaustive searches for pairs of motifs and combinatorial pattern analysis. These patterns can be grouped into three categories: motifs within other motifs, motifs flanked by other motifs, and motifs of large size...
Rohit Kumar Yadav, Haider Banka
In bioinformatics, multiple sequence alignment (MSA) is an NP-hard problem. Hence, nature-inspired techniques can better approximate the solution. In the current study, a novel biogeography-based optimization (NBBO) is proposed to solve an MSA problem. Biogeography-based optimization (BBO) is a new paradigm for optimization. However, it has some deficiencies in solving complicated problems, such as low population diversity and a slow convergence rate. NBBO is an enhanced version of BBO in which a new migration operation is proposed to overcome the limitations of BBO...
Jie Zhou, Pianyu Zhong, Tinghui Zhang
Determination of sequence similarity is one of the major steps in computational phylogenetic studies. One of the major tasks of computational biologists is to develop novel mathematical descriptors for similarity analysis. DNA clustering is an important technology that automatically identifies inherent relationships among large-scale DNA sequences. The comparison between the DNA sequences of different species helps determine phylogenetic relationships among species. Alignment-free approaches have continuously gained interest in various sequence analysis applications such as phylogenetic inference and metagenomic classification/clustering, particularly for large-scale sequence datasets...
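To make the idea of an alignment-free comparison concrete, here is a minimal Python sketch that represents each DNA sequence by its k-mer counts and compares two sequences with cosine distance. This is a generic, commonly used descriptor and not the one proposed in the article; the function names `kmer_counts` and `cosine_distance` and the default `k=3` are assumptions made for this example.

```python
from collections import Counter
from math import sqrt


def kmer_counts(seq, k=3):
    """Count overlapping k-mers in a sequence (case-insensitive)."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def cosine_distance(c1, c2):
    """Return 1 - cosine similarity between two k-mer count profiles."""
    dot = sum(c1[kmer] * c2[kmer] for kmer in c1.keys() & c2.keys())
    norm1 = sqrt(sum(v * v for v in c1.values()))
    norm2 = sqrt(sum(v * v for v in c2.values()))
    if norm1 == 0 or norm2 == 0:
        raise ValueError("sequence shorter than k: empty k-mer profile")
    return 1 - dot / (norm1 * norm2)


# Toy sequences, invented for illustration only.
a = kmer_counts("ACGTACGTAC")
b = kmer_counts("ACGTACGTTC")
c = kmer_counts("GGGGGGGGGG")
print(cosine_distance(a, b), cosine_distance(a, c))
```

A matrix of such pairwise distances can then be passed to a standard clustering or tree-building method, which is the usual role of alignment-free descriptors on large sequence datasets.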
Carlos Montemuiño, Antonio Espinosa, Juan C Moure, Gonzalo Vera, Porfidio Hernández, Sebastián Ramos-Onsins
The msParSm application is an evolution of msPar, the parallel version of the coalescent simulation program ms, which removes the limitation for simulating long stretches of DNA sequences with large recombination rates, without compromising the accuracy of the standard coalescence. This work introduces msParSm, describes its significant performance improvements over msPar and its shared memory parallelization details, and shows how it can achieve execution times that are better than, or at least similar to, those of MaCS. Two case studies with different mutation rates were analyzed, one approximating the human average and the other approximating the Drosophila melanogaster average...
Horng-Yunn Dou, Yih-Yuan Chen, Ying-Tsong Chen, Jia-Ru Chang, Chien-Hsing Lin, Keh-Ming Wu, Ming-Shian Lin, Ih-Jen Su, Shih-Feng Tsai
To better understand the transmission and evolution of Mycobacterium tuberculosis (MTB) in Taiwan, six different MTB isolates (representatives of the Beijing ancient sublineage, Beijing modern sublineage, Haarlem, East-African Indian, T1, and Latin-American Mediterranean (LAM)) were characterized and their genomes were sequenced. Discriminating among large sequence polymorphisms (LSPs) that occur once versus those that occur repeatedly in a genomic region may help to elucidate the biological roles of LSPs and to identify useful phylogenetic relationships...
Alyssa T Pyke, David Warrilow
Historically, classifications of arboviruses were based on serological techniques. Hence, collections of arbovirus isolates have been central to this process by providing the antigenic reagents for these methods. However, with increasing concern about biosafety and security, the introduction of molecular biology techniques has led to greater emphasis on the storage of nucleic acid sequence data over the maintenance of archival material. In this commentary, we provide examples of where archival collections provide an important source of genetic material to assist in confirming the authenticity of reference strains and vaccine stocks, to clarify taxonomic relationships particularly when isolates of the same virus species have been collected across a wide expanse of time and space, for future phenotypic analysis, to determine the historical diversity of strains, and to understand the mechanisms leading to changes in genome structure and virus evolution...
Lesley Bell-Sakyi, Houssam Attoui
While ticks have been known to harbor and transmit pathogenic arboviruses for over 80 years, the application of high-throughput sequencing technologies has revealed that ticks also appear to harbor a diverse range of endogenous tick-only viruses belonging to many different families. Almost nothing is known about these viruses; indeed, it is unclear in most cases whether the identified viral sequences are derived from actual replication-competent viruses or from endogenous virus elements incorporated into the ticks' genomes...
Shaik Naseer Pasha, Iyer Meenakshi, Ramanathan Sowdhamini
Myosins are actin-based motor proteins involved in many cellular movements. It is interesting to study the evolutionary patterns and the functional attributes of various types of myosins. Computational search algorithms were applied to identify putative myosin members by phylogenetic analysis, sequence motifs, and coexisting domains. This study is aimed at understanding the distribution and the likely biological functions of myosins encoded in various taxa and available eukaryotic genomes. We report here a phylogenetic analysis of around 4,064 myosin motor domains, built entirely from complete or near-complete myosin repertoires incorporating many unclassified, uncharacterized sequences and new myosin classes, with emphasis on myosins from Fungi, Haptophyta, and other Stramenopiles, Alveolates, and Rhizaria (SAR)...
Jeong-Ho Baek, Junah Kim, Chang-Kug Kim, Seong-Han Sohn, Dongsu Choi, Milind B Ratnaparkhe, Do-Wan Kim, Tae-Ho Lee
Information on multiple synteny between and/or within plants is key to understanding genome evolution. In addition, visualization of multiple synteny is helpful in interpreting evolution. So far, some web applications have been developed to determine and visualize multiple homology regions at once. However, the applications are not fully convenient for biologists because some of them do not include the function of synteny determination and only visualize multiple synteny plots from synteny data that users upload, while others determine synteny based only on BLAST similarity information, with some algorithms not designed for synteny determination...
Anna M Stenkova, Evgeniya P Bystritskaya, Konstantin V Guzev, Alexander V Rakin, Marina P Isaeva
The genus Yersinia includes species with a wide range of eukaryotic hosts (from fish, insects, and plants to mammals and humans). One of the major outer membrane proteins, the porin OmpC, is preferentially expressed in the host gut, where osmotic pressure, temperature, and the concentrations of nutrients and toxic products are relatively high. We consider here the molecular evolution and phylogeny of Yersinia ompC. The maximum likelihood gene tree reflects the macroevolution processes occurring within the genus Yersinia...
Jun Wang, Ying Wang, Weidong Gu, Buqing Ni, Haoliang Sun, Tong Yu, Wanjun Gu, Liang Chen, Yongfeng Shao
RNA sequencing (RNA-seq) has revolutionized transcriptome identification and quantification for different types of tissues and cells in many organisms. Although numerous RNA-seq data sets have been derived from many types of human tissues and cell lines, little is known about the transcriptome repertoire of the human aortic valve. In this study, we sequenced the total RNA prepared from two calcified human aortic valves and reported the whole transcriptome of the human aortic valve. Integrating RNA-seq data of 13 human tissues from the Human Body Map 2 Project, we constructed a transcriptome repertoire of human tissues, including 19,505 protein-coding genes and 4,948 long intergenic noncoding RNAs (lincRNAs)...
Alexander Gamisch
The Binary State Speciation and Extinction (BiSSE) method is one of the most popular tools for investigating the rates of diversification and character evolution. Yet, based on previous simulation studies, it is commonly held that the BiSSE method requires phylogenetic trees of fairly large sample sizes (>300 taxa) in order to distinguish between the different models of speciation, extinction, or transition rate asymmetry. Here, the power of the BiSSE method is reevaluated by simulating trees of both small and large sample sizes (30, 60, 90, and 300 taxa) under various asymmetry models and root state assumptions...
Yong Chen, Dandan Geng, Kristina Ehrhardt, Shaoqiang Zhang
Grouping genes as operons is an important genomic feature of prokaryotic organisms. A comprehensive understanding of operon organization would help to decipher transcriptional mechanisms, cellular pathways, and the evolutionary landscape of prokaryotic genomes. Although thousands of prokaryotes have been sequenced, genome-wide investigation of the evolutionary dynamics (division and recombination) of operons among these genomes remains unexplored. Here, we systematically analyzed the operon dynamics of Rhodococcus jostii RHA1 (RHA1), an oleaginous bacterium with high potential for biofuel applications, by comparing 340 prokaryotic genomes that were carefully selected from different genera...
