Algorithms for Molecular Biology: AMB

Broňa Brejová, Askar Gafurov, Dana Pardubská, Michal Sabo, Tomáš Vinař
BACKGROUND: Isometric gene tree reconciliation is a gene tree/species tree reconciliation problem where both the gene tree and the species tree include branch lengths, and these branch lengths must be respected by the reconciliation. The problem was introduced by Ma et al. in 2008 in the context of reconstructing evolutionary histories of genomes in the infinite sites model. RESULTS: In this paper, we show that the original algorithm by Ma et al. is incorrect, and we propose a modified algorithm that addresses the problems that we discovered...
2017: Algorithms for Molecular Biology: AMB
Thomas Tschager, Simon Rösch, Ludovic Gillet, Peter Widmayer
BACKGROUND: Given a peptide as a string of amino acids, the masses of all its prefixes and suffixes can be found by a trivial linear scan through the amino acid masses. The inverse problem is the ideal de novo peptide sequencing problem: Given all prefix and suffix masses, determine the string of amino acids. In biological reality, the given masses are measured in a lab experiment, and measurements by necessity are noisy. The (real, noisy) de novo peptide sequencing problem therefore has a noisy input: a few of the prefix and suffix masses of the peptide are missing and a few other masses are given in addition...
2017: Algorithms for Molecular Biology: AMB
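The "trivial linear scan" mentioned in the abstract above can be sketched as follows. This is only an illustration of the forward direction (peptide to masses), not the authors' sequencing algorithm. The function name prefix_suffix_masses and the small RESIDUE_MASS table are assumptions made for this example; the table holds rounded nominal residue masses for five amino acids and ignores the water and ion offsets a real spectrum would include.

```python
from itertools import accumulate

# Rounded nominal residue masses in daltons (illustrative subset only).
RESIDUE_MASS = {"G": 57, "A": 71, "S": 87, "P": 97, "V": 99}


def prefix_suffix_masses(peptide):
    """Return (prefix_masses, suffix_masses) for a peptide string.

    prefix_masses[i] is the mass of peptide[:i + 1];
    suffix_masses[i] is the mass of peptide[i:].
    """
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    # Running sums from the left give the prefix masses.
    prefixes = list(accumulate(masses))
    # Running sums from the right, reversed back into position order,
    # give the suffix masses.
    suffixes = list(accumulate(reversed(masses)))[::-1]
    return prefixes, suffixes


if __name__ == "__main__":
    print(prefix_suffix_masses("GAS"))
```

Both lists are produced in time linear in the peptide length; the sequencing problem described in the abstract is the inverse of this computation, made harder by missing and spurious masses.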
Guillaume Fertin, Géraldine Jean, Eric Tannier
BACKGROUND: Combinatorial works on genome rearrangements have so far ignored the influence of intergene sizes, i.e. the number of nucleotides between consecutive genes, although this was recently shown to be decisive for the accuracy of inference methods (Biller et al. in Genome Biol Evol 8:1427-39, 2016; Biller et al. in Beckmann A, Bienvenu L, Jonoska N, editors. Proceedings of Pursuit of the Universal-12th conference on computability in Europe, CiE 2016, Lecture notes in computer science, vol 9709, Paris, France, June 27-July 1, 2016...
2017: Algorithms for Molecular Biology: AMB
Sven Jager, Benjamin Schiller, Philipp Babel, Malte Blumenroth, Thorsten Strufe, Kay Hamacher
BACKGROUND: In this work, we present a new coarse-grained representation of RNA dynamics. It is based on adjacency matrices and their interaction patterns obtained from molecular dynamics simulations. RNA molecules are well-suited for this representation due to their composition which is mainly modular and assessable by the secondary structure alone. These interactions can be represented as adjacency matrices of k nucleotides. Based on those, we define transitions between states as changes in the adjacency matrices which form Markovian dynamics...
2017: Algorithms for Molecular Biology: AMB
Daniel Doerr, Metin Balaban, Pedro Feijão, Cedric Chauve
BACKGROUND: The gene family-free framework for comparative genomics aims at providing methods for gene order analysis that do not require prior gene family assignment, but work directly on a sequence similarity graph. We study two problems related to the breakpoint median of three genomes, which asks for the construction of a fourth genome that minimizes the sum of breakpoint distances to the input genomes. METHODS: We present a model for constructing a median of three genomes in this family-free setting, based on maximizing an objective function that generalizes the classical breakpoint distance by integrating sequence similarity in the score of a gene adjacency...
2017: Algorithms for Molecular Biology: AMB
Mohammed El-Kebir, Benjamin J Raphael, Ron Shamir, Roded Sharan, Simone Zaccaria, Meirav Zehavi, Ron Zeira
BACKGROUND: Cancer is an evolutionary process characterized by the accumulation of somatic mutations in a population of cells that form a tumor. One frequent type of mutations is copy number aberrations, which alter the number of copies of genomic regions. The number of copies of each position along a chromosome constitutes the chromosome's copy-number profile. Understanding how such profiles evolve in cancer can assist in both diagnosis and prognosis. RESULTS: We model the evolution of a tumor by segmental deletions and amplifications, and gauge distance from profile [Formula: see text] to [Formula: see text] by the minimum number of events needed to transform [Formula: see text] into [Formula: see text]...
2017: Algorithms for Molecular Biology: AMB
Dan DeBlasio, John Kececioglu
BACKGROUND: In a computed protein multiple sequence alignment, the coreness of a column is the fraction of its substitutions that are in so-called core columns of the gold-standard reference alignment of its proteins. In benchmark suites of protein reference alignments, the core columns of the reference alignment are those that can be confidently labeled as correct, usually due to all residues in the column being sufficiently close in the spatial superposition of the known three-dimensional structures of the proteins...
2017: Algorithms for Molecular Biology: AMB
Safa Jammali, Esaie Kuitche, Ayoub Rachati, François Bélanger, Michelle Scott, Aïda Ouangraoua
BACKGROUND: Frameshift translation is an important phenomenon that contributes to the appearance of novel coding DNA sequences (CDS) and functions in gene evolution, by allowing alternative amino acid translations of gene coding regions. Frameshift translations can be identified by aligning two CDS, from the same gene or from homologous genes, while accounting for their codon structure. Two main classes of algorithms have been proposed to solve the problem of aligning CDS, either by amino acid sequence alignment back-translation, or by simultaneously accounting for the nucleotide and amino acid levels...
2017: Algorithms for Molecular Biology: AMB
Marius Erbert, Steffen Rechner, Matthias Müller-Hannemann
BACKGROUND: A basic task in bioinformatics is the counting of k-mers in genome sequences. Existing k-mer counting tools are most often optimized for small k < 32 and suffer from excessive memory resource consumption or degrading performance for large k. However, given the technology trend towards long reads of next-generation sequencers, support for large k becomes increasingly important. RESULTS: We present the open source k-mer counting software Gerbil that has been designed for the efficient counting of k-mers for k ≥ 32...
2017: Algorithms for Molecular Biology: AMB
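As a point of reference for the k-mer counting task named in the abstract above, here is a minimal in-memory sketch. It is not Gerbil's method and makes no attempt at the memory efficiency the abstract is concerned with; the function name count_kmers is an assumption for this example, and it treats each k-mer as a plain substring without merging reverse complements.

```python
from collections import Counter


def count_kmers(sequence, k):
    """Count every length-k substring of `sequence`.

    Naive approach: one dictionary entry per distinct k-mer, so memory
    grows with the number of distinct k-mers in the input.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    # Slide a window of width k over the sequence, one position at a time.
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


if __name__ == "__main__":
    for kmer, count in sorted(count_kmers("ACGACG", 3).items()):
        print(kmer, count)
```

Storing each k-mer as a full string is what becomes costly for large k and genome-scale inputs, which is the regime the abstract says dedicated tools are designed for.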
Raghuram Thiagarajan, Amir Alavi, Jagdeep T Podichetty, Jason N Bazil, Daniel A Beard
Systems research spanning fields from biology to finance involves the identification of models to represent the underpinnings of complex systems. Formal approaches for data-driven identification of network interactions include statistical inference-based approaches and methods to identify dynamical systems models that are capable of fitting multivariate data. Availability of large data sets and so-called 'big data' applications in biology present great opportunities as well as major challenges for systems identification/reverse engineering applications...
2017: Algorithms for Molecular Biology: AMB
Yun Deng, David Fernández-Baca
BACKGROUND: Semi-labeled trees generalize ordinary phylogenetic trees, allowing internal nodes to be labeled by higher-order taxa. Taxonomies are examples of semi-labeled trees. Suppose we are given a collection [Formula: see text] of semi-labeled trees over various subsets of a set of taxa. The ancestral compatibility problem asks whether there is a semi-labeled tree that respects the clusterings and the ancestor/descendant relationships implied by the trees in [Formula: see text]. The running time and space usage of the best previous algorithm for testing ancestral compatibility depend on the degrees of the nodes in the trees in [Formula: see text]...
2017: Algorithms for Molecular Biology: AMB
Daniel Bork, Ricson Cheng, Jincheng Wang, Jean Sung, Ran Libeskind-Hadas
BACKGROUND: Phylogenetic tree reconciliation is a widely-used method for inferring the evolutionary histories of genes and species. In the duplication-loss-coalescence (DLC) model, we seek a reconciliation that explains the incongruence between a gene and species tree using gene duplication, loss, and deep coalescence events. In the maximum parsimony framework, costs are associated with these event types and a reconciliation is sought that minimizes the total cost of the events required to map the gene tree onto the species tree...
2017: Algorithms for Molecular Biology: AMB
Yannis Almirantis, Panagiotis Charalampopoulos, Jia Gao, Costas S Iliopoulos, Manal Mohamed, Solon P Pissis, Dimitris Polychronopoulos
BACKGROUND: The deviation of the observed frequency of a word w from its expected frequency in a given sequence x is used to determine whether or not the word is avoided. This concept is particularly useful in DNA linguistic analysis. The value of the deviation of w, denoted by [Formula: see text], effectively characterises the extent of a word by its edge contrast in the context in which it occurs. A word w of length [Formula: see text] is a [Formula: see text]-avoided word in x if [Formula: see text], for a given threshold [Formula: see text]...
2017: Algorithms for Molecular Biology: AMB
Riccardo Dondi, Manuel Lafond, Nadia El-Mabrouk
BACKGROUND: Given a gene family, the relations between genes (orthology/paralogy), are represented by a relation graph, where edges connect pairs of orthologous genes and "missing" edges represent paralogs. While a gene tree directly induces a relation graph, the converse is not always true. Indeed, a relation graph is not necessarily "satisfiable", i.e. does not necessarily correspond to a gene tree. And even if that holds, it may not be "consistent", i.e. the tree may not represent a true history in agreement with a species tree...
2017: Algorithms for Molecular Biology: AMB
Diego P Rubert, Pedro Feijão, Marília Dias Vieira Braga, Jens Stoye, Fábio Henrique Viduani Martinez
BACKGROUND: Rearrangements are large-scale mutations in genomes, responsible for complex changes and structural variations. Most rearrangements that modify the organization of a genome can be represented by the double cut and join (DCJ) operation. Given two balanced genomes, i.e., two genomes that have exactly the same number of occurrences of each gene in each genome, we are interested in the problem of computing the rearrangement distance between them, i.e., finding the minimum number of DCJ operations that transform one genome into the other...
2017: Algorithms for Molecular Biology: AMB
Laurent Noé
BACKGROUND: Spaced seeds, also named gapped q-grams, gapped k-mers, spaced q-grams, have been proven to be more sensitive than contiguous seeds (contiguous q-grams, contiguous k-mers) in nucleic and amino-acid sequences analysis. Initially proposed to detect sequence similarities and to anchor sequence alignments, spaced seeds have more recently been applied in several alignment-free related methods. Unfortunately, spaced seeds need to be initially designed. This task is known to be time-consuming due to the number of spaced seed candidates...
2017: Algorithms for Molecular Biology: AMB
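To make the term "spaced seed" from the abstract above concrete, the sketch below extracts spaced k-mers from a sequence. It does not address seed design, which is the subject of the paper. The function name spaced_kmers and the encoding of a seed as a string of '1' (position is compared) and '0' (position is ignored) are assumptions chosen for this example.

```python
def spaced_kmers(sequence, seed):
    """Return the spaced k-mers of `sequence` under the pattern `seed`.

    `seed` is a string over {'0', '1'}; only positions marked '1'
    contribute a character to each extracted word.
    """
    if not seed or set(seed) - {"0", "1"}:
        raise ValueError("seed must be a non-empty string of 0s and 1s")
    # Offsets inside the window that the seed actually looks at.
    kept = [i for i, bit in enumerate(seed) if bit == "1"]
    span = len(seed)
    return [
        "".join(sequence[start + i] for i in kept)
        for start in range(len(sequence) - span + 1)
    ]


if __name__ == "__main__":
    # Seed 1101 spans four positions and ignores the third one.
    print(spaced_kmers("ACGTAC", "1101"))
```

A contiguous seed corresponds to a pattern made only of 1s; two sequences can share a spaced k-mer even when they differ at an ignored position.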
Leandro Lima, Blerina Sinaimeri, Gustavo Sacomoto, Helene Lopez-Maestre, Camille Marchet, Vincent Miele, Marie-France Sagot, Vincent Lacroix
BACKGROUND: The main challenge in de novo genome assembly of DNA-seq data is certainly to deal with repeats that are longer than the reads. In de novo transcriptome assembly of RNA-seq reads, on the other hand, this problem has been underestimated so far. Even though we have fewer and shorter repeated sequences in transcriptomics, they do create ambiguities and confuse assemblers if not addressed properly. Most transcriptome assemblers of short reads are based on de Bruijn graphs (DBG) and have no clear and explicit model for repeats in RNA-seq data, relying instead on heuristics to deal with them...
2017: Algorithms for Molecular Biology: AMB
Timo Beller, Enno Ohlebusch
[This corrects the article DOI: 10.1186/s13015-016-0083-7.]
2016: Algorithms for Molecular Biology: AMB
S Srivastava, S B Lal, D C Mishra, U B Angadi, K K Chaturvedi, S N Rai, A Rai
BACKGROUND: Protein structure comparison plays an important role in in silico functional prediction of a new protein. It is also used for understanding the evolutionary relationships among proteins. A variety of methods have been proposed in the literature for comparing protein structures, but they have their own limitations in terms of accuracy and complexity with respect to computational time and space. There is a need to improve the computational complexity in comparison/alignment of proteins through incorporation of important biological and structural properties in the existing techniques...
2016: Algorithms for Molecular Biology: AMB
Jun Zhou, Yu Lin, Vaibhav Rajan, William Hoskins, Bing Feng, Jijun Tang
BACKGROUND: Evolution of cancer cells is characterized by large scale and rapid changes in the chromosomal landscape. The fluorescence in situ hybridization (FISH) technique provides a way to measure the copy numbers of preselected genes in a group of cells and has been found to be a reliable source of data to model the evolution of tumor cells. Chowdhury et al. (Bioinformatics 29(13):189-98, 23; PLoS Comput Biol 10(7):1003740, 24) recently developed a computational model for tumor progression driven by gains and losses in cell count patterns obtained by FISH probes...
2016: Algorithms for Molecular Biology: AMB