Algorithms for Molecular Biology: AMB

https://www.readbyqxmd.com/read/29881445/locus-aware-decomposition-of-gene-trees-with-respect-to-polytomous-species-trees
#1
Michał Aleksander Ciach, Anna Muszewska, Paweł Górecki
Background: Horizontal gene transfer (HGT), a process of acquisition and fixation of foreign genetic material, is an important biological phenomenon. Several approaches to HGT inference have been proposed. However, most of them either rely on approximate, non-phylogenetic methods or on the tree reconciliation, which is computationally intensive and sensitive to parameter values. Results: We investigate the locus tree inference problem as a possible alternative that combines the advantages of both approaches...
2018: Algorithms for Molecular Biology: AMB
https://www.readbyqxmd.com/read/29881444/a-fast-and-accurate-enumeration-based-algorithm-for-haplotyping-a-triploid-individual
#2
Jingli Wu, Qian Zhang
Background: Haplotype assembly, reconstructing haplotypes from sequence data, is one of the major computational problems in bioinformatics. Most of the current methodologies for haplotype assembly are designed for diploid individuals. In recent years, genomes having more than two sets of homologous chromosomes have attracted many research groups that are interested in the genomics of disease, phylogenetics, botany and evolution. However, there is still a lack of methods for reconstructing polyploid haplotypes...
2018: Algorithms for Molecular Biology: AMB
https://www.readbyqxmd.com/read/29755580/finding-local-genome-rearrangements
#3
Pijus Simonaitis, Krister M Swenson
Background: The double cut and join (DCJ) model of genome rearrangement is well studied due to its mathematical simplicity and power to account for the many events that transform gene order. These studies have mostly been devoted to the understanding of minimum length scenarios transforming one genome into another. In this paper we search instead for rearrangement scenarios that minimize the number of rearrangements whose breakpoints are unlikely due to some biological criteria. One such criterion has recently become accessible due to the advent of the Hi-C experiment, facilitating the study of 3D spatial distance between breakpoint regions...
2018: Algorithms for Molecular Biology: AMB
https://www.readbyqxmd.com/read/29588651/fsh-fast-spaced-seed-hashing-exploiting-adjacent-hashes
#4
Samuele Girotto, Matteo Comin, Cinzia Pizzi
Background: Patterns with wildcards in specified positions, namely spaced seeds, are increasingly used instead of k-mers in many bioinformatics applications that require indexing, querying and rapid similarity search, as they can provide better sensitivity. Many of these applications require computing the hash of each position in the input sequences with respect to the given spaced seed, or to multiple spaced seeds. While the hashing of k-mers can be rapidly computed by exploiting the large overlap between consecutive k-mers, spaced seed hashing is usually computed from scratch for each position in the input sequence, thus resulting in slower processing...
2018: Algorithms for Molecular Biology: AMB
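The contrast drawn in this abstract can be illustrated with a minimal sketch. It is not the FSH algorithm from the paper; it only shows the baseline the abstract describes: a rolling k-mer hash that reuses the previous hash, next to a spaced seed hash recomputed from scratch at every position. The 2-bit encoding `ENC`, the seed notation (`"1"` for a match position, `"0"` for a wildcard) and both function names are assumptions made for illustration.

```python
# Hypothetical 2-bit nucleotide encoding, chosen for this sketch only.
ENC = {"A": 0, "C": 1, "G": 2, "T": 3}


def kmer_hashes(seq, k):
    """Rolling hashes of all k-mers: each hash is derived from the previous one."""
    mask = (1 << (2 * k)) - 1  # keep only the last k symbols (2 bits each)
    h = 0
    out = []
    for i, c in enumerate(seq):
        h = ((h << 2) | ENC[c]) & mask  # shift in one symbol, drop the oldest
        if i >= k - 1:
            out.append(h)
    return out


def spaced_seed_hashes(seq, seed):
    """Spaced seed hashes computed from scratch at every position."""
    care = [j for j, s in enumerate(seed) if s == "1"]  # non-wildcard offsets
    out = []
    for i in range(len(seq) - len(seed) + 1):
        h = 0
        for j in care:  # re-reads every match position of the seed
            h = (h << 2) | ENC[seq[i + j]]
        out.append(h)
    return out
```

With a seed made only of `"1"` characters the two functions return the same values; the rolling version does constant work per position, while the from-scratch version does work proportional to the number of match positions in the seed.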
https://www.readbyqxmd.com/read/29588650/outlier-detection-in-blast-hits
#5
Nidhi Shah, Stephen F Altschul, Mihai Pop
Background: An important task in a metagenomic analysis is the assignment of taxonomic labels to sequences in a sample. Most widely used methods for taxonomy assignment compare a sequence in the sample to a database of known sequences. Many approaches use the best BLAST hit(s) to assign the taxonomic label. However, it is known that the best BLAST hit may not always correspond to the best taxonomic match. An alternative approach involves phylogenetic methods, which take into account alignments and a model of evolution in order to more accurately define the taxonomic origin of sequences...
2018: Algorithms for Molecular Biology: AMB
https://www.readbyqxmd.com/read/29568323/octal-optimal-completion-of-gene-trees-in-polynomial-time
#6
Sarah Christensen, Erin K Molloy, Pranjal Vachaspati, Tandy Warnow
Background: For a combination of reasons (including data generation protocols, approaches to taxon and gene sampling, and gene birth and loss), estimated gene trees are often incomplete, meaning that they do not contain all of the species of interest. As incomplete gene trees can impact downstream analyses, accurate completion of gene trees is desirable. Results: We introduce the Optimal Tree Completion problem, a general optimization problem that involves completing an unrooted binary tree (i...
2018: Algorithms for Molecular Biology: AMB
https://www.readbyqxmd.com/read/29467815/derivative-free-neural-network-for-optimizing-the-scoring-functions-associated-with-dynamic-programming-of-pairwise-profile-alignment
#7
Kazunori D Yamada
Background: A profile-comparison method with position-specific scoring matrix (PSSM) is among the most accurate alignment methods. Currently, cosine similarity and correlation coefficients are used as scoring functions of dynamic programming to calculate similarity between PSSMs. However, it is unclear whether these functions are optimal for profile alignment methods. By definition, these functions cannot capture nonlinear relationships between profiles. Therefore, we attempted to discover a novel scoring function, which was more suitable for the profile-comparison method than existing functions, using neural networks...
2018: Algorithms for Molecular Biology: AMB
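The two existing scoring functions named in this abstract can be sketched as follows. This is a minimal illustration, assuming each PSSM column is given as a plain list of numbers; the function names are hypothetical and the neural network scoring function proposed in the paper is not reproduced here.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two profile columns (non-zero vectors assumed)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def pearson_correlation(u, v):
    """Pearson correlation: cosine similarity of the mean-centred columns."""
    mean_u = sum(u) / len(u)
    mean_v = sum(v) / len(v)
    return cosine_similarity([a - mean_u for a in u], [b - mean_v for b in v])
```

Both scores are built from inner products of the two columns, which is consistent with the abstract's remark that they cannot capture nonlinear relationships between profiles.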
https://www.readbyqxmd.com/read/29467814/fast-phylogenetic-inference-from-typing-data
#8
João A Carriço, Maxime Crochemore, Alexandre P Francisco, Solon P Pissis, Bruno Ribeiro-Gonçalves, Cátia Vaz
Background: Microbial typing methods are commonly used to study the relatedness of bacterial strains. Sequence-based typing methods are a gold standard for epidemiological surveillance due to the inherent portability of sequence and allelic profile data, fast analysis times and their capacity to create common nomenclatures for strains or clones. This has led to the development of several novel methods and to several databases being made available for many microbial species. With the mainstream use of High Throughput Sequencing, the amount of data being accumulated in these databases is huge, storing thousands of different profiles...
2018: Algorithms for Molecular Biology: AMB
https://www.readbyqxmd.com/read/29445416/a-safe-and-complete-algorithm-for-metagenomic-assembly
#9
Nidia Obscura Acosta, Veli Mäkinen, Alexandru I Tomescu
Background: Reconstructing the genome of a species from short fragments is one of the oldest bioinformatics problems. Metagenomic assembly is a variant of the problem asking to reconstruct the circular genomes of all bacterial species present in a sequencing sample. This problem can be naturally formulated as finding a collection of circular walks of a directed graph G that together cover all nodes, or edges, of G. Approach: We address this problem with the "safe and complete" framework of Tomescu and Medvedev (Research in computational Molecular biology-20th annual conference, RECOMB 9649:152-163, 2016)...
2018: Algorithms for Molecular Biology: AMB
https://www.readbyqxmd.com/read/29441122/time-consistent-reconciliation-maps-and-forbidden-time-travel
#10
Nikolai Nøjgaard, Manuela Geiß, Daniel Merkle, Peter F Stadler, Nicolas Wieseke, Marc Hellmuth
Background: In the absence of horizontal gene transfer it is possible to reconstruct the history of gene families from empirically determined orthology relations, which are equivalent to event-labeled gene trees. Knowledge of the event labels considerably simplifies the problem of reconciling a gene tree T with a species tree S, relative to the reconciliation problem without prior knowledge of the event types. It is well known that optimal reconciliations in the unlabeled case may violate time-consistency and thus are not biologically feasible...
2018: Algorithms for Molecular Biology: AMB
https://www.readbyqxmd.com/read/29387142/gene-tree-parsimony-for-incomplete-gene-trees-addressing-true-biological-loss
#11
Md Shamsuzzoha Bayzid, Tandy Warnow
Motivation: Species tree estimation from gene trees can be complicated by gene duplication and loss, and "gene tree parsimony" (GTP) is one approach for estimating species trees from multiple gene trees. In its standard formulation, the objective is to find a species tree that minimizes the total number of gene duplications and losses with respect to the input set of gene trees. Although much is known about GTP, little is known about how to treat inputs containing some incomplete gene trees (i...
2018: Algorithms for Molecular Biology: AMB
https://www.readbyqxmd.com/read/29238399/phylogeny-reconstruction-based-on-the-length-distribution-of-k-mismatch-common-substrings
#12
Burkhard Morgenstern, Svenja Schöbel, Chris-André Leimeister
Background: Various approaches to alignment-free sequence comparison are based on the length of exact or inexact word matches between pairs of input sequences. Haubold et al. (J Comput Biol 16:1487-1500, 2009) showed how the average number of substitutions per position between two DNA sequences can be estimated based on the average length of exact common substrings. Results: In this paper, we study the length distribution of k-mismatch common substrings between two sequences...
2017: Algorithms for Molecular Biology: AMB
https://www.readbyqxmd.com/read/29234460/generalized-enhanced-suffix-array-construction-in-external-memory
#13
Felipe A Louza, Guilherme P Telles, Steve Hoffmann, Cristina D A Ciferri
Background: Suffix arrays, augmented by additional data structures, allow solving efficiently many string processing problems. The external memory construction of the generalized suffix array for a string collection is a fundamental task when the size of the input collection or the data structure exceeds the available internal memory. Results: In this article we present and analyze [Formula: see text] [introduced in CPM (External memory generalized suffix and [Formula: see text] arrays construction...
2017: Algorithms for Molecular Biology: AMB
https://www.readbyqxmd.com/read/29026435/halign-ii-efficient-ultra-large-multiple-sequence-alignment-and-phylogenetic-tree-reconstruction-with-distributed-and-parallel-computing
#14
Shixiang Wan, Quan Zou
BACKGROUND: Multiple sequence alignment (MSA) plays a key role in biological sequence analyses, especially in phylogenetic tree construction. The extreme growth of next-generation sequencing has led to a shortage of efficient ultra-large biological sequence alignment approaches that can cope with different sequence types. METHODS: Distributed and parallel computing represents a crucial technique for accelerating ultra-large (e.g. files larger than 1 GB) sequence analyses...
2017: Algorithms for Molecular Biology: AMB
https://www.readbyqxmd.com/read/29021818/algorithms-for-matching-partially-labelled-sequence-graphs
#15
William R Taylor
BACKGROUND: In order to find correlated pairs of positions between proteins, which are useful in predicting interactions, it is necessary to concatenate two large multiple sequence alignments such that the sequences that are joined together belong to those that interact in their species of origin. When each protein is unique, the species name is sufficient to guide this match; however, when there are multiple related sequences (paralogs) in each species, the pairing is more difficult...
2017: Algorithms for Molecular Biology: AMB
https://www.readbyqxmd.com/read/28861118/biologically-feasible-gene-trees-reconciliation-maps-and-informative-triples
#16
Marc Hellmuth
BACKGROUND: The history of gene families, which are equivalent to event-labeled gene trees, can be reconstructed from empirically estimated evolutionary event-relations containing pairs of orthologous, paralogous or xenologous genes. The question then arises as to whether inferred event-labeled gene trees are biologically feasible, that is, if there is a possible true history that would explain a given gene tree. In practice, this problem boils down to finding a reconciliation map (also known as a DTL-scenario) between the event-labeled gene trees and a (possibly unknown) species tree...
2017: Algorithms for Molecular Biology: AMB
https://www.readbyqxmd.com/read/28852417/partially-local-three-way-alignments-and-the-sequence-signatures-of-mitochondrial-genome-rearrangements
#17
Marwa Al Arab, Matthias Bernt, Christian Höner Zu Siederdissen, Kifah Tout, Peter F Stadler
BACKGROUND: Genomic DNA frequently undergoes rearrangement of the gene order that can be localized by comparing the two DNA sequences. In mitochondrial genomes different mechanisms are likely at work, at least some of which involve the duplication of sequence around the location of the apparent breakpoints. We hypothesize that these different mechanisms of genome rearrangement leave distinctive sequence footprints. In order to study such effects it is important to locate the breakpoint positions with precision...
2017: Algorithms for Molecular Biology: AMB
https://www.readbyqxmd.com/read/28828033/a-hybrid-parameter-estimation-algorithm-for-beta-mixtures-and-applications-to-methylation-state-classification
#18
Christopher Schröder, Sven Rahmann
BACKGROUND: Mixtures of beta distributions are a flexible tool for modeling data with values on the unit interval, such as methylation levels. However, maximum likelihood parameter estimation with beta distributions suffers from problems because of singularities in the log-likelihood function if some observations take the values 0 or 1. METHODS: While ad-hoc corrections have been proposed to mitigate this problem, we propose a different approach to parameter estimation for beta mixtures where such problems do not arise in the first place...
2017: Algorithms for Molecular Biology: AMB
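The singularity mentioned in this abstract can be made concrete with a short sketch. It evaluates the log-density of a single beta distribution, log f(x; a, b) = (a - 1) log x + (b - 1) log(1 - x) - log B(a, b), and shows that it becomes infinite at x = 0 or x = 1 whenever the corresponding exponent is non-zero. The helper names are hypothetical, and this is only an illustration of the problem, not the hybrid estimation algorithm the paper proposes.

```python
import math


def _scaled_log(c, x):
    """Return c * log(x), with the limit at x == 0 made explicit."""
    if x > 0.0:
        return c * math.log(x)
    if c == 0.0:
        return 0.0  # x**0 == 1, so the term contributes nothing
    return -math.inf if c > 0.0 else math.inf


def beta_logpdf(x, a, b):
    """Log-density of Beta(a, b) at x in [0, 1]; infinite at the boundaries."""
    log_beta = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    return _scaled_log(a - 1.0, x) + _scaled_log(b - 1.0, 1.0 - x) - log_beta
```

A single observation equal to 0 therefore drives the log-likelihood to minus infinity when a > 1 and to plus infinity when a < 1, which is why maximum likelihood estimation needs special handling for such data.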
https://www.readbyqxmd.com/read/28814968/asp-based-method-for-the-enumeration-of-attractors-in-non-deterministic-synchronous-and-asynchronous-multi-valued-networks
#19
Emna Ben Abdallah, Maxime Folschette, Olivier Roux, Morgan Magnin
BACKGROUND: This paper addresses the problem of finding attractors in biological regulatory networks. We focus here on non-deterministic synchronous and asynchronous multi-valued networks, modeled using automata networks (AN). AN is a general and well-suited formalism to study complex interactions between different components (genes, proteins, ...). An attractor is a minimal trap domain, that is, a part of the state-transition graph that cannot be escaped. Such structures are terminal components of the dynamics and take the form of steady states (singleton) or complex compositions of cycles (non-singleton)...
2017: Algorithms for Molecular Biology: AMB
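The definition of an attractor given in this abstract (a minimal set of states that cannot be escaped) can be sketched on an explicit state-transition graph. The code below is a brute-force illustration using plain reachability, not the ASP-based enumeration the paper describes, and it assumes the graph is small enough to be stored as a dictionary mapping each state to its successor states. All names are hypothetical.

```python
def reachable(graph, start):
    """Set of states reachable from start, including start itself."""
    seen = {start}
    stack = [start]
    while stack:
        state = stack.pop()
        for nxt in graph.get(state, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def attractors(graph):
    """Minimal trap domains: sets whose states can all reach one another
    and from which no transition leads outside the set."""
    states = set(graph) | {t for succs in graph.values() for t in succs}
    reach = {s: reachable(graph, s) for s in states}
    found = set()
    for s in states:
        # s lies in an attractor iff everything reachable from s can return to s
        if all(s in reach[t] for t in reach[s]):
            found.add(frozenset(reach[s]))
    return found
```

For example, with transitions a -> b, b -> c, b -> d, c -> d, d -> c and an isolated state e, the function returns the cyclic attractor {c, d} and the steady state {e}. Enumerating the full state space this way grows exponentially with the number of components, which is the kind of cost that motivates symbolic approaches.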
https://www.readbyqxmd.com/read/28736575/identification-of-bifurcation-transitions-in-biological-regulatory-networks-using-answer-set-programming
#20
Louis Fippo Fitime, Olivier Roux, Carito Guziolowski, Loïc Paulevé
BACKGROUND: Numerous cellular differentiation processes can be captured using discrete qualitative models of biological regulatory networks. These models describe the temporal evolution of the state of the network subject to different competing transitions, potentially leading the system to different attractors. This paper focusses on the formal identification of states and transitions that are crucial for preserving or pre-empting the reachability of a given behaviour. METHODS: In the context of non-deterministic automata networks, we propose a static identification of so-called bifurcations, i...
2017: Algorithms for Molecular Biology: AMB