Read by QxMD icon Read


Lauren Coombe, Jessica Zhang, Benjamin P Vandervalk, Justin Chu, Shaun D Jackman, Inanc Birol, René L Warren
BACKGROUND: The long-range sequencing information captured by linked reads, such as those available from 10× Genomics (10xG), helps resolve genome sequence repeats, and yields accurate and contiguous draft genome assemblies. We introduce ARKS, an alignment-free linked read genome scaffolding methodology that uses linked reads to organize genome assemblies further into contiguous drafts. Our approach departs from other read alignment-dependent linked read scaffolders, including our own (ARCS), and uses a kmer-based mapping approach...
June 20, 2018: BMC Bioinformatics
Atif Rahman, Ingileif Hallgrímsdóttir, Michael Eisen, Lior Pachter
Genome wide association studies (GWAS) rely on microarrays, or more recently mapping of sequencing reads, to genotype individuals. The reliance on prior sequencing of a reference genome limits the scope of association studies, and also precludes mapping associations outside of the reference. We present an alignment free method for association studies of categorical phenotypes based on counting k-mers in whole-genome sequencing reads, testing for associations directly between k-mers and the trait of interest, and local assembly of the statistically significant k-mers to identify sequence differences...
June 13, 2018: ELife
Eric Augusto Ito, Isaque Katahira, Fábio Fernandes da Rocha Vicente, Luiz Filipe Protasio Pereira, Fabrício Martins Lopes
With the emergence of Next Generation Sequencing (NGS) technologies, a large volume of sequence data, in particular from de novo sequencing, was rapidly produced at relatively low cost. In this context, computational tools are increasingly important to assist in the identification of relevant information to understand the functioning of organisms. This work introduces BASiNET, an alignment-free tool for classifying biological sequences based on feature extraction from complex network measurements. The method initially transforms the sequences and represents them as complex networks...
June 5, 2018: Nucleic Acids Research
Rosemarie Weikard, Frieder Hadlich, Harald M Hammon, Doerte Frieten, Caroline Gerbert, Christian Koch, Georg Dusel, Christa Kuehn
Long noncoding RNAs (lncRNAs) have emerged as an important regulatory component of mechanisms involved in gene expression, chromatin modification and epigenetic processes, but they are rarely annotated in the bovine genome. Our study monitored the jejunum transcriptome of German Holstein calves fed two different milk diets using transcriptome sequencing (RNA-seq). To identify potential lncRNAs within the pool of unknown transcripts, four bioinformatic lncRNA prediction tools were applied. The intersection of the alignment-free lncRNA prediction tools (CNCI, PLEK and FEELnc) predicted 1,812 lncRNA transcripts concordantly comprising a catalogue of 1,042 putative lncRNA loci expressed in the calves' intestinal mucosa...
April 20, 2018: Oncotarget
Alex Di Genova, Gonzalo A Ruz, Marie-France Sagot, Alejandro Maass
Background: Long read sequencing technologies are the ultimate solution for genome repeats, allowing near reference level reconstructions of large genomes. However, long read de novo assembly pipelines are computationally intense and require a considerable amount of coverage, thereby hindering their broad application to the assembly of large genomes. Alternatively, hybrid assembly methods which combine short and long read sequencing technologies can reduce the time and cost required to produce de novo assemblies of large genomes...
May 5, 2018: GigaScience
Gleb Filatov, Bruno Bauwens, Attila Kertész-Farkas
Motivation: Bioinformatics studies often rely on similarity measures between sequence pairs, which often pose a bottleneck in large-scale sequence analysis. Results: Here, we present a new convolutional kernel function for protein sequences called the LZW-Kernel. It is based on code words identified with the Lempel-Ziv-Welch (LZW) universal text compressor. The LZW-Kernel is an alignment-free method, it is always symmetric, is positive, always provides 1.0 for self-similarity and it can directly be used with Support Vector Machines (SVMs) in classification problems, contrary to normalized compression distance (NCD), which often violates the distance metric properties in practice and requires further techniques to be used with SVMs...
May 7, 2018: Bioinformatics
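The abstract above describes a kernel built on code words found by the Lempel-Ziv-Welch (LZW) compressor. As a rough illustration only, the sketch below collects the code words an LZW pass adds to its dictionary and compares two sequences by the Jaccard overlap of those word sets. The function names and the Jaccard scoring are assumptions made for this example and are not the LZW-Kernel formula from the paper; the sketch merely shares the stated properties of being symmetric and returning 1.0 for self-similarity.

```python
def lzw_code_words(seq):
    """Return the multi-symbol code words an LZW pass adds to its dictionary."""
    dictionary = set(seq)      # start with every single symbol
    words = set()
    current = ""
    for ch in seq:
        candidate = current + ch
        if candidate in dictionary:
            current = candidate        # keep extending the current match
        else:
            dictionary.add(candidate)  # new code word found
            words.add(candidate)
            current = ch               # restart from the last symbol
    return words


def code_word_similarity(a, b):
    """Hypothetical similarity: Jaccard overlap of LZW code words (not the paper's kernel)."""
    wa, wb = lzw_code_words(a), lzw_code_words(b)
    if not wa and not wb:
        return 1.0                     # two sequences with no code words are treated as identical
    return len(wa & wb) / len(wa | wb)
```

A call such as `code_word_similarity("MKVLAAGIVGL", "MKVLSAGIVAL")` gives a score between 0 and 1 without aligning the two sequences.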
Deborah Galpert, Alberto Fernández, Francisco Herrera, Agostinho Antunes, Reinaldo Molina-Ruiz, Guillermin Agüero-Chapin
BACKGROUND: The development of new ortholog detection algorithms and the improvement of existing ones are of major importance in functional genomics. We have previously introduced a successful supervised pairwise ortholog classification approach implemented in a big data platform that considered several pairwise protein features and the low ortholog pair ratios found between two annotated proteomes (Galpert, D et al., BioMed Research International, 2015). The supervised models were built and tested using a Saccharomycete yeast benchmark dataset proposed by Salichos and Rokas (2011)...
May 3, 2018: BMC Bioinformatics
Jie Lin, Jing Wei, Donald Adjeroh, Bing-Hua Jiang, Yue Jiang
BACKGROUND: Alignment-free sequence similarity analysis methods often lead to significant savings in computational time over alignment-based counterparts. RESULTS: A new alignment-free sequence similarity analysis method, called SSAW is proposed. SSAW stands for Sequence Similarity Analysis using the Stationary Discrete Wavelet Transform (SDWT). It extracts k-mers from a sequence, then maps each k-mer to a complex number field. Then, the series of complex numbers formed are transformed into feature vectors using the stationary discrete wavelet transform...
May 2, 2018: BMC Bioinformatics
Benjamin T James, Brian B Luczak, Hani Z Girgis
Sequence clustering is a fundamental step in analyzing DNA sequences. Widely-used software tools for sequence clustering utilize greedy approaches that are not guaranteed to produce the best results. These tools are sensitive to one parameter that determines the similarity among sequences in a cluster. Often times, a biologist may not know the exact sequence similarity. Therefore, clusters produced by these tools do not likely match the real clusters comprising the data if the provided parameter is inaccurate...
May 1, 2018: Nucleic Acids Research
Makio Yokono, Soichirou Satoh, Ayumi Tanaka
Phylogenies based on entire genomes are a powerful tool for reconstructing the Tree of Life. Several methods have been proposed, most of which employ an alignment-free strategy. Average sequence similarity methods are different than most other whole-genome methods, because they are based on local alignments. However, previous average similarity methods fail to reconstruct a correct phylogeny when compared against other whole-genome trees. In this study, we developed a novel average sequence similarity method...
May 1, 2018: Scientific Reports
Jarosław Sotor, Tadeusz Martynkien, Peter G Schunemann, Paweł Mergo, Lucile Rutkowski, Grzegorz Soboń
We report the first fully fiberized difference frequency generation (DFG) source, delivering a broadly tunable idler in the 6 to 9 μm spectral range, using orientation-patterned gallium phosphide (OP-GaP) crystals with different quasi-phase matching periods (QPM). The mid-infrared radiation (MIR) is obtained via mixing of the output of a graphene-based Er-doped fiber laser at 1.55 μm with coherent frequency-shifted solitons at 1.9 μm generated in a highly nonlinear fiber using the same seed. The presented setup is the first truly all-fiber, all-polarization maintaining, alignment-free DFG source reported so far...
April 30, 2018: Optics Express
Kujin Tang, Yang Young Lu, Fengzhu Sun
Horizontal gene transfer (HGT) plays an important role in the evolution of microbial organisms including bacteria. Alignment-free methods based on single genome compositional information have been used to detect HGT. Currently, Manhattan and Euclidean distances based on tetranucleotide frequencies are the most commonly used alignment-free dissimilarity measures to detect HGT. By testing on simulated bacterial sequences and real data sets with known horizontal transferred genomic regions, we found that more advanced alignment-free dissimilarity measures such as CVTree and [Formula: see text] that take into account the background Markov sequences can solve HGT detection problems with significantly improved performance...
2018: Frontiers in Microbiology
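The abstract above names Manhattan and Euclidean distances on tetranucleotide frequencies as the most commonly used alignment-free dissimilarity measures for HGT detection. A minimal sketch of those two baseline measures follows; the function names are illustrative assumptions, windows containing characters other than A, C, G, T are simply skipped, and the more advanced Markov-background measures mentioned in the abstract are not covered.

```python
from itertools import product
from math import sqrt

# All 256 tetranucleotides in a fixed order, so frequency vectors are comparable.
TETRAMERS = ["".join(p) for p in product("ACGT", repeat=4)]


def tetranucleotide_frequencies(seq):
    """Return the 256-dimensional relative frequency vector of overlapping 4-mers."""
    seq = seq.upper()
    counts = dict.fromkeys(TETRAMERS, 0)
    total = 0
    for i in range(len(seq) - 3):
        kmer = seq[i:i + 4]
        if kmer in counts:             # skip windows with N or other symbols
            counts[kmer] += 1
            total += 1
    if total == 0:
        return [0.0] * len(TETRAMERS)
    return [counts[k] / total for k in TETRAMERS]


def manhattan(p, q):
    """Sum of absolute differences between two frequency vectors."""
    return sum(abs(a - b) for a, b in zip(p, q))


def euclidean(p, q):
    """Square root of the summed squared differences."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))
```

In an HGT screen of this kind, one would typically compare the frequency vector of a sliding genomic window against that of the whole genome and flag windows whose distance is unusually large; the window size and threshold are left open here.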
Saeedeh Rahimi Farahani, Mahmoud Reza Sohrabi, Jahan B Ghasemi
In the present study, a very thorough and in-depth three-dimensional quantitative structure-toxicity relationship (3D-QSTR) analysis has been implemented to make a correlation between the structural information of the ionic liquids (ILs) and their cytotoxicity towards Leukemia rat cell line IPC-81, as one of the ILs' toxicological consequences. To do this, alignment free GRid-INdependent Descriptors (GRINDs), which were derived from molecular interaction fields (MIFs), were correlated to the cytotoxicity values by partial least squares (PLS) and support vector regression (SVR)...
August 30, 2018: Ecotoxicology and Environmental Safety
Chirag Jain, Alexander Dilthey, Sergey Koren, Srinivas Aluru, Adam M Phillippy
Emerging single-molecule sequencing technologies from Pacific Biosciences and Oxford Nanopore have revived interest in long-read mapping algorithms. Alignment-based seed-and-extend methods demonstrate good accuracy, but face limited scalability, while faster alignment-free methods typically trade decreased precision for efficiency. In this article, we combine a fast approximate read mapping algorithm based on minimizers with a novel MinHash identity estimation technique to achieve both scalability and precision...
April 30, 2018: Journal of Computational Biology: a Journal of Computational Molecular Cell Biology
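The abstract above combines minimizer-based mapping with a MinHash identity estimate. The sketch below shows only the generic MinHash idea, using a bottom-s sketch of hashed k-mers to estimate the Jaccard similarity of two sequences. It is an illustration under assumptions, not the authors' algorithm: the function names, the BLAKE2b hash, and the parameters `k` and `s` are chosen for this example, and the minimizer sampling step is omitted.

```python
import hashlib


def kmer_hashes(seq, k):
    """Hash every overlapping k-mer to a 64-bit integer (deterministic across runs)."""
    return {
        int.from_bytes(
            hashlib.blake2b(seq[i:i + k].encode(), digest_size=8).digest(), "big"
        )
        for i in range(len(seq) - k + 1)
    }


def bottom_sketch(seq, k, s):
    """Keep the s smallest k-mer hashes as the sequence's sketch."""
    return sorted(kmer_hashes(seq, k))[:s]


def jaccard_estimate(sketch_a, sketch_b, s):
    """Estimate Jaccard similarity from two bottom-s sketches."""
    union = sorted(set(sketch_a) | set(sketch_b))[:s]  # bottom-s of the merged sketches
    if not union:
        return 0.0
    shared = set(sketch_a) & set(sketch_b)
    return sum(1 for h in union if h in shared) / len(union)
```

Because each sketch holds at most `s` hashes, comparing two sequences costs time proportional to `s` rather than to their lengths, which is the source of the scalability the abstract refers to.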
Daniele Raimondi, Gabriele Orlando, Yves Moreau, Wim F Vranken
Motivation: Evolutionary information is crucial for the annotation of proteins in bioinformatics. The amount of retrieved homologs often correlates with the quality of predicted protein annotations related to structure or function. With a growing amount of sequences available, fast and reliable methods for homology detection are essential, as they have a direct impact on predicted protein annotations. Results: We developed a discriminative, alignment-free algorithm for homology detection with quasi-linear complexity, enabling theoretically much faster homology searches...
April 19, 2018: Bioinformatics
Prachi Mehrotra, Vimla Kany G Ami, Narayanaswamy Srinivasan
The overall function of a multi-domain protein is determined by the functional and structural interplay of its constituent domains. Traditional sequence alignment-based methods commonly utilize domain-level information and provide classification only at the level of domains. Such methods are not capable of taking into account the contributions of other domains in the proteins, and domain-linker regions and classify multi-domain proteins. An alignment-free protein sequence comparison tool, CLAP (CLAssification of Proteins) was previously developed in our laboratory to especially handle multi-domain protein sequences without a requirement of defining domain boundaries and sequential order of domains...
April 20, 2018: Proteins
Hervé Seligmann
Genetic codes mainly evolve by reassigning punctuation codons, starts and stops. Previous analyses assuming that undefined amino acids translate stops showed greater divergence between nuclear and mitochondrial genetic codes. Here, three independent methods converge on which amino acids translated stops at split between nuclear and mitochondrial genetic codes: (a) alignment-free genetic code comparisons inserting different amino acids at stops; (b) alignment-based blast analyses of hypothetical peptides translated from non-coding mitochondrial sequences, inserting different amino acids at stops; (c) biases in amino acid insertions at stops in proteomic data...
May 2018: Bio Systems
Zexiao Li, Xianlei Liu, Fengzhou Fang, Xiaodong Zhang, Zhen Zeng, Linlin Zhu, Ning Yan
Multi-reflective imaging systems find wide applications in optical imaging and space detection. However, such systems face difficulties in adjusting the freeform mirrors with high accuracy to guarantee the optical function. Motivated by this, an alignment-free manufacture approach is proposed to machine the optical system. The direct optical performance-guided manufacture route is established without measuring the form error of freeform optics. An analytical model is established to investigate the effects of machine errors to serve the error identification and compensation in machining...
March 19, 2018: Optics Express
Jayanta Kumar Das, Pabitra Pal Choudhury, Neelambuj Chaturvedi, Mohd Tayyab, Sk Sarif Hassan
This article introduces an alignment-free clustering method in order to cluster all the 66 DORs sequentially diverse protein sequences. Two different methods are discussed: one is utilizing twenty standard amino acids (without grouping) and another one is using chemical grouping of amino acids (with grouping). Two grayscale images (representing two protein sequences by order pair frequency matrices) are compared to find the similarity index using morphology technique. We could achieve the correlation coefficients of 0...
March 13, 2018: Genomics
Xiaoyu Yu, Oleg N Reva
Modern phylogenetic studies may benefit from the analysis of complete genome sequences of various microorganisms. Evolutionary inferences based on genome-scale analysis are believed to be more accurate than the gene-based alternative. However, the computational complexity of current phylogenomic procedures, inappropriateness of standard phylogenetic tools to process genome-wide data, and lack of reliable substitution models that correlate with alignment-free phylogenomic approaches deter microbiologists from using these opportunities...
2018: Evolutionary Bioinformatics Online
