Alignment-free

https://www.readbyqxmd.com/read/28617225/a-greedy-alignment-free-distance-estimator-for-phylogenetic-inference
#1
Sharma V Thankachan, Sriram P Chockalingam, Yongchao Liu, Ambujam Krishnan, Srinivas Aluru
BACKGROUND: Alignment-free sequence comparison approaches have been garnering increasing interest in various data- and compute-intensive applications such as phylogenetic inference for large-scale sequences. While k-mer based methods are predominantly used in real applications, the average common substring (ACS) approach is emerging as one of the prominent alignment-free approaches. This ACS approach has been further generalized by some recent work, either greedily or exactly, by allowing a bounded number of mismatches in the common substrings...
June 7, 2017: BMC Bioinformatics
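To make the average common substring (ACS) idea in the abstract above concrete, here is a minimal sketch. It assumes the usual definition: for each position in one sequence, take the length of the longest substring starting there that also occurs somewhere in the other sequence, then average over all positions. The function names are hypothetical. The quadratic-time scan is only for illustration and is not the greedy or exact mismatch-tolerant estimator the paper describes.

```python
def longest_match_at(a, i, b):
    """Length of the longest prefix of a[i:] that occurs somewhere in b."""
    length = 0
    # Extend the candidate substring one character at a time while it still occurs in b.
    while i + length < len(a) and a[i:i + length + 1] in b:
        length += 1
    return length


def average_common_substring(a, b):
    """Mean, over all start positions in a, of the longest match against b (exact matches only)."""
    if not a:
        raise ValueError("sequence a must be non-empty")
    return sum(longest_match_at(a, i, b) for i in range(len(a))) / len(a)


if __name__ == "__main__":
    print(average_common_substring("ACGT", "CGTA"))
```

A larger value suggests the two sequences share longer exact substrings. Practical implementations would typically use suffix-based index structures rather than repeated substring searches.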
https://www.readbyqxmd.com/read/28610557/integration-of-quantitated-expression-estimates-from-polya-selected-and-rrna-depleted-rna-seq-libraries
#2
Stephen J Bush, Mary E B McCulloch, Kim M Summers, David A Hume, Emily L Clark
BACKGROUND: The availability of fast alignment-free algorithms has greatly reduced the computational burden of RNA-seq processing, especially for relatively poorly assembled genomes. Using these approaches, previous RNA-seq datasets could potentially be processed and integrated with newly sequenced libraries. Confounding factors in such integration include sequencing depth and methods of RNA extraction and selection. Different selection methods (typically, either polyA-selection or rRNA-depletion) omit different RNAs, resulting in different fractions of the transcriptome being sequenced...
June 13, 2017: BMC Bioinformatics
https://www.readbyqxmd.com/read/28605460/modmaps3d-an-interactive-webtool-for-the-quantification-and-3d-visualization-of-interrelationships-in-a-dataset-of-dna-sequences
#3
Rallis Karamichalis, Lila Kari
Summary: MoDMaps3D (Molecular Distance Maps 3D) is an alignment-free, fast, computationally lightweight webtool for computing and visualizing the interrelationships within any dataset of DNA sequences, based on pairwise comparisons between their oligomer compositions. MoDMaps3D is a general-purpose interactive webtool that is free of any requirements on sequence composition, position of the sequences in their respective genomes, presence or absence of similarity or homology, sequence length, or even sequence origin (biological or computer-generated)...
June 10, 2017: Bioinformatics
https://www.readbyqxmd.com/read/28587743/a-novel-alignment-free-vector-method-to-cluster-protein-sequences
#4
Lily He, Yongkun Li, Rong Lucy He, Stephen S-T Yau
Protein classification is a crucial topic in biology. The number of protein sequences stored in databases has increased sharply in the past decade. Traditionally, comparison of protein sequences is carried out through multiple sequence alignment methods. However, these methods may be unsuitable for clustering protein sequences when gene rearrangements occur, such as in viral genomes. The computation is also very time-consuming for large datasets with long genomes. In this paper, based on three important biochemical properties of amino acids (the hydropathy index, polar requirement and chemical composition of the side chain), we propose a 24-dimensional feature vector describing the composition of amino acids in protein sequences...
June 3, 2017: Journal of Theoretical Biology
https://www.readbyqxmd.com/read/28566690/fastgt-an-alignment-free-method-for-calling-common-snvs-directly-from-raw-sequencing-reads
#5
Fanny-Dhelia Pajuste, Lauris Kaplinski, Märt Möls, Tarmo Puurand, Maarja Lepamets, Maido Remm
We have developed a computational method that counts the frequencies of unique k-mers in FASTQ-formatted genome data and uses this information to infer the genotypes of known variants. FastGT can detect the variants in a 30x genome in less than 1 hour using ordinary low-cost server hardware. The overall concordance with the genotypes of two Illumina "Platinum" genomes is 99.96%, and the concordance with the genotypes of the Illumina HumanOmniExpress is 99.82%. Our method provides a k-mer database that can be used for the simultaneous genotyping of approximately 30 million single nucleotide variants (SNVs), including >23,000 SNVs from the Y chromosome...
May 31, 2017: Scientific Reports
https://www.readbyqxmd.com/read/28541376/dna-sequence-shape-kernel-enables-alignment-free-modeling-of-transcription-factor-binding
#6
Wenxiu Ma, Lin Yang, Remo Rohs, William Stafford Noble
Motivation: Transcription factors (TFs) bind to specific DNA sequence motifs. Several lines of evidence suggest that TF-DNA binding is mediated in part by properties of the local DNA shape: the width of the minor groove, the relative orientations of adjacent base pairs, etc. Several methods have been developed to jointly account for DNA sequence and shape properties in predicting TF binding affinity. However, a limitation of these methods is that they typically require a training set of aligned TF binding sites...
May 24, 2017: Bioinformatics
https://www.readbyqxmd.com/read/28476562/a-novel-alignment-free-method-to-classify-protein-folding-types-by-combining-spectral-graph-clustering-with-chou-s-pseudo-amino-acid-composition
#7
Pooja Tripathi, Paras N Pandey
The present work employs pseudo amino acid composition (PseAAC) to encode protein sequences in numeric form. These encodings are then arranged in a similarity matrix, which serves as input for a spectral graph clustering method. Spectral methods have previously been used for clustering protein sequences, but they use pairwise alignment scores of protein sequences in the similarity matrix. The alignment score depends on the length of the sequences, so clustering short and long sequences together may not be a good idea...
May 3, 2017: Journal of Theoretical Biology
https://www.readbyqxmd.com/read/28472388/cafe-accelerated-alignment-free-sequence-analysis
#8
Yang Young Lu, Kujin Tang, Jie Ren, Jed A Fuhrman, Michael S Waterman, Fengzhu Sun
Alignment-free genome and metagenome comparisons are increasingly important with the development of next generation sequencing (NGS) technologies. Recently developed state-of-the-art k-mer based alignment-free dissimilarity measures including CVTree, $d_2^*$ and $d_2^S$ are more computationally expensive than measures based solely on the k-mer frequencies. Here, we report a standalone software, aCcelerated Alignment-FrEe sequence analysis (CAFE), for efficient calculation of 28 alignment-free dissimilarity measures...
May 3, 2017: Nucleic Acids Research
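As an illustration of what a "measure based solely on the k-mer frequencies" can look like, the sketch below computes the Euclidean distance between two normalized k-mer frequency vectors. This is a generic example chosen for simplicity. Whether CAFE includes this exact measure is not stated in the abstract, and the background-corrected measures it names ($d_2^*$, $d_2^S$) are not implemented here. All function names are hypothetical.

```python
from collections import Counter
from math import sqrt


def kmer_frequencies(seq, k):
    """Relative frequency of every overlapping k-mer in seq."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()}


def euclidean_kmer_distance(a, b, k):
    """Euclidean distance between the k-mer frequency vectors of a and b."""
    fa, fb = kmer_frequencies(a, k), kmer_frequencies(b, k)
    # k-mers missing from one sequence contribute a frequency of 0.0.
    return sqrt(sum((fa.get(w, 0.0) - fb.get(w, 0.0)) ** 2 for w in set(fa) | set(fb)))


if __name__ == "__main__":
    print(euclidean_kmer_distance("ACGTACGT", "ACGTTTTT", 2))
```

Identical sequences give a distance of zero, and sequences with no k-mers in common give the largest values.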
https://www.readbyqxmd.com/read/28472320/chimerscope-a-novel-alignment-free-algorithm-for-fusion-transcript-prediction-using-paired-end-rna-seq-data
#9
You Li, Tayla B Heavican, Neetha N Vellichirammal, Javeed Iqbal, Chittibabu Guda
RNA-Seq technology has revolutionized transcriptome characterization not only by accurately quantifying gene expression, but also by identifying novel transcripts such as chimeric fusion transcripts. 'Fusion' or 'chimeric' transcripts have improved the diagnosis and prognosis of several tumors, and have led to the development of novel therapeutic regimens. Fusion transcript detection is currently accomplished by several software packages, primarily relying on sequence alignment algorithms...
May 2, 2017: Nucleic Acids Research
https://www.readbyqxmd.com/read/28430779/a-coevolution-analysis-for-identifying-protein-protein-interactions-by-fourier-transform
#10
Changchuan Yin, Stephen S-T Yau
Protein-protein interactions (PPIs) play key roles in life processes such as signal transduction, transcription regulation, and immune response. Identification of PPIs enables better understanding of the functional networks within a cell. Common experimental methods for identifying PPIs are time consuming and expensive. However, recent developments in computational approaches for inferring PPIs from protein sequences based on coevolution theory avoid these problems. In the coevolution theory model, interacting proteins may show coevolutionary mutations and have similar phylogenetic trees...
2017: PloS One
https://www.readbyqxmd.com/read/28422050/k-mer-content-correlation-and-position-analysis-of-genome-dna-sequences-for-the-identification-of-function-and-evolutionary-features
#11
Aaron Sievers, Katharina Bosiek, Marc Bisch, Chris Dreessen, Jascha Riedel, Patrick Froß, Michael Hausmann, Georg Hildenbrand
In genome analysis, k-mer-based comparison methods have become standard tools. However, even though they are able to deliver reliable results, other algorithms seem to work better in some cases. To improve k-mer-based DNA sequence analysis and comparison, we successfully checked whether adding positional resolution is beneficial for finding and/or comparing interesting organizational structures. A simple but efficient algorithm for extracting and saving local k-mer spectra (frequency distribution of k-mers) was developed and used...
April 19, 2017: Genes
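The notion of local k-mer spectra with positional resolution, mentioned in the abstract above, can be sketched as k-mer counts taken over sliding windows. The window and step parameters and the function name are assumptions made for illustration. The abstract does not say how the authors partition the sequence.

```python
from collections import Counter


def local_kmer_spectra(seq, k, window, step):
    """Return (start, k-mer counts) for each full window of the given size along seq."""
    spectra = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        # Count every overlapping k-mer inside this window only.
        counts = Counter(chunk[i:i + k] for i in range(window - k + 1))
        spectra.append((start, counts))
    return spectra


if __name__ == "__main__":
    for start, counts in local_kmer_spectra("AAAACCCCAAAA", k=2, window=4, step=4):
        print(start, dict(counts))
```

Comparing the spectra of neighbouring windows is one simple way to see where the k-mer composition changes along a sequence.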
https://www.readbyqxmd.com/read/28369524/ngscheckmate-software-for-validating-sample-identity-in-next-generation-sequencing-studies-within-and-across-data-types
#12
Sejoon Lee, Soohyun Lee, Scott Ouellette, Woong-Yang Park, Eunjung A Lee, Peter J Park
In many next-generation sequencing (NGS) studies, multiple samples or data types are profiled for each individual. An important quality control (QC) step in these studies is to ensure that datasets from the same subject are properly paired. Given the heterogeneity of data types, file types and sequencing depths in a multi-dimensional study, a robust program that provides a standardized metric for genotype comparisons would be useful. Here, we describe NGSCheckMate, a user-friendly software package for verifying sample identities from FASTQ, BAM or VCF files...
March 23, 2017: Nucleic Acids Research
https://www.readbyqxmd.com/read/28369270/dltree-efficient-and-accurate-phylogeny-reconstruction-using-the-dynamical-language-method
#13
Qi Wu, Zu-Guo Yu, Jianyi Yang
Summary: A number of alignment-free methods have been proposed for phylogeny reconstruction over the past two decades. But there are some long-standing challenges in these methods, including the requirement of huge computer memory and CPU time, and the existence of duplicate computations. In this article, we address these challenges with the ideas of compressed vectors, fingerprints and scalable memory management. With these ideas we developed the DLTree algorithm for efficient implementation of the dynamical language model and whole genome-based phylogenetic analysis...
March 29, 2017: Bioinformatics
https://www.readbyqxmd.com/read/28362738/self-eclipsing-alignment-free-vortex-coronagraphy
#14
Artur Aleksanyan, Etienne Brasselet
We report on a self-induced strategy to achieve high-contrast optical imaging, without the need for any man-made optical masks, which relies on the self-induced spin-to-orbital angular momentum conversion phenomenon. This is experimentally demonstrated by realizing a laboratory demonstration of self-eclipsing of a light source following the generation of a self-adapted vectorial optical vortex transmission mask. The proposed concept, namely the realization of an alignment-free optical vortex coronagraph, may inspire the development of future generations of smart astronomical imaging instruments...
April 1, 2017: Optics Letters
https://www.readbyqxmd.com/read/28350835/an-information-based-network-approach-for-protein-classification
#15
Xiaogeng Wan, Xin Zhao, Stephen S T Yau
Protein classification is one of the critical problems in bioinformatics. Early studies used geometric distances and polygenetic-tree to classify proteins. These methods use binary trees to present protein classification. In this paper, we propose a new protein classification method, whereby theories of information and networks are used to classify the multivariate relationships of proteins. In this study, the protein universe is modeled as an undirected network, where proteins are classified according to their connections...
2017: PloS One
https://www.readbyqxmd.com/read/28345370/an-accurate-and-fast-alignment-free-method-for-profiling-microbial-communities
#16
Diem-Trang Pham, Shanshan Gao, Vinhthuy Phan
Determining abundances of microbial genomes in metagenomic samples is an important problem in analyzing metagenomic data. Although homology-based methods are popular, they have been shown to be computationally expensive due to the alignment of tens of millions of reads from metagenomic samples to reference genomes of hundreds to thousands of environmental microbial species. We introduce an efficient alignment-free approach to estimate abundances of microbial genomes in metagenomic samples. The approach is based on solving linear and quadratic programs, which are represented by genome-specific markers (GSM)...
March 7, 2017: Journal of Bioinformatics and Computational Biology
https://www.readbyqxmd.com/read/28289437/best-hits-of-11110110111-model-free-selection-and-parameter-free-sensitivity-calculation-of-spaced-seeds
#17
Laurent Noé
BACKGROUND: Spaced seeds, also named gapped q-grams, gapped k-mers, or spaced q-grams, have been proven to be more sensitive than contiguous seeds (contiguous q-grams, contiguous k-mers) in nucleic and amino-acid sequence analysis. Initially proposed to detect sequence similarities and to anchor sequence alignments, spaced seeds have more recently been applied in several alignment-free related methods. Unfortunately, spaced seeds need to be initially designed. This task is known to be time-consuming due to the number of spaced seed candidates...
2017: Algorithms for Molecular Biology: AMB
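For readers unfamiliar with spaced seeds, the sketch below shows how a binary pattern such as 11110110111 (the seed named in this entry's URL) is applied to a sequence: positions marked 1 are kept and positions marked 0 are ignored. The function name is hypothetical, and the code only extracts the gapped words. It does not perform the seed selection or sensitivity calculation the paper addresses.

```python
def spaced_words(seq, mask):
    """Extract the gapped word at every offset of seq, keeping only positions where mask is '1'."""
    keep = [i for i, flag in enumerate(mask) if flag == "1"]
    span = len(mask)
    return [
        "".join(seq[start + i] for i in keep)
        for start in range(len(seq) - span + 1)
    ]


if __name__ == "__main__":
    # Every word has 9 characters because the 11-position mask contains 9 ones.
    print(spaced_words("ACGTACGTACGTA", "11110110111"))
```

Two sequences that differ only at the ignored positions yield the same gapped word, which is why spaced seeds tolerate some mismatches where contiguous k-mers do not.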
https://www.readbyqxmd.com/read/28245816/string-kernels-for-protein-sequence-comparisons-improved-fold-recognition
#18
Saghi Nojoomi, Patrice Koehl
BACKGROUND: The amino acid sequence of a protein is the blueprint from which its structure and ultimately function can be derived. Therefore, sequence comparison methods remain essential for the determination of similarity between proteins. Traditional approaches for comparing two protein sequences begin with strings of letters (amino acids) that represent the sequences, before generating textual alignments between these strings and providing scores for each alignment. When the similitude between the two protein sequences to be compared is low however, the quality of the corresponding sequence alignment is usually poor, leading to poor performance for the recognition of similarity...
February 28, 2017: BMC Bioinformatics
https://www.readbyqxmd.com/read/28187069/virtual-modeling-of-postoperative-alignment-following-adult-spinal-deformity-asd-surgery-helps-predict-associations-between-compensatory-spinopelvic-alignment-changes-overcorrection-and-proximal-junctional-kyphosis-pjk
#19
Renaud Lafage, Shay Bess, Steve Glassman, Christopher Ames, Doug Burton, Robert Hart, Han Jo Kim, Eric Klineberg, Jensen Henry, Breton Line, Justin Scheer, Themistocles Protopsaltis, Frank Schwab, Virginie Lafage
STUDY DESIGN: Retrospective review of a prospective multicenter database. OBJECTIVE: To develop a method to analyze sagittal alignment, free of PJK's influence, and then compare PJK to non-PJK patients using this method. SUMMARY OF BACKGROUND DATA: Proximal Junctional Kyphosis (PJK) following Adult Spinal Deformity (ASD) surgery remains problematic as it alters sagittal alignment. This study proposes a novel virtual modeling technique that attempts to eliminate the confounding effects of PJK on postoperative spinal alignment...
February 9, 2017: Spine
https://www.readbyqxmd.com/read/28158113/light-splitting-with-imperfect-wave-plates
#20
Jarom S Jackson, James L Archibald, Dallin S Durfee
We discuss the use of wave plates with arbitrary retardances, in conjunction with a linear polarizer, to split linearly polarized light into two linearly polarized beams with an arbitrary splitting fraction. We show that for non-ideal wave plates, a much broader range of splitting ratios is typically possible when a pair of wave plates, rather than a single wave plate, is used. We discuss the maximum range of splitting fractions possible with one or two wave plates as a function of the wave plate retardances, and how to align the wave plates to achieve the maximum splitting range possible when simply rotating one of the wave plates while keeping the other one fixed...
February 1, 2017: Applied Optics