High efficiency referential genome compression algorithm.
Bioinformatics 2018 November 9
Motivation: With the development and growing adoption of next-generation sequencing (NGS) technologies, genome sequencing has become faster and cheaper, producing a massive amount of genome sequence data that continues to grow at an explosive rate. The time and cost of transmitting, storing, processing and analyzing these data have become bottlenecks that hinder the development of genetics and biomedicine. Many general-purpose data compression algorithms exist, but they are not effective for genome sequences because they do not exploit the inherent characteristics of genome sequence data. Developing a fast and efficient compression algorithm specific to genome data is therefore an important and pressing issue.
Results: We have developed a referential lossless genome data compression algorithm that performs better than previous algorithms. A carefully designed mechanism for selecting the matching strategy combines the advantages of local matching and global matching to describe matched sub-strings more efficiently. The effects of the length and the position of matched sub-strings on compression efficiency are considered jointly. The proposed algorithm can compress the FASTA data of complete human genomes, each of which is about 3G, in about 18 minutes. The compressed files range from dozens of megabytes to about two hundred megabytes. The average compression ratio is higher than that of state-of-the-art genome compression algorithms, and the time complexity is of the same order as that of the best-known algorithms.
Availability: https://github.com/jhchen5/SCCG.
Supplementary information: Supplementary data are available at Bioinformatics online.
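The Results above describe referential compression only at a high level, so the following minimal sketch illustrates the general idea: a target sequence is encoded as references to matching sub-strings in a reference sequence plus literal characters for the parts that do not match. It is not the SCCG implementation and does not reproduce its local/global matching strategy selection. The greedy longest-match rule, the k-mer length `k`, the function names, and the `("M", position, length)` / `("L", text)` operation format are all assumptions made for illustration.

```python
from collections import defaultdict


def build_index(reference, k):
    """Map every k-mer in the reference to the positions where it starts."""
    index = defaultdict(list)
    for i in range(len(reference) - k + 1):
        index[reference[i:i + k]].append(i)
    return index


def compress(target, reference, k=4):
    """Greedy referential encoding (illustrative, not the SCCG method).

    Emits ("M", ref_pos, length) for a match against the reference and
    ("L", text) for characters that could not be matched.
    """
    index = build_index(reference, k)
    ops, literal, i = [], [], 0
    while i < len(target):
        best_pos, best_len = -1, 0
        # Try every reference position that shares the next k-mer.
        for pos in index.get(target[i:i + k], ()):
            length = k
            while (i + length < len(target)
                   and pos + length < len(reference)
                   and target[i + length] == reference[pos + length]):
                length += 1
            if length > best_len:
                best_pos, best_len = pos, length
        if best_len >= k:
            if literal:  # flush pending unmatched characters first
                ops.append(("L", "".join(literal)))
                literal = []
            ops.append(("M", best_pos, best_len))
            i += best_len
        else:
            literal.append(target[i])
            i += 1
    if literal:
        ops.append(("L", "".join(literal)))
    return ops


def decompress(ops, reference):
    """Rebuild the target from the operations and the same reference."""
    out = []
    for op in ops:
        if op[0] == "M":
            _, pos, length = op
            out.append(reference[pos:pos + length])
        else:
            out.append(op[1])
    return "".join(out)


if __name__ == "__main__":
    reference = "ACGTACGTTAGCCGATAGGCTA"
    target = "ACGTACGTTAGCCGTTAGGCTA"  # one substitution relative to reference
    ops = compress(target, reference)
    print(ops)
    print(decompress(ops, reference) == target)
```

Because two genomes of the same species are highly similar, most of the target tends to be covered by a small number of long matches, which is why the length and position of matched sub-strings matter for the size of the encoded output. A practical tool would also need a compact binary encoding of the operations and an index that scales to genome-sized inputs, neither of which is shown here.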