PGen: large-scale genomic variations analysis workflow and browser in SoyKB.

Yang Liu, Saad M Khan, Juexin Wang, Mats Rynge, Yuanxun Zhang, Shuai Zeng, Shiyuan Chen, Joao V Maldonado Dos Santos, Babu Valliyodan, Prasad P Calyam, Nirav Merchant, Henry T Nguyen, Dong Xu, Trupti Joshi

BMC Bioinformatics 2016 October 7

BACKGROUND: With advances in next-generation sequencing (NGS) technology and significant reductions in sequencing costs, it is now possible to sequence large collections of crop germplasm to detect genome-scale genetic variations and to apply that knowledge towards trait improvement. To facilitate efficient large-scale analysis of genomic variations in NGS resequencing data, we have developed "PGen", an integrated and optimized workflow that uses the Extreme Science and Engineering Discovery Environment (XSEDE) high-performance computing (HPC) virtual system, iPlant cloud data storage resources, and the Pegasus workflow management system (Pegasus-WMS). The workflow allows users to identify single nucleotide polymorphisms (SNPs) and insertion-deletions (indels), perform SNP annotations, and conduct copy number variation analyses on multiple resequencing datasets in a user-friendly and seamless way.
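
To make the SNP/indel distinction above concrete, the following minimal Python sketch classifies variants by comparing reference and alternate allele lengths, in the style of VCF REF/ALT fields. It is an illustration only and not the PGen implementation: the function names and the example records are hypothetical, and a real workflow would rely on an established variant caller.

```python
from typing import Dict, Iterable, Tuple


def classify_variant(ref: str, alt: str) -> str:
    """Classify one REF/ALT allele pair as 'SNP', 'indel', or 'other'.

    Assumption for this sketch: a SNP is a single-base substitution and
    an indel is any pair whose allele lengths differ.
    """
    if len(ref) == 1 and len(alt) == 1 and ref != alt:
        return "SNP"
    if len(ref) != len(alt):
        return "indel"
    # Equal-length multi-base substitutions or identical alleles.
    return "other"


def count_variants(records: Iterable[Tuple[str, str]]) -> Dict[str, int]:
    """Tally variant classes over an iterable of (ref, alt) pairs."""
    counts = {"SNP": 0, "indel": 0, "other": 0}
    for ref, alt in records:
        counts[classify_variant(ref, alt)] += 1
    return counts


# Hypothetical example records, not data from the paper.
example_records = [("A", "G"), ("AT", "A"), ("C", "CTT"), ("G", "T"), ("AC", "GT")]
print(count_variants(example_records))
```

Here a deletion such as ("AT", "A") and an insertion such as ("C", "CTT") both count as indels, while an equal-length multi-base change such as ("AC", "GT") falls into the "other" class.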

RESULTS: We have developed both a Linux version, available on GitHub (https://github.com/pegasus-isi/PGen-GenomicVariations-Workflow), and a web-based implementation of the PGen workflow integrated within the Soybean Knowledge Base (SoyKB) (https://soykb.org/Pegasus/index.php). Using PGen, we identified 10,218,140 SNPs and 1,398,982 indels from the analysis of 106 soybean lines sequenced at 15X coverage. This analysis also identified 297,245 non-synonymous SNPs and 3,330 copy number variation (CNV) regions. SNPs identified using PGen from additional soybean resequencing projects, bringing the total to more than 500 soybean germplasm lines, have also been integrated. These SNPs are being used for trait improvement through genotype-to-phenotype prediction approaches developed in-house. To make NGS data easy to browse and access, we have also developed an NGS resequencing data browser (https://soykb.org/NGS_Resequence/NGS_index.php) within SoyKB, which gives soybean researchers access to SNP and downstream analysis results.

CONCLUSION: The PGen workflow has been optimized, through thorough testing and validation, for efficient analysis of soybean data. This research serves as an example of best practices for developing genomics data analysis workflows that integrate remote HPC resources and efficient data management with ease of use for biological users. The PGen workflow can also be easily customized to analyze data from other species.
