SARS-CoV-2 bioinformatics pipeline – GMS Arctic
Funding given for project entitled:
Joint contributions from 3 PLP projects: Rapid establishment of comprehensive laboratory pandemic preparedness – RAPID-SEQ (PLP1 capability), Genomic Pandemic Preparedness Portfolio (G3P) (PLP1 capability), and Next generation clinical virology (PLP TDP project).
PI(s)/Head responsible for the resource:
Jan Albert (RAPID-SEQ), Valtteri Wirta (G3P), Tobias Allander (Next generation clinical virology)
This is a jointly developed resource with multiple contributors: Karolinska Institutet, Karolinska University Hospital, Region Östergötland, SciLifeLab, and Genomic Medicine Sweden.
A pipeline for bioinformatic analysis of SARS-CoV-2 data was developed within Genomic Medicine Sweden during the spring of 2021. The pipeline extends the analysis developed by the international ARTIC Network collaboration and the COG-UK consortium. It is currently in clinical use across Sweden and is continually updated with the latest SARS-CoV-2 variants.
The pipeline performs typing of SARS-CoV-2 data using both the Pangolin and Nextstrain classification systems. It also runs quality control and variant calling, and supplies a consensus genome sequence for each sample. The pipeline has two modes depending on the sequencing technology used: one for short-read data such as Illumina sequencing data, and one for long-read data generated by Nanopore sequencing.
The pipeline is written for the Nextflow workflow manager and is already set up to run with conda environments or with Docker or Singularity containers. Support is also in place for execution with job schedulers such as Slurm, LSF, GLS and SGE, and the pipeline can be adapted to other systems through the Nextflow configuration.
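As a rough illustration of how a software environment and a job scheduler might be combined at launch, the sketch below builds a `nextflow run` command line. Nextflow's `-profile` option accepts several profile names as one comma-separated value; everything else here is an assumption made for illustration. The helper `build_nextflow_command` is hypothetical, the pipeline location is a placeholder, and the lowercase profile names simply mirror the environments and schedulers named above rather than a confirmed list from the pipeline's configuration. The GitHub instructions remain the authority on actual usage.

```python
import shlex

# Lowercase profile names mirroring the environments and schedulers named in
# the text. That the pipeline defines profiles under exactly these names is
# an assumption.
SOFTWARE_PROFILES = {"conda", "docker", "singularity"}
SCHEDULER_PROFILES = {"slurm", "lsf", "gls", "sge"}


def build_nextflow_command(pipeline, software, scheduler=None, extra_args=()):
    """Return an argv list for `nextflow run` with a combined -profile value.

    Hypothetical helper: `pipeline` is a placeholder for wherever the
    pipeline code lives, and `extra_args` stands in for any pipeline-specific
    parameters, which are not modelled here.
    """
    if software not in SOFTWARE_PROFILES:
        raise ValueError(f"unknown software profile: {software!r}")
    profiles = [software]
    if scheduler is not None:
        if scheduler not in SCHEDULER_PROFILES:
            raise ValueError(f"unknown scheduler profile: {scheduler!r}")
        profiles.append(scheduler)
    # Nextflow takes multiple profiles as a single comma-separated argument.
    return ["nextflow", "run", pipeline, "-profile", ",".join(profiles), *extra_args]


# Example: Singularity containers, jobs submitted through Slurm.
cmd = build_nextflow_command("path/to/pipeline", "singularity", "slurm")
print(shlex.join(cmd))
```

Omitting the scheduler argument yields a command with only the software profile, which would correspond to running on the local machine.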
For more information on the Pandemic Laboratory Preparedness resources associated with this subproject, see Genomic Pandemic Preparedness Portfolio (G3P), Rapid establishment of comprehensive laboratory pandemic preparedness – RAPID-SEQ, and Next generation clinical virology. Please also refer to the other associated subprojects: taxprofiler and SC2 Reporter.
How this resource can be used for Pandemic Preparedness research:
The pipeline is publicly available for analyzing SARS-CoV-2 samples generated on various sequencing platforms. This includes short read data, such as Illumina sequence data, as well as long read data from Nanopore sequencing.
Who can access the resource, and under which conditions:
The code for the pipeline is publicly available and can be utilized free of charge.
Available data, code, and protocols from the resource:
All code related to GMS-Arctic is available on GitHub.
SOPs, guidelines, publications, etc. describing the resource:
Instructions on how to set up and use GMS-Arctic, along with its requirements, are available on GitHub.