SARS-CoV-2 Data Analysis and Monitoring with Galaxy

training COVID-19

From 2021-08-09 to 2021-08-12

The goal of this workshop is to build capacity in SARS-CoV-2 data analysis and data management, including data submission to ENA. After the workshop, all participants will be able to upload viral sequencing data, call all variants, create a variety of reports and create consensus alignments.

It will be a 4-day event introducing scalable and reproducible SARS-CoV-2 data analysis with Galaxy. The sessions will be pre-recorded and provided in advance. During the workshop, there will be live support in chat and live Q&A sessions, in which experts will answer questions.



  • WHEN: August 9-12, 2021
  • WHO: Open to everybody, but the target audience is clinicians and researchers who work with SARS-CoV-2 sequencing data.
  • COST: Free.
  • FORMAT: Virtual and asynchronous. All training sessions will be pre-recorded and provided in advance.
  • SUPPORT:
    • Live support in chat (Slack Channel), in which experts will answer questions on a peer-to-peer basis.
    • Real-Time Q&A sessions on days 2, 3 and 4 (9 am and 4 pm CEST).
  • INFRASTRUCTURE: European Galaxy server and the Galaxy Training Material. Both will stay accessible and open after the training.
  • Contact us if you have questions.



Register now


Program & Material

This workshop is virtual and asynchronous. Training sessions are pre-recorded, and the material is provided in advance in the program below.

Whenever you’re ready to get started, you can access the material by clicking on the different icons in the program:

Day 1 (9.8.2021) - Introduction to Galaxy

Topic Speaker Material Description Duration
A very short introduction to Galaxy Anton Nekrutenko / Lecture: This video will introduce the Galaxy data analysis platform, and give a short demo on how to use it. 10m
Galaxy 101 Anton Nekrutenko / Hands-on: This tutorial will introduce you to Galaxy. You will familiarize yourself with tools, workflows and histories. These skills will be needed in the following days. 1h / 13m
NGS data logistics Anton Nekrutenko / Hands-on: Learn how to manipulate and process NGS data derived from patients infected with SARS-CoV-2. Get familiar with quality control, mapping and NGS file types. 1h 30m / 12m

Day 2 (10.8.2021) - Data Upload & Quality Control

Topic Speaker Material Description Duration
  All experts Info & Zoom link Real-Time Q&A session (9 am CEST) 1h
Quality control of reads Florian Heyl / Lecture: This lecture goes over the concepts involved in assessing the quality of your sequencing data. 38m
Quality control of reads Florian Heyl / Hands-on: In this tutorial you will get some hands-on experience performing a quality assessment on sequencing data. 1h 30m / 1h 10m
Mapping of reads Peter van Heusden / Lecture: This lecture covers the basic concepts involved in mapping sequencing reads to a reference genome. 10m
Mapping of reads Peter van Heusden / Hands-on: In this tutorial you will map sequencing data to a reference genome, and explore the mapped reads in a genome browser. 1h / 20m
Using dataset collections Anton Nekrutenko / Hands-on: How do you manipulate large numbers of datasets at once? This will be needed to process hundreds of SARS-CoV-2 samples in one go. 30m / 12m
Data cleaning workflow Wolfgang Maier Hands-on: As a first exercise in actual SARS-CoV-2 data analysis with Galaxy, this tutorial will let you perform the steps necessary to remove contaminating human reads from sequencing data of SARS-CoV-2 isolates. 1h
  All experts Info & Zoom link Real-Time Q&A session (4 pm CEST) 1h

Day 3 (11.8.2021) - SARS-CoV-2 Data Analysis on Public Datasets

Topic Speaker Material Description Duration
  All experts Info & Zoom link Real-Time Q&A session (9 am CEST) 1h
Galaxy for SARS-CoV-2 genome surveillance projects Wolfgang Maier / Lecture: Get an overview of what day 3 has to offer: production-ready Galaxy workflows for SARS-CoV-2 sequencing data, tools you should know to automate workflow execution, and how you combine all of it to turn Galaxy into a platform for genome-surveillance. 14m
Variant calling, reporting, consensus building (with Galaxy GUI) Wolfgang Maier / Part I / Complete Hands-on: Illumina or ONT, ampliconic or WGS data? Learn how to combine the right set of Galaxy workflows to analyze the type of SARS-CoV-2 sequencing data of your choice. 3h / 1h 30m
Variant calling, reporting, consensus building (with Galaxy CLI) Simon Bray / Hands-on: Learn how to use the command line to upload your SARS-CoV-2 data to a Galaxy server and launch workflows for its analysis. Note: If you want to follow along, this first step towards automation requires Planemo, a command-line tool for interacting with Galaxy. 2h / 30m
The usegalaxy.eu SARS-CoV-2 bot in action Wolfgang Maier Demo: See in this demo how, on usegalaxy.*, we’ve used Planemo and BioBlend to build and operate an automated SARS-CoV-2 genome surveillance system based on the Galaxy workflows for variant calling, consensus building and reporting. 40m
  All experts Info & Zoom link Real-Time Q&A session (4 pm CEST) 1h

Day 4 (12.8.2021) - Visualisation & Data Export

Topic Speaker Material Description Duration
  All experts Info & Zoom link Real-Time Q&A session (9 am CEST) 1h
Accelerating Research Through Data Sharing Carla Cummins Lecture: Accelerating Research Through Data Sharing 13m
Upload data to ENA Miguel Roncoroni / / Demo: Learn how to submit your sequencing data to the ENA directly from Galaxy. 1h / 10m
Upload data to a local datastore Wolfgang Maier Demo: So you’ve used Galaxy workflows to analyze your SARS-CoV-2 samples? Learn in this tutorial how to export results to your favorite datastore. 10m
Introduction to viral Beacon Babita Singh / Demo: How to visualize tens of thousands of SARS-CoV-2 analysis results? Learn about the Viral Beacon project’s solution! 24m
Using and Customising ObservableHQ Sergei Pond Demo: In this demo you will get to know the ObservableHQ platform for interactive data visualization. You will see how covid19.galaxyproject.org uses it to build a dashboard for their SARS-CoV-2 analysis efforts and will learn how to customize this solution to fit your own purposes. 15m
  All experts Info & Zoom link Real-Time Q&A session (4 pm CEST) 1h

Optional extra training

Topic Speaker Material Description Duration
SRA Aligned Read Formats to Speed Up SARS-CoV-2 data Analysis Jonathan Trow / Lecture: This lecture will introduce the SRA Aligned Read format available in the cloud from SRA, as well as some accompanying metadata that can help you search and filter the data. This session is aimed specifically at SARS-CoV-2 runs in SRA. 15m
SRA Aligned Read Formats to Speed Up SARS-CoV-2 data Analysis Jonathan Trow / Hands-on: This tutorial will walk you through accessing and using the SRA Aligned Read format in Galaxy. 40m
Assembly: Unicycler assembly of SARS-CoV-2 genome Cristóbal Gallardo Lecture: Unicycler assembly of SARS-CoV-2 genome with preprocessing to remove human genome reads  
Assembly: Unicycler assembly of SARS-CoV-2 genome Cristóbal Gallardo / Hands-on: Unicycler assembly of SARS-CoV-2 genome with preprocessing to remove human genome reads 25m
Pandemics Research using Mass Spectrometry Timothy J. Griffin, Subina Mehta, Andrew Rajczewski, Pratik Jagtap / Demo: Learn about pandemic research using mass spectrometry. 35m
Scripting Galaxy using the API and BioBlend Nicola Soranzo Lecture: Learn how to control Galaxy via a Python API.  
What you can do with SARS-CoV-2 data: Case studies Andrew Page Lecture: Learn what you can do with SARS-CoV-2 data. 37m
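
To give a flavour of what the "Scripting Galaxy using the API and BioBlend" lecture covers, here is a minimal sketch in Python. It is not taken from the lecture: the helper names history_names and connect, and the environment variables GALAXY_URL and GALAXY_API_KEY, are assumptions made for illustration. It only relies on BioBlend's GalaxyInstance client and its histories.get_histories() call.

```python
"""List the histories of a Galaxy account with BioBlend (illustrative sketch)."""
import os


def history_names(gi):
    """Return the names of all histories visible to the connected account."""
    # get_histories() returns a list of dicts, each with at least "id" and "name".
    return [history["name"] for history in gi.histories.get_histories()]


def connect():
    """Create a BioBlend client from environment variables (hypothetical names)."""
    # Imported here so this file still loads where BioBlend is not installed.
    from bioblend.galaxy import GalaxyInstance

    return GalaxyInstance(
        # Defaults to the European Galaxy server used in this workshop.
        url=os.environ.get("GALAXY_URL", "https://usegalaxy.eu"),
        # Your personal API key, stored in an assumed environment variable.
        key=os.environ["GALAXY_API_KEY"],
    )
```

With BioBlend installed and an API key set, calling history_names(connect()) should return the names of your histories on the chosen server.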

Logistics

Content delivery

This is a global workshop delivered asynchronously. In practice, this means that the training materials are available for you to explore at your own pace, without any time constraints:

  • Lectures: pre-recorded videos with the theoretical explanation of the lesson, supported by slide decks.
  • Hands-on tutorials: a step-by-step explanation, including all the required information, to perform a data analysis, often also available as a pre-recorded video.

    Most of the tutorials are developed by the Galaxy Training Network. A feedback form is available at the bottom of each tutorial page. Please fill it out; it helps us evaluate and improve the tutorials.

  • Histories: shared Galaxy histories, on the European Galaxy server, with all that you need to reproduce what is shown in the hands-on part.
  • Demo: pre-recorded videos demonstrating a technical point or a nice feature.

Most of the material is already available, and all of it will stay available after the workshop. Most of the material has been developed by a community of people via the Galaxy Training Network. Some videos were recorded for previous events, e.g. GTN Smörgåsbord or GCC2021 Training Week, and the captions were manually curated by several community members.

Whenever you’re ready to get started, you can access the material by clicking on the different icons in the program!

Doing the tutorials - Technical requirements

Some of you have asked about the technical requirements. You don’t need a specific operating system or any software installed: all you need is a browser and an internet connection.

To run the tutorials, you will need a Galaxy account. We recommend you to:

Support & Communication channels

Should you have any questions, the instructors will be available in chat. We will use the Slack space of the Galaxy Training Network. Depending on your location you might need to use a VPN, so please make sure that you can join Slack before the workshop.

Once you are in, you will see different channels (#general and #covid2021-day- for the different days), as well as #random and #social. Stop by and say hi to your colleagues! Every day we will have an icebreaker question in the #social channel.

When asking a question:

  • Ask in the appropriate place:
    • #general for general issues
    • #event-covid2021-day-1 for day 1 (Introduction to Galaxy)
    • #event-covid2021-day-2 for day 2 (Data Upload & Quality Control)
    • #event-covid2021-day-3 for day 3 (SARS-CoV-2 Data Analysis on Public Datasets)
    • #event-covid2021-day-4 for day 4 (Visualisation & Data Export)
  • Use threads.
  • Say which server you’re using.
  • Share all of the details (What did the tool say? What was the error? Did you see more information in the bug-report icon?)

During the week of the workshop, the instructors will be there to reply to your questions. Please be aware of the time zones: the instructors are scattered all over the world, so you may sometimes have to wait for a reply.

Real-time Q&A sessions will be run on days 2, 3 and 4 (9 am and 4 pm CEST). Find the details to join these sessions and register by adding your name to the attendees list in the dedicated document.
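
Since the Q&A times are given in CEST, it can help to convert them to your own time zone before the workshop. The following Python sketch does that with fixed UTC offsets. The helper to_local and the example offsets (EDT at UTC-4, AEST at UTC+10) are assumptions chosen for illustration; replace them with the offset that applies to you.

```python
from datetime import datetime, timedelta, timezone

# CEST is UTC+2; a fixed offset avoids needing a time zone database.
CEST = timezone(timedelta(hours=2), "CEST")

# Q&A sessions run on days 2, 3 and 4 (10-12 August 2021) at 9 am and 4 pm CEST.
QA_SESSIONS = [
    datetime(2021, 8, day, hour, tzinfo=CEST)
    for day in (10, 11, 12)
    for hour in (9, 16)
]


def to_local(session, utc_offset_hours, label):
    """Convert a session start time to a fixed-offset local time."""
    local_zone = timezone(timedelta(hours=utc_offset_hours), label)
    return session.astimezone(local_zone)


# Example offsets (assumptions): US Eastern summer time and Australian Eastern time.
for session in QA_SESSIONS:
    edt = to_local(session, -4, "EDT")
    aest = to_local(session, 10, "AEST")
    print(f"{session:%a %d %b %H:%M %Z} = {edt:%a %H:%M %Z} = {aest:%a %H:%M %Z}")
```

Note that the 4 pm CEST session falls on the next calendar day at UTC+10, so check the date as well as the hour.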

Certificates

If you need a certificate, you can request it at the end of the workshop. Please make sure to keep all your work from the 4 days, stay active in the discussions and fill out the final survey.

Code of Conduct

Everyone is expected to abide by the Code of Conduct (CoC) to make this environment welcoming and friendly for everyone.

Instructors & helpers

Name Location
Wolfgang Maier Germany
Bérénice Batut Germany
Beatriz Serrano-Solano Germany
Engy Nasr Germany
Simon Bray Germany
Florian Heyl Germany
Björn Grüning Germany
Anton Nekrutenko USA
Andrew Page UK
Carla Cummins UK
Peter van Heusden South Africa
Erik Hjerde Norway
Annbjørg Barbakken Norway
Kjell Petersen Norway
Steven Morgan Australia
Gareth Price Australia
Anna Syme Australia
Igor Makunin Australia
Valentine Murigneux Australia
Michael Thang Australia