# Webinar series

Since the beginning of the pandemic, the Galaxy community has run two webinar series to highlight the importance of open science and to show how Galaxy is used to analyse SARS-CoV-2 data. The first series took place in the spring of 2020 and the second at the beginning of 2021.

# First Galaxy-ELIXIR webinar series: FAIR data and Open Infrastructures to tackle the COVID-19 pandemic

April-May 2020

In a series of five webinar sessions, experts from ELIXIR and the Galaxy community in the US and Europe demonstrated how open access and open science are fundamental to a fast and efficient response to public health crises. The focus was on research reproducibility and transparency, using exclusively open source tools and the Galaxy platform.

# Session 1: Introduction to Galaxy and the Galaxy workflows for SARS-CoV-2 data analysis

The first session introduced the Galaxy platform and the other public research infrastructure used throughout the webinar series. It also explained the motivation behind the Galaxy COVID-19 projects and the benefits of open, reproducible research and transparent, interoperable analytics.

# Speakers:

Anton Nekrutenko
Sergei Pond
Frederik Coppens
Björn Grüning

# Session 2: Genomics/Variant Calling

The second session presented the initial analysis of the SARS-CoV-2 genome, published on bioRxiv at that time. It guided the participants through accessing and collecting the available datasets, assembling the genome and analysing the within-sample sequence variants. It also explained how to deploy, on a Galaxy instance, all the tools and workflows needed to reproduce the analysis.

# Speakers:

Anton Nekrutenko
Wolfgang Maier
Marius van den Beek

# Session 3: Cheminformatics: Screening of the main protease

This session presented the Galaxy workflow for identifying candidate molecules for COVID-19 drug treatment, using molecular docking simulations of the SARS-CoV-2 main protease. These simulations are used to predict the binding positions of the candidate molecules in the protease binding site, score the quality of each pose, and compare the results with experimental crystallographic data.

The computationally intensive workflow was executed through a distributed compute network available via the Galaxy Europe platform. The webinar presented the methods and workflows used to identify potential COVID-19 drug candidates, with special emphasis on the complex methods applied, which consumed more than 25 years of CPU and GPU time.

# Speakers:

Tim Dudgeon
Simon Bray

# Session 4: Evolution of the Virus

This session focused on the mutations of SARS-CoV-2 during the outbreaks and on the workflow for variant analysis.

# Speakers:

Sergei Pond

# Session 5: Behind the scenes: Global Open Infrastructures at work

This session presented the Pulsar network, which connects data centres and High-Performance Computing clusters so that they can share their computing power in support of Galaxy users, and provided examples of how to submit an analysis job from the user’s perspective.

# Speakers:

Gianmauro Cuccuru
Marco Antonio Tangaro
Simon Gladman
Nate Coraor
Frederik Coppens
Björn Grüning

# Second Galaxy-ELIXIR webinar series: Open Data Infrastructures to tackle COVID-19 pandemic

January-February 2021

In the second series of webinars, composed of six sessions, experts from ELIXIR and the global Galaxy community from the US, Australia and Europe met again to introduce the latest Galaxy tools developed for working with SARS-CoV-2 data and discuss some of the challenges in accessing, analysing and interpreting them.

# Session 1: COVID-19 analysis in Galaxy: Lessons learned and introduction to the series

The first session summarised the developments in Galaxy over the previous eight months and introduced the latest tools for COVID-19 related research:

WorkflowHub, Dockstore
Single-tool view, simplified workflow run-form
Scaling and performance improvements over the last 9 months

# Speakers:

Anton Nekrutenko
Marius van den Beek

# Session 2: Importance of (open) infrastructures in responding to a pandemic

The second session presented some of the data resources used to deposit and access SARS-CoV-2 data. It also covered the critical importance of open access data in responding efficiently to epidemic outbreaks.

Data resources
Analysis resources
Interoperability

# Speakers:

Andrew Lonie
Guy Cochrane
Björn Grüning
Frederik Coppens
Nadim Rahman

# Session 3: Supporting the COVID-19 Data portal: viral data cleaning from human reads and submission to ENA

This session presented the COVID-19 Data Portal and the tools to clean and submit data to open access repositories:

Submission tool for the European Nucleotide Archive (ENA)
Data preprocessing and cleaning

# Speakers:

Ignacio Eguinoa
Bert Droesbeke
Frederik Coppens
Miguel Roncoroni

# Session 4: Insights from selection analysis of complete genomes and read-level data

The fourth webinar presented the previous year's work on the global genomic analysis of SARS-CoV-2.

# Speakers:

Sergei Pond

# Session 5: Viral Beacon and Galaxy variant workflows

In the fifth session, experts from the Viral Beacon and Galaxy projects walked through the full process: importing sequencing data from ENA into Galaxy, analysing them with the variant calling workflows and visualising the results in Viral Beacon.

# Speakers:

Björn Grüning
Babita Singh
Wolfgang Maier

# Session 6: DRS, long-read-sequencing, proteomics and more — an update to recent COVID-19 workflow developments

Finally, the last webinar brought together the different efforts to characterise SARS-CoV-2.

# Speakers:

Milad Miladi
Nathan Roach
Pratik Jagtap
Subina Mehta