# Live Resources
# Data pre-processing
Because of the way SARS-CoV-2 transcription works, the sub-genomic RNAs (sgRNAs) that encode the structural proteins overlap with each other. Mapping the sequencing data and assigning each read to the correct ORF and sgRNA therefore requires special care. The pre-processing workflow aligns the whole dataset and then categorizes the reads by the identified ORF origin of each transcript.
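To illustrate the categorization idea, here is a minimal Python sketch. It is not the workflow's actual implementation. It assumes each read has already been aligned with a splice-aware aligner and reduced to a sorted list of reference blocks. The ORF start coordinates are taken from the NC_045512.2 reference annotation. The function name `classify_read` and the thresholds `LEADER_END` and `MIN_JUNCTION_GAP` are hypothetical choices made for this example. A read that starts in the 5' leader and jumps across a large gap is assigned to the first ORF that begins at or after the point where the alignment resumes.

```python
from bisect import bisect_left

# ORF start positions (1-based) from the NC_045512.2 annotation.
ORF_STARTS = [
    (266, "ORF1ab"), (21563, "S"), (25393, "ORF3a"), (26245, "E"),
    (26523, "M"), (27202, "ORF6"), (27394, "ORF7a"), (27756, "ORF7b"),
    (27894, "ORF8"), (28274, "N"), (29558, "ORF10"),
]
ORF_POSITIONS = [start for start, _ in ORF_STARTS]

# Hypothetical thresholds, chosen for illustration only.
LEADER_END = 100         # a read must start at or before this to count as leader-containing
MIN_JUNCTION_GAP = 1000  # minimum gap (nt) treated as a leader-to-body junction


def classify_read(blocks):
    """Assign a read to an ORF from its aligned reference blocks.

    blocks: sorted list of (start, end) 1-based inclusive reference intervals,
    one per contiguous aligned segment of the read.
    """
    if not blocks or blocks[0][0] > LEADER_END:
        # No alignment, or the read does not reach the 5' leader.
        return "unassigned"

    # Find the largest gap between consecutive blocks and where the read resumes.
    best_gap, body_start = 0, None
    for (_, prev_end), (next_start, _) in zip(blocks, blocks[1:]):
        gap = next_start - prev_end - 1
        if gap > best_gap:
            best_gap, body_start = gap, next_start

    if best_gap < MIN_JUNCTION_GAP:
        # Leader-containing read without a junction: treat as genomic RNA.
        return "ORF1ab"

    # First ORF whose start lies at or downstream of the junction acceptor site.
    i = bisect_left(ORF_POSITIONS, body_start)
    return ORF_STARTS[i][1] if i < len(ORF_STARTS) else "unassigned"


if __name__ == "__main__":
    print(classify_read([(1, 70), (28255, 29000)]))  # N
    print(classify_read([(5, 68), (21552, 22000)]))  # S
    print(classify_read([(10, 5000)]))               # ORF1ab
```

Requiring the leader keeps the sketch conservative: reads that do not extend to the 5' end are left unassigned rather than guessed, since an overlapping 3' region alone cannot distinguish between sgRNAs.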