The constant emergence of COVID-19 variants reduces the effectiveness of existing vaccines and test kits. Therefore, it is critical to identify conserved structures in SARS-CoV-2 genomes as potential targets for variant-proof diagnostics and therapeutics. However, the algorithms to predict these conserved structures, which simultaneously fold and align multiple RNA homologs, scale at best cubically with sequence length and are thus infeasible for coronaviruses, which possess the longest genomes (∼30,000 nt) among RNA viruses. As a result, existing efforts on modeling SARS-CoV-2 structures resort to single-sequence folding as well as local folding methods with short window sizes, which inevitably neglect long-range interactions that are crucial in RNA functions.
Here we present LinearTurboFold, a linear-time algorithm for folding RNA homologs that scales linearly with sequence length, enabling unprecedented global structural analysis on SARS-CoV-2. Furthermore, LinearTurboFold identifies undiscovered conserved structures and conserved accessible regions as potential targets for designing efficient and mutation-insensitive small-molecule drugs, antisense oligonucleotides, small interfering RNAs (siRNAs), CRISPR-Cas13 guide RNAs, and RT-PCR primers. LinearTurboFold is a general technique that can also be applied to other RNA viruses and full-length genome studies and will be a useful tool in fighting the current and future pandemics.
This work has been published at Proceedings of National Academy of Sciences, marking the first PNAS paper ever from OSU EECS.