Expansion of known ssRNA phage genomes: from tens to over a thousand

sciadv.aay5981.pdf(1.28 MB)
Published Version
Callanan, Julie
Stockdale, Stephen R.
Shkoporov, Andrey
Draper, Lorraine A.
Ross, R. Paul
Hill, Colin
American Association for the Advancement of Science
The first sequenced genome was that of the 3569-nucleotide single-stranded RNA (ssRNA) bacteriophage MS2. Despite the recent accumulation of vast amounts of DNA and RNA sequence data, only 12 representative ssRNA phage genome sequences are available from the NCBI Genome database (June 2019). The difficulty in detecting RNA phages in metagenomic datasets raises questions as to their abundance, taxonomic structure, and ecological importance. In this study, we iteratively applied profile hidden Markov models to detect conserved ssRNA phage proteins in 82 publicly available metatranscriptomic datasets generated from activated sludge and aquatic environments. We identified 15,611 nonredundant ssRNA phage sequences, including 1015 near-complete genomes. This expansion in the number of known sequences enabled us to complete a phylogenetic assessment of both sequences identified in this study and known ssRNA phage genomes. Our expansion of these viruses from two environments suggests that they have been overlooked within microbiome studies.
ssRNA phage genomes, Metatranscriptomic datasets
Callanan, J., Stockdale, S.R., Shkoporov, A., Draper, L.A., Ross, R.P. and Hill, C. (2020) ‘Expansion of known ssRNA phage genomes: From tens to over a thousand’, Science Advances, 6(6), (8pp). doi: 10.1126/sciadv.aay5981