Bacteriophage genomics: What has five years of INPHARED taught us?
Cook, R.; Rihtman, B.; Ponsero, A. J.; Michniewski, S.; Telatin, A.; Sicheritz-Ponten, T.; Adriaenssens, E. M.; Millard, A. D.
Abstract
Bacteriophages are key drivers of microbial ecology and evolution, and the rapid expansion of phage sequencing has created sustained demand for curated reference genome databases. We released the INfrastructure for a PHAge REference Database (INPHARED) in January 2021 to provide quality-controlled metadata for complete phage genomes from cultured isolates. Here, we compare the 2021 and 2026 snapshots, spanning a five-year period that included a substantial overhaul of bacterial virus taxonomy by the ICTV. The database has approximately doubled, from 14,244 to 28,777 genomes, yet the proportion representing novel species-level diversity has declined, indicating that redundant sequencing is outpacing new discovery. Host bias persists despite the addition of 97 new host genera. We have incorporated genome quality assessments, lifestyle predictions, and defence and anti-defence system annotations, providing an updated resource and a snapshot of the current state of phage genomics.