Most of the emerging infectious diseases that threaten humans – including coronaviruses – are zoonotic, meaning they originate in another animal species. And as population sizes soar and urbanisation expands, encounters with creatures harbouring potentially dangerous diseases are becoming ever more likely.
Identifying these viruses early, then, is becoming vitally important. A new study out today in PLOS Biology from a team of researchers at the University of Glasgow, UK, has identified a novel way to do this kind of viral detective work, using machine learning to predict the likelihood of a virus jumping to humans.
According to the researchers, a major stumbling block for understanding zoonotic disease has been that scientists tend to prioritise well-known zoonotic virus families based on their common features. This means that there are potentially myriad viruses unrelated to known zoonotic diseases, either undiscovered or not well known, which may hold zoonotic potential – the ability to make the species leap.
In order to circumvent this problem, the team developed a machine learning algorithm that could infer the zoonotic potential of a virus from its genome sequence alone, by identifying characteristics that link it to humans, rather than looking at taxonomic relationships between the virus being studied and existing zoonotic viruses.
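To give a rough sense of what scoring a virus "from its genome sequence alone" can look like, here is a minimal Python sketch. It is not the Glasgow team's model, which this article does not describe in detail. The choice of dinucleotide frequencies as features, the nearest-centroid scoring rule, the function names and the toy sequences are all assumptions made purely for illustration.

```python
from collections import Counter
from itertools import product
import math

BASES = "ACGT"
# All 16 possible two-letter combinations of the four bases.
DINUCLEOTIDES = ["".join(pair) for pair in product(BASES, repeat=2)]


def dinucleotide_frequencies(sequence):
    """Return the relative frequency of each of the 16 dinucleotides."""
    seq = sequence.upper()
    counts = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts[d] for d in DINUCLEOTIDES)
    if total == 0:
        return [0.0] * len(DINUCLEOTIDES)
    return [counts[d] / total for d in DINUCLEOTIDES]


def centroid(vectors):
    """Average a list of equal-length feature vectors, column by column."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def distance(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def zoonotic_score(sequence, human_centroid, other_centroid):
    """Hypothetical score in [0, 1]; higher means the sequence's composition
    sits closer to the human-infecting group than to the other group."""
    features = dinucleotide_frequencies(sequence)
    d_human = distance(features, human_centroid)
    d_other = distance(features, other_centroid)
    if d_human + d_other == 0:
        return 0.5  # both groups look identical, so no signal either way
    return d_other / (d_human + d_other)


def rank_by_score(candidates, human_seqs, other_seqs):
    """Rank candidate genomes (name -> sequence) from highest to lowest score."""
    human_centroid = centroid([dinucleotide_frequencies(s) for s in human_seqs])
    other_centroid = centroid([dinucleotide_frequencies(s) for s in other_seqs])
    scored = [
        (name, zoonotic_score(seq, human_centroid, other_centroid))
        for name, seq in candidates.items()
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Toy sequences invented for this example; they are not real viral genomes.
    human_infecting = ["AAAAAA", "AAAATA"]
    not_human_infecting = ["CCCCCC", "CCCGCC"]
    candidates = {"virus_a": "AAAAAT", "virus_b": "CCCCCG"}
    for name, score in rank_by_score(candidates, human_infecting, not_human_infecting):
        print(f"{name}: {score:.2f}")
```

The point of the sketch is that nothing in it looks at which family a virus belongs to: the score depends only on patterns in the sequence itself, which is the idea the researchers describe. A real model would be trained on many labelled genomes and use far richer features.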
The team found that viral genomes may have generalisable features that enable them to infect humans, even when the viruses are not closely related taxonomically to other human-infecting viruses. They say this approach may present a novel opportunity for viral sleuthing.
“By highlighting viruses with the greatest potential to become zoonotic, genome-based ranking allows further ecological and virological characterisation to be targeted more effectively,” the authors write.
“These findings add a crucial piece to the already surprising amount of information that we can extract from the genetic sequence of viruses using AI techniques,” says co-author Simon Babayan.
“A genomic sequence is typically the first, and often only, information we have on newly discovered viruses, and the more information we can extract from it, the sooner we might identify the virus’s origins and the zoonotic risk it may pose.
“As more viruses are characterised, the more effective our machine learning models will become at identifying the rare viruses that ought to be closely monitored and prioritised for pre-emptive vaccine development.”