Supercomputer helps Canadian researcher uncover thousands of viruses that could cause human diseases

Tracking pathogens that could spark future pandemics is no easy feat, but thanks to the help of a supercomputer, a Canadian researcher is among a team of scientists who’ve uncovered thousands of viruses that might one day pose a threat to humans.

Team discovered more than 130,000 RNA-based viruses, including 9 coronaviruses

The Serratus Project is a supercomputing collaboration that analyzed close to six million biological samples, or 20 million gigabytes of data. The goal is to search for a specific gene in each sample that would indicate the presence of an RNA-based virus. (Supplied by Artem Babaian)

Dubbed the Serratus Project, the international collaboration recently shared its findings in the journal Nature — which included the discovery of nearly 10 times more RNA-based viruses than were previously known, totalling more than 130,000 new species, all lurking in more than a decade's worth of publicly available genetic data.

Those types of pathogens are known for causing a wide variety of human diseases, ranging from COVID-19 to Ebola to the common cold. And this knowledge could "improve pathogen surveillance for the anticipation and mitigation of future pandemics," the team wrote in their paper, which was published at the end of January.

Artem Babaian, a former University of British Columbia (UBC) post-doctoral research fellow, spearheaded the work, which relied on cloud-based supercomputing power provided by Amazon Web Services in collaboration with UBC through the school's Cloud Innovation Centre. 

"We reanalyzed all public sequencing data — so this is genetic data from pretty much every corner of the planet you can think of," Babaian told CBC News. "It has soil samples from Vancouver… all the way down to anal swabs from penguins in Antarctica."

20 million gigabytes of data to analyze

There were close to six million biological samples in total, equalling 20 million gigabytes of data.

The cloud-based supercomputer Babaian's team used, a system with far more processing power than an ordinary computer, scanned all of them in under two weeks, searching each sample for a specific gene that indicates the presence of an RNA-based virus.
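To give a sense of scale, the figures above imply a sustained processing rate that can be estimated with simple arithmetic. The sketch below is a back-of-envelope calculation only: it assumes exactly six million samples, exactly 14 days of run time and decimal gigabytes, none of which the team has specified, and the function names are illustrative.

```python
# Figures quoted in the article; everything derived from them is an estimate.
TOTAL_GIGABYTES = 20_000_000  # "20 million gigabytes of data"
TOTAL_SAMPLES = 6_000_000     # "close to six million" samples (rounded up, an assumption)
MAX_DAYS = 14                 # "under two weeks", treated here as an upper bound


def average_sample_size_gb(total_gb: float, samples: int) -> float:
    """Average amount of data per biological sample, in gigabytes."""
    return total_gb / samples


def minimum_throughput_gb_per_s(total_gb: float, days: float) -> float:
    """Lowest sustained rate that finishes the scan within the given number of days."""
    seconds = days * 24 * 60 * 60
    return total_gb / seconds


if __name__ == "__main__":
    size = average_sample_size_gb(TOTAL_GIGABYTES, TOTAL_SAMPLES)
    rate = minimum_throughput_gb_per_s(TOTAL_GIGABYTES, MAX_DAYS)
    print(f"Average sample size: {size:.1f} GB")
    print(f"Minimum sustained throughput: {rate:.1f} GB/s")
```

Under those assumptions, each sample averages roughly 3.3 gigabytes, and the system would have had to process at least about 16.5 gigabytes every second for two weeks straight. Because the scan finished in under two weeks, the real rate was higher.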

Babaian was once interested in using this kind of data analysis for cancer research, until the pandemic hit. Then hunting for coronaviruses — those in the same family as SARS-CoV-2 — became his top priority, and the team wound up finding nine new ones. 

That kind of global surveillance, including ongoing sample collection on the ground, will be crucial going forward, experts say, since viruses can wreak havoc when they spill back and forth between wildlife and humans.

The supercomputer scanned close to six million samples, including anal swabs from penguins, like this one seen in a 2011 file photo. (Mark Mitchell/New Zealand Herald/ The Associated Press)

"Having those surveillance systems in place provides us with a glimpse of what we could potentially be facing," said Jason Kindrachuk, an assistant professor in medical microbiology and infectious diseases at the University of Manitoba. "But we need to also appreciate that we have to put in the support systems and the infrastructure systems into those regions of the world to be able to do this for the long term."

"One of the challenges is to have a real network across the world where you can monitor both on the animals' side, but also on the human side," echoed Marcia Castro, chair of the department of global health and population at Harvard T.H. Chan School of Public Health.

But even with scientists around the world working hard to find and share new viruses, it's still tough to predict which pathogens could become global threats.

A world map showing the locations of millions of biological data samples analyzed by the Serratus Project. (Supplied by the Serratus Project)

Project includes massive global database

Babaian is hopeful his team's work might help change that. 

"Can we know what these viral agents are earlier, so that if it is something that has the potential for sustained human-to-human transmission, you can recognize it quickly and then have a public health measure, like quarantining a few individuals — stopping an outbreak before it becomes a pandemic?" he said.

The Serratus Project — named after Serratus Mountain in British Columbia, which Babaian saw on a hike in 2020 — included the creation of a massive public database for global scientists and medical teams to access.

Babaian said that down the road, it could help scientists connect the dots between strange emerging illnesses and viruses from abroad. 

Take the example of someone in Canada having a fever of unknown origin, he explained. One day it could be possible to take the sequencing data from that patient's blood and link it to, say, a virus found in a camel from Africa that was sampled a decade ago.

But it's easier said than done, he added.

"The real challenge, I think, as a global community, will be trying to make sure that we can get this [technology] to places where these spillover events are happening more frequently," Babaian said.

ABOUT THE AUTHOR

Lauren Pelley

Senior Health & Medical Reporter

Lauren Pelley covers the global spread of infectious diseases, pandemic preparedness and the crucial intersection between health and climate change. She's a two-time RNAO Media Award winner for in-depth health reporting in 2020 and 2022, a silver medallist for best editorial newsletter at the 2024 Digital Publishing Awards, and a 2024 Covering Climate Now award winner in the health category. Contact her at: lauren.pelley@cbc.ca.
