Scientists use AI to crack novel coronavirus genome signature
The new data discovery tool will let scientists quickly and easily classify a deadly virus like SARS-CoV-2 in just minutes, according to the study's authors.
Toronto: Scientists, including one of Indian origin, have used artificial intelligence (AI) to identify an underlying genomic signature for 29 different DNA sequences of the novel coronavirus that causes COVID-19, providing an important tool for vaccine and drug developers.
The tool will allow scientists to quickly and easily classify a deadly virus like SARS-CoV-2 in just minutes, according to the study's authors, who include Gurjit Randhawa from Western University in Canada.
Rapid classification of this kind is highly important for strategic planning and for mobilising to meet medical needs during a pandemic, they said.
The study, published in the journal PLOS ONE, also supports the scientific hypothesis that SARS-CoV-2, the virus that causes COVID-19, originated in bats and belongs to Sarbecovirus, a subgenus of Betacoronavirus.
The "ultra-fast, scalable, and highly accurate" classification system uses new, specialised graphics-based software and a decision-tree approach to illustrate the classification and arrive at the best choice out of all possible outcomes, the researchers said.
The machine-learning method classifies the novel coronavirus sequences with 100 per cent accuracy and, more importantly, discovers the most relevant relationships among more than 5,000 viral genomes within minutes, the researchers said.
Machine learning is an application of AI that provides systems with the ability to automatically learn and improve from experience without being explicitly programmed.
"All we needed was the COVID-19 DNA sequence to discover its own intrinsic sequence pattern," said Kathleen Hill, a professor at Western University in Canada.
"We used that signature pattern and a logical approach to match that pattern as close as possible to other viruses and achieved a fine level of classification in minutes -- not days, not hours but minutes," Hill said.
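The idea of matching a sequence's signature to known viruses can be illustrated with a minimal Python sketch. It is not the study's software. It assumes, purely for illustration, that a "signature" is the frequency of each three-letter word in the sequence and that closeness is measured by Euclidean distance; the function names and the short toy sequences are hypothetical and are not real viral data.

```python
from collections import Counter
from itertools import product
from math import sqrt


def kmer_signature(seq, k=3):
    """Return the normalised frequency of every length-k word over A, C, G, T."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    # Words containing other letters (for example N) are ignored.
    total = sum(counts[m] for m in kmers) or 1
    return [counts[m] / total for m in kmers]


def distance(a, b):
    """Euclidean distance between two signatures of equal length."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def closest_label(query, references, k=3):
    """Return the label of the reference sequence whose signature is nearest."""
    q = kmer_signature(query, k)
    return min(
        references,
        key=lambda label: distance(q, kmer_signature(references[label], k)),
    )


# Toy reference sequences (invented for illustration, not real genomes).
references = {
    "group_A": "ATATATATATATATATATAT",
    "group_B": "GCGCGCGCGGCCGCGCGCGC",
}

print(closest_label("ATATTATATATAATATATAT", references))  # group_A
```

Because each signature is computed from the sequence alone, no alignment between sequences is needed in this sketch, which is one reason comparisons of this kind can be fast.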
This classification tool has already been used to analyse more than 5,000 unique viral genomic sequences, including the 29 novel coronavirus sequences available on January 27, the researchers said.
Hill believes the tool will be an essential component in the toolkit for vaccine and drug developers, front-line health-care workers, researchers and scientists during this global pandemic and beyond.