NLP Technique Helps Predict Coronavirus Mutations

NLP helps researchers understand the virus because the immune system’s interpretation of protein signatures is akin to the brain’s interpretation of sentences.

Jan 26, 2021

In the latest example of researchers using creative means to understand the novel coronavirus, our understanding of language processing could contribute. According to Science magazine, researchers may be able to use natural language processing (NLP) techniques to predict virus mutations.

Language and DNA: An unexpected match

Computational biologist Bonnie Berger calls it “the language of evolution” in the recent article. When she and her colleagues pull vital proteins together, NLP algorithms seem to predict the mutations that allow the virus to evade the body’s defenses.

This understanding gives us an advantage. In the perpetual battle between the human body (and medical teams) and the viruses that evade its defenses and take over, knowing the enemy’s moves before it makes them could buy crucial lead time for treatment.

Researchers believe that the immune system’s interpretation of protein signatures is similar to the brain’s interpretation of sentences. So, they applied the same principles.

How NLP works in this area of research

For one series, they used grammar concepts to determine how good the virus is at infecting the host. Successful viruses are “grammatically correct.” Unsuccessful viruses are not. And it looks like we can understand mutations of a virus in terms of semantics. A mutation changes the “meaning” of the virus, requiring different antibodies to read it.

Together, understanding the structure and meaning of a virus may finally unlock our ability not just to identify and treat it but to predict it. The neural networks reading these viruses were trained on thousands of genetic sequences from HIV, influenza, and SARS-CoV-2.

The algorithms encoded genetic sequences as embeddings, i.e., representations that group sequences by mutation similarity. The approach aims to identify mutations that help a virus evade the immune system without making it less infectious.
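To make the idea concrete, here is a minimal Python sketch of how such a ranking could work. It is not the researchers' model: the k-mer counts stand in for learned neural embeddings, the grammaticality scores are made-up inputs that a trained language model would normally supply, and the function names and rank-sum scoring are assumptions chosen for illustration.

```python
from collections import Counter
from math import sqrt


def kmer_embedding(seq, k=2):
    """Toy stand-in for a learned embedding: counts of overlapping k-mers."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def cosine_distance(a, b):
    """1 minus cosine similarity of two sparse count vectors."""
    keys = set(a) | set(b)
    dot = sum(a[key] * b[key] for key in keys)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return 1.0 - dot / (norm_a * norm_b)


def semantic_change(wild_type, mutant, k=2):
    """How far the mutant's 'meaning' has moved from the original sequence."""
    return cosine_distance(kmer_embedding(wild_type, k), kmer_embedding(mutant, k))


def rank_escape_candidates(wild_type, mutants, grammaticality):
    """Order mutants so that those with both high semantic change and
    high grammaticality (hypothetical fitness scores) come first."""
    change = {m: semantic_change(wild_type, m) for m in mutants}
    by_change = sorted(mutants, key=lambda m: change[m])
    by_grammar = sorted(mutants, key=lambda m: grammaticality[m])
    # Sum of the two rank positions; a higher sum means a stronger candidate.
    score = {m: by_change.index(m) + by_grammar.index(m) for m in mutants}
    return sorted(mutants, key=lambda m: score[m], reverse=True)


# Toy sequences and made-up grammaticality scores, for illustration only.
wild_type = "AAAA"
mutants = ["AAAT", "AATT", "TTTT"]
grammaticality = {"AAAT": 0.2, "AATT": 0.9, "TTTT": 0.1}
ranking = rank_escape_candidates(wild_type, mutants, grammaticality)
```

In this toy run, "TTTT" changes the most but is the least "grammatical," so it does not top the list; "AATT" ranks first because it scores well on both counts, which mirrors the goal of finding mutations that evade immunity without losing infectivity.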

The results look promising. On a metric where 0.5 is no better than chance and 1 is perfect, the models' scores ranged from 0.69 for HIV to 0.85 for one coronavirus strain. While not ready to deploy, the approach appears to outperform current state-of-the-art models. In time, we might see this creative pairing revolutionize our research into immunology.
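The metric is not named here, but a scale where 0.5 is chance and 1 is perfect matches the area under the ROC curve (AUC). Assuming that is the metric, the short Python sketch below shows what such a score measures: the probability that a known escape mutation receives a higher model score than a non-escape mutation. The function name and the example scores are hypothetical.

```python
def auc(scores, labels):
    """Fraction of (positive, negative) pairs in which the positive example
    gets the higher score; ties count as half a win."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Made-up model scores; label 1 marks a known escape mutation.
perfect = auc([0.9, 0.8, 0.2, 0.1], [1, 1, 0, 0])
chance = auc([0.9, 0.3, 0.6, 0.1], [1, 0, 0, 1])
```

Here `perfect` is 1.0 because every escape mutation outscores every non-escape one, while `chance` is 0.5 because the scores carry no useful ordering.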

Elizabeth Wallace is a Nashville-based freelance writer with a soft spot for data science and AI and a background in linguistics. She spent 13 years teaching language in higher ed and now helps startups and other organizations explain - clearly - what it is they do.
