NVIDIA and Arc Institute Unveil an AI Model to Predict DNA, RNA & Proteins

The model has been trained on nearly 9 trillion nucleotides, the building blocks of DNA and RNA.

Arc Institute, a California-based nonprofit, and Stanford University, in collaboration with NVIDIA, unveiled Evo 2 on Wednesday, describing it as the largest publicly available AI model for genomic data. Evo 2 can predict and design genetic code (DNA, RNA, and proteins) across all domains of life.

The model has been trained on nearly 9 trillion nucleotides, the building blocks of DNA and RNA. “We make Evo 2 fully open, including model parameters, training code, inference code, and the OpenGenome2 dataset, to accelerate the exploration and design of biological complexity,” the researchers wrote in their paper.

“Deploying a model like Evo 2 is like sending a powerful new telescope out to the farthest reaches of the universe,” said Dave Burke, Arc’s chief technology officer. “We know there’s immense opportunity for exploration, but we don’t yet know what we’re going to discover.”

NVIDIA said the model can be used for biomolecular research applications, including predicting protein structures, identifying novel molecules for healthcare and industrial use, and evaluating how gene mutations affect function.

“Evo 2 represents a major milestone for generative genomics,” said Patrick Hsu, Arc Institute cofounder and core investigator, and an assistant professor of bioengineering at the University of California, Berkeley. “By advancing our understanding of these fundamental building blocks of life, we can pursue solutions in healthcare and environmental science that are unimaginable today.”

The model is available as an NVIDIA NIM microservice, allowing users to generate biological sequences with customisable settings. Researchers can also fine-tune Evo 2 on proprietary datasets through the open-source NVIDIA BioNeMo Framework.

“Designing new biology has traditionally been a laborious, unpredictable and artisanal process,” said Brian Hie, assistant professor of chemical engineering at Stanford University and Arc Institute innovation investigator. “With Evo 2, we make biological design of complex systems more accessible to researchers, enabling the creation of new and beneficial advances in a fraction of the time it would previously have taken.”

Arc Institute, founded in 2021 with $650 million in funding, supports long-term scientific research by providing multiyear funding and dedicated lab space. Scientists at the institute focus on disease areas, including cancer, immune dysfunction, and neurodegeneration.

NVIDIA contributed computing resources in the form of access to 2,000 NVIDIA H100 GPUs via NVIDIA DGX Cloud on AWS. That platform includes NVIDIA BioNeMo software, with optimised microservices and BioNeMo Blueprints. NVIDIA researchers also collaborated on AI scaling and optimisation.

Evo 2 can process genetic sequences of up to 1 million tokens at once, a window wide enough to take in large stretches of a genome in a single pass. This lets scientists explore how genetic sequences relate to cell function, gene expression, and disease.

“A single human gene contains thousands of nucleotides—so for an AI model to analyse how such complex biological systems work, it needs to process the largest possible portion of a genetic sequence at once,” said Hsu.
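To give a sense of what a 1-million-token window means in practice, here is a minimal Python sketch that splits a long DNA string into chunks no longer than that limit. It assumes one token per nucleotide, which the article does not state, and the function name `context_windows` and its `overlap` parameter are hypothetical, not part of any Evo 2 or BioNeMo API.

```python
from typing import Iterator, Tuple

# Context length reported in the article. Treating one nucleotide as one
# token is an assumption made for this illustration.
MAX_CONTEXT = 1_000_000


def context_windows(
    sequence: str, window: int = MAX_CONTEXT, overlap: int = 0
) -> Iterator[Tuple[int, str]]:
    """Yield (start, chunk) pairs that cover `sequence`.

    Each chunk has at most `window` characters; consecutive chunks share
    `overlap` characters so that features near a boundary are not cut off.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    if not 0 <= overlap < window:
        raise ValueError("overlap must satisfy 0 <= overlap < window")

    step = window - overlap
    start = 0
    while start < len(sequence):
        yield start, sequence[start:start + window]
        if start + window >= len(sequence):
            break  # the last chunk already reaches the end of the sequence
        start += step


if __name__ == "__main__":
    toy_genome = "ACGT" * 5  # 20 nucleotides, standing in for a real genome
    for start, chunk in context_windows(toy_genome, window=8, overlap=2):
        print(start, chunk)
```

A sequence shorter than the window comes back as a single chunk, which is the case the long context is meant to make more common.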

In healthcare and drug discovery, Evo 2 could help researchers identify gene variants linked to specific diseases and design molecules that precisely target them. In a separate study by Stanford and Arc Institute, researchers found that Evo 2 could predict with 90% accuracy whether previously unrecognised mutations in BRCA1, a gene associated with breast cancer, would affect gene function.
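The article does not describe how Evo 2 scores mutations. One common approach with sequence models is to compare how likely the model finds the mutated sequence relative to the reference; the sketch below illustrates that idea only. The tiny first-order Markov model is a toy stand-in for a genomic language model, not Evo 2, and every function name here is hypothetical.

```python
import math
from collections import Counter
from typing import Dict

Model = Dict[str, Dict[str, float]]


def train_markov(training_seq: str, alphabet: str = "ACGT") -> Model:
    """Fit first-order transition log-probabilities with add-one smoothing.

    This is a toy stand-in for a real genomic model (an assumption made
    purely for illustration).
    """
    counts = {a: Counter() for a in alphabet}
    for prev, nxt in zip(training_seq, training_seq[1:]):
        counts[prev][nxt] += 1
    model: Model = {}
    for a in alphabet:
        total = sum(counts[a].values()) + len(alphabet)
        model[a] = {b: math.log((counts[a][b] + 1) / total) for b in alphabet}
    return model


def log_likelihood(model: Model, seq: str) -> float:
    """Sum the transition log-probabilities along the sequence."""
    return sum(model[prev][nxt] for prev, nxt in zip(seq, seq[1:]))


def apply_snv(reference: str, position: int, alt: str) -> str:
    """Return the reference with one nucleotide replaced (0-based position)."""
    if not 0 <= position < len(reference):
        raise IndexError("position outside the reference sequence")
    return reference[:position] + alt + reference[position + 1:]


def variant_score(model: Model, reference: str, position: int, alt: str) -> float:
    """Mutant minus reference log-likelihood.

    A negative score means the model finds the mutated sequence less likely.
    """
    mutant = apply_snv(reference, position, alt)
    return log_likelihood(model, mutant) - log_likelihood(model, reference)


if __name__ == "__main__":
    model = train_markov("ACGT" * 4)
    print(variant_score(model, "ACGTACGT", 2, "T"))
```

In a real workflow the score would come from a trained genomic model and would need calibrating against variants of known effect before any threshold could be read as "affects function".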

In agriculture, the model could support food security efforts by improving understanding of plant biology, leading to the development of climate-resilient or nutrient-dense crops. Evo 2 could also be used to engineer biofuels or proteins that break down plastic or oil.

Siddharth Jindal

Siddharth is a media graduate who loves to explore tech through journalism and putting forward ideas worth pondering about in the era of artificial intelligence.