The Das Lab at Stanford: Pioneering RNA Folding Research
The Das Lab at Stanford University is advancing RNA folding research by combining community engagement with large-scale computing. With the backing of NVIDIA DGX Cloud via the NAIRR Pilot program, the lab gained access to 32 NVIDIA A100 DGX Cloud nodes with eight GPUs each, 256 GPUs in total. This infrastructure has enabled the team to move from small-scale experiments to large-scale distributed training.
The lab is led by Dr. Rhiju Das, whose team has made significant strides in RNA research. Notably, they hosted the OpenVaccine Kaggle competition in 2020 to address challenges posed by the COVID-19 pandemic, followed by the Ribonanza competition in 2024 to further investigations in RNA folding. The overarching objective of these efforts is to enhance the understanding and practical applications of biological science by creating accurate models of RNA structure and function.
Challenges in RNA Folding Research
One of the most significant hurdles in developing effective RNA folding models is the scarcity of experimental RNA structure data. Unlike the extensive protein structure databases that facilitate the training of models like AlphaFold2, RNA research lacks similar resources. This gap presents a formidable challenge for researchers in the field who strive to construct reliable models that can accurately predict RNA folding.
Eterna: A Community-Driven Solution
To tackle this challenge, the Das Lab developed Eterna, a game that engages the community in designing novel RNA sequences. Participants in Eterna design the sequences, which the lab then synthesizes in-house. Following synthesis, chemical mapping experiments are conducted to gather data that helps infer the folded structures of the RNA. This collaborative effort not only enriches the lab’s database but also lets individuals play a role in scientific discovery.
Strategic Approach to RNA Folding Research
The Das Lab employs a multi-faceted strategy to expedite RNA folding research, focusing on several key components:
- Crowdsourced Data Curation: By leveraging the Eterna game, the lab gathers novel RNA sequences and combines them with other expertly curated databases.
- Approximating RNA Structure Data: Chemical mapping experiments generate per-nucleotide reactivity profiles of the synthesized, community-designed RNA, which serve as a proxy for the scarce experimental structure data.
- Crowdsourced Model Design: The lab utilizes Kaggle competitions to explore various model architectures and training pipelines, inviting community collaboration.
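To make the idea of a reactivity profile concrete, here is a minimal Python sketch. It assumes the common rule of thumb that flexible, unpaired nucleotides tend to react more strongly in chemical mapping than paired ones. The sequence, the reactivity values, the threshold, and the function name `label_unpaired` are all made up for illustration. The lab's models learn from the profiles directly and do not apply a hard cutoff like this one.

```python
from typing import List


def label_unpaired(reactivity: List[float], threshold: float = 0.5) -> str:
    """Mark each nucleotide '.' (likely unpaired) or 'p' (likely paired).

    A hard threshold is a deliberate simplification for illustration.
    """
    return "".join("." if r >= threshold else "p" for r in reactivity)


# Hypothetical, hand-written data for a 10-nucleotide hairpin (not real measurements).
sequence = "GGGAAAACCC"
reactivity = [0.05, 0.10, 0.08, 0.90, 1.20, 0.95, 0.85, 0.07, 0.12, 0.04]

labels = label_unpaired(reactivity)
print(sequence)
print(labels)
```

In this toy profile the four central nucleotides are flagged as likely unpaired, which is the pattern one would expect from a hairpin loop flanked by a paired stem.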
In conjunction with crowdsourced data, the Das Lab employs additional methods for generating synthetic designs. One notable approach involves training a model using reinforcement learning techniques to achieve human-level performance in the Eterna game, thereby expediting the generation of novel RNA sequences. This model was developed over 4,000 GPU hours on NVIDIA DGX Cloud, utilizing the Q-learning algorithm.
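As a rough illustration of how Q-learning can drive sequence design, the following toy sketch learns to fill a four-nucleotide sequence so that two target base pairs are complementary. It is not the lab's model: the article names only the algorithm, and a system trained for thousands of GPU hours would presumably use a neural network instead of a lookup table. The target pairs, the reward, the uniform exploration policy, and every name in the code are assumptions chosen to keep the example small.

```python
import random

NUCLEOTIDES = "ACGU"
COMPLEMENT = {"A": "U", "U": "A", "C": "G", "G": "C"}  # Watson-Crick pairs only
TARGET_PAIRS = [(0, 3), (1, 2)]  # toy target: position 0 pairs with 3, 1 with 2
LENGTH = 4


def reward(seq):
    """Count target pairs whose two nucleotides are complementary."""
    return sum(COMPLEMENT[seq[i]] == seq[j] for i, j in TARGET_PAIRS)


def train(episodes=20000, alpha=0.5, gamma=1.0, seed=0):
    """Tabular Q-learning; the state is the partial sequence built so far."""
    rng = random.Random(seed)
    q = {}  # maps (partial_sequence, nucleotide) -> estimated return
    for _ in range(episodes):
        state = ""
        while len(state) < LENGTH:
            # Explore uniformly at random; Q-learning is off-policy, so the
            # learned values still describe the greedy policy.
            action = rng.choice(NUCLEOTIDES)
            nxt = state + action
            if len(nxt) == LENGTH:
                target = reward(nxt)  # reward is given only at the end
            else:
                target = gamma * max(q.get((nxt, a), 0.0) for a in NUCLEOTIDES)
            old = q.get((state, action), 0.0)
            q[(state, action)] = old + alpha * (target - old)
            state = nxt
    return q


def greedy_design(q):
    """Build a sequence by always taking the highest-valued nucleotide."""
    state = ""
    while len(state) < LENGTH:
        state += max(NUCLEOTIDES, key=lambda a: q.get((state, a), 0.0))
    return state


q_table = train()
design = greedy_design(q_table)
print(design, reward(design))
```

The state space here has only a few hundred entries, so a table suffices. Real design puzzles have far longer sequences and rewards based on predicted or measured folding, which is where function approximation and large compute budgets come in.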
Building on the successes of last year’s Ribonanza competition, the Das Lab introduced a new model, RibonanzaNet, which surpassed all previous solutions. Furthermore, they expanded their training database from 210,000 to an impressive 40 million RNA sequences and chemical reactivity profiles. With the computational prowess of NVIDIA DGX Cloud, the lab is poised to conduct large-scale distributed training, experiment with diverse model architectures, and optimize training hyperparameters.
Remarkable Results in RNA Folding
The Das Lab has successfully built the largest database for training RNA structure models. Utilizing this extensive dataset, they trained foundation models on 256 A100 GPUs, leveraging the advancements of RibonanzaNet. Their latest innovation, RibonanzaNet2, boasts 100 million parameters and achieves state-of-the-art performance in secondary-structure modeling. This model is now available for community fine-tuning, promoting collaborative research and continuous improvement.
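For a sense of scale, the sketch below works through the arithmetic of spreading a dataset over the GPUs described above, using round-robin striding, a common way to split data in data-parallel training. It treats "40 million" as an exact count for simplicity and assumes one worker per GPU. The function name `shard_indices` is hypothetical, and the article does not describe the lab's actual data pipeline.

```python
NODES = 32
GPUS_PER_NODE = 8
WORLD_SIZE = NODES * GPUS_PER_NODE  # 256 workers, assuming one per GPU


def shard_indices(num_samples, rank, world_size=WORLD_SIZE):
    """Indices assigned to one worker by round-robin striding."""
    return range(rank, num_samples, world_size)


NUM_SAMPLES = 40_000_000  # "40 million" treated as exact for this sketch

first_shard = shard_indices(NUM_SAMPLES, rank=0)
last_shard = shard_indices(NUM_SAMPLES, rank=WORLD_SIZE - 1)
print(WORLD_SIZE, len(first_shard), len(last_shard))
```

Under these assumptions each worker sees roughly 156,000 examples per pass over the data, compared with 40 million for a single device.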
On February 26, 2025, the Das Lab will launch the Stanford RNA 3D Folding Kaggle competition, offering a total of $75,000 in prizes for the top three teams. This competition will run for three months and challenges participants to fine-tune RibonanzaNet2 for downstream structure prediction, with evaluations based on experimental RNA structures collected during the competition period.
For those eager to dive into fine-tuning RibonanzaNet2 in the Kaggle competition, resources are available through the RibonanzaNet2 alpha release forum post and the model release. Additionally, a six-part announcement detailing the release of RibonanzaNet2 can be found on social media, encouraging widespread participation.
How to Get Involved in RNA Folding Research
The achievements of the Das Lab show what crowdsourcing and collaborative research can accomplish in RNA folding and biological science when paired with accelerated computing. As the lab continues to scale its models, datasets, and computing resources through NVIDIA DGX Cloud, the potential for further discoveries remains high.
If you’re interested in contributing to the advancement of AI foundation models for RNA, consider joining the Stanford RNA 3D Folding Kaggle competition and helping to fine-tune RibonanzaNet2. The competition is a chance to join a community dedicated to solving one of biology’s greatest challenges.

