Explore Innovative Open Models And Datasets For Enhanced Research And Development


NVIDIA Isaac GR00T N1 used in object manipulation.

At its annual GTC conference, NVIDIA announced three open-source releases for physical AI: a new suite of World Foundation Models (WFMs) called Cosmos Transfer, the Physical AI Dataset, and NVIDIA Isaac GR00T N1, the first open model for general humanoid reasoning. Together, these resources give developers new building blocks for robotics and autonomous vehicle development.

Contents

New World Foundation Model – Cosmos Transfer

How it Works

Open Physical AI Dataset
Purpose Built Model for Humanoids – NVIDIA Isaac GR00T N1

Dual-System Architecture

Path Forward

New World Foundation Model – Cosmos Transfer

The Cosmos Transfer model is the latest addition to NVIDIA’s Cosmos™ world foundation models (WFMs). With 7 billion parameters, it gives developers fine-grained control over generated virtual world scenes. It uses multiple control inputs (multicontrols) to produce high-fidelity outputs from various structural inputs, allowing for precise spatial alignment and scene composition.

How it Works

Cosmos Transfer trains a separate ControlNet for each sensor modality used to capture the simulated world.

Input types include:

3D bounding box maps
Trajectory maps
Depth maps
Segmentation maps

During inference, developers can employ a variety of structured visual or geometric data—such as edge maps, human motion keypoints, LiDAR scans, and HD maps—to guide the model’s output. The control signals from each branch are combined with adaptive spatiotemporal control maps and integrated into the transformer blocks of the base model, resulting in photorealistic video sequences that maintain controlled layouts and object placements.
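One way to picture how per-branch control signals might be blended with spatiotemporal control maps is a per-location weighted average. The sketch below is illustrative only: the function name `fuse_controls`, the nested-list video layout `[t][y][x]`, and the normalization of weights are assumptions made for this example, not details of NVIDIA's implementation, which integrates the combined signals into the transformer blocks of the base model.

```python
from typing import Dict, List

# A toy "video" of scalar features indexed as [t][y][x] (assumed layout).
Video = List[List[List[float]]]


def fuse_controls(branches: Dict[str, Video], weights: Dict[str, Video]) -> Video:
    """Blend control branches with per-location, per-frame weights.

    Weights are normalized at each (t, y, x) so they sum to 1. This
    normalization is an assumption of the sketch, not a documented detail.
    """
    names = list(branches)
    first = branches[names[0]]
    frames, height, width = len(first), len(first[0]), len(first[0][0])
    fused = [[[0.0] * width for _ in range(height)] for _ in range(frames)]
    for t in range(frames):
        for y in range(height):
            for x in range(width):
                total = sum(weights[n][t][y][x] for n in names)
                if total == 0:
                    continue  # no branch is active here; leave 0.0
                blended = sum(
                    weights[n][t][y][x] * branches[n][t][y][x] for n in names
                )
                fused[t][y][x] = blended / total
    return fused


# One frame, one row, two pixels. Depth guides only the left pixel,
# segmentation guides both.
branches = {"depth": [[[1.0, 1.0]]], "segmentation": [[[3.0, 3.0]]]}
weights = {"depth": [[[1.0, 0.0]]], "segmentation": [[[1.0, 1.0]]]}
fused = fuse_controls(branches, weights)
print(fused)
```

In this toy case the left pixel is an even mix of both branches, while the right pixel follows segmentation alone, which is the kind of region-specific control the weight maps are meant to allow.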

The Cosmos Transfer model is particularly effective for generating synthetic data tailored for robotics and autonomous vehicle development, especially when paired with the NVIDIA Omniverse platform. Developers can explore numerous examples on GitHub, including samples specifically designed for autonomous vehicles.

Open Physical AI Dataset

In addition to Cosmos Transfer, NVIDIA has released the Physical AI Dataset, an open-source resource available on Hugging Face. The dataset comprises 15 terabytes of data, including more than 320,000 trajectories for robotics training and up to 1,000 Universal Scene Description (OpenUSD) assets, among them a collection ready for simulation.

The dataset is aimed at developers who post-train foundation models such as Cosmos Predict, giving them the high-quality, diverse data needed to improve model performance. The data is commercial-grade and pre-validated, so developers can rely on it for training and testing.
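For storage planning, the figures above allow a rough back-of-the-envelope estimate. The sketch below assumes decimal terabytes and treats all 15 TB as trajectory data, so the result is only an upper bound on the average trajectory size: the total also covers OpenUSD assets, and the trajectory count is stated as a minimum. The helper name is hypothetical.

```python
def avg_bytes_per_trajectory(total_bytes: float, trajectories: int) -> float:
    """Average bytes per trajectory if the whole dataset were trajectories."""
    return total_bytes / trajectories


TOTAL_BYTES = 15 * 10**12  # 15 TB, assuming decimal terabytes
TRAJECTORIES = 320_000     # "more than 320,000" in the announcement

avg_mb = avg_bytes_per_trajectory(TOTAL_BYTES, TRAJECTORIES) / 10**6
print(f"At most about {avg_mb:.1f} MB per trajectory on average")
```

Under these assumptions the average comes to roughly 46.9 MB per trajectory, which suggests that a working subset of a few thousand trajectories would need storage on the order of hundreds of gigabytes at most.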

Purpose Built Model for Humanoids – NVIDIA Isaac GR00T N1

NVIDIA also announced Isaac GR00T N1, the first open foundation model designed for generalized reasoning and skills in humanoid robots. The model takes multimodal inputs, including language and images, and performs manipulation tasks across various environments. The Isaac GR00T-N1-2B model is available on Hugging Face.

Trained on a vast humanoid dataset that combines real-world captured data, synthetic data generated from the NVIDIA Isaac GR00T Blueprint, and internet-scale video data, Isaac GR00T N1 is adaptable for specific embodiments, tasks, and environments. This versatility is achieved using a single model and set of weights, enabling it to perform complex manipulation behaviors on different humanoid robots like the Fourier GR-1 and 1X Neo.

The model generalizes across a range of tasks, from grasping and manipulating objects to multi-step tasks that require sustained contextual understanding.

Dual-System Architecture

NVIDIA Isaac GR00T N1 features a dual-system architecture inspired by human cognitive processes, comprising:

Vision-Language Model (System 2): Based on NVIDIA-Eagle with SmolLM-1.7B, this model interprets environmental cues through vision and language, allowing robots to reason and plan actions accordingly.
Diffusion Transformer (System 1): This action model translates the planned actions from System 2 into precise movements, ensuring smooth and continuous robot operation.
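The division of labor between the two systems can be sketched as a control loop in which a slow planner updates the goal occasionally while a fast controller acts at every step. Everything in the sketch below is a stand-in: the class names, the one-dimensional "arm position", the keyword rule in the planner, and the `replan_every` interval are assumptions for illustration. In particular, a simple clamped step replaces the diffusion transformer, and a keyword check replaces the vision-language model.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Plan:
    target: float  # hypothetical 1-D goal position


class SlowPlanner:
    """Stand-in for System 2: turns an instruction and observation into a plan."""

    def plan(self, instruction: str, observation: float) -> Plan:
        # Toy rule in place of vision-language reasoning.
        return Plan(target=1.0 if "raise" in instruction else 0.0)


class FastController:
    """Stand-in for System 1: turns the current plan into a small motion."""

    def act(self, plan: Plan, position: float, max_step: float = 0.1) -> float:
        error = plan.target - position
        return max(-max_step, min(max_step, error))  # clamp for smooth motion


def run_episode(instruction: str, steps: int = 20, replan_every: int = 5) -> List[float]:
    planner, controller = SlowPlanner(), FastController()
    position = 0.0
    plan: Optional[Plan] = None
    trace = []
    for t in range(steps):
        if t % replan_every == 0:  # the slow system runs only occasionally
            plan = planner.plan(instruction, position)
        position += controller.act(plan, position)  # the fast system runs every step
        trace.append(position)
    return trace


trace = run_episode("raise the arm")
print(round(trace[-1], 3))
```

The point of the structure is that the expensive reasoning step does not have to keep pace with the motion loop: the controller keeps producing smooth actions from the most recent plan until the planner refreshes it.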

Path Forward

Post-training is the path toward refining autonomous systems and building specialized models for downstream physical AI tasks. Developers can explore the Cosmos Predict and Cosmos Transfer inference scripts available on GitHub, as well as the research papers that describe the models.

The NVIDIA Isaac GR00T-N1-2B model can also be found on Hugging Face, alongside sample datasets and PyTorch scripts designed for post-training with custom user datasets, compatible with the Hugging Face LeRobot format. For further insights into the Isaac GR00T N1 model, the accompanying research paper offers comprehensive information.

To stay up to date with NVIDIA’s latest physical AI releases, follow NVIDIA on Hugging Face.
