© 2025 AI Model Kit. All Rights Reserved.

Introducing the First Comprehensive Healthcare Robotics Dataset and Essential Physical AI Models for Advancing Healthcare Robotics

aimodelkit
Last updated: March 17, 2026 1:01 am

Authors: Nigel Nelson, Lukas Zbinden, Mostafa Toloui, Sean Huver

Healthcare AI has so far centered on perception models that interpret signals to classify or segment pathologies and anatomy. Yet healthcare is fundamentally about “doing,” and perception-only datasets fall short because they are static: they do not capture embodiment, contact dynamics, or closed-loop control. To lay a solid foundation for Physical AI, the field needs standardized robotic bodies, synchronized vision–force–kinematics data, sim-to-real pairing, and cross-embodiment benchmarks.

1. Open-H-Embodiment

Open-H-Embodiment is a community-driven dataset initiative that aims to provide a shared foundation for training and evaluating AI autonomy and world foundation models in surgical robotics and ultrasound. It is led by a steering committee that includes Prof. Axel Krieger (Johns Hopkins), Prof. Nassir Navab (Technical University of Munich), and Dr. Mahdi Azizian (NVIDIA), and it has grown to include more than 35 organizations worldwide.

Together, these participants have built the first large-scale dataset intended to advance Physical AI in healthcare robotics.

Participants

Notable participants include:

  • Balgrist
  • CMR Surgical
  • The Chinese University of Hong Kong
  • Great Bay University
  • Hong Kong Baptist University
  • Hamlyn
  • ImFusion
  • Johns Hopkins University
  • Leeds University
  • Mohamed bin Zayed University of Artificial Intelligence
  • Moon Surgical
  • NVIDIA
  • Northwell Health
  • Obuda University
  • The Hong Kong Polytechnic University
  • Qilu Hospital of Shandong University
  • Rob Surgical
  • Sanoscience
  • Surgical Data Science Collective
  • Semaphor Surgical
  • Stanford
  • Dresden University of Technology
  • Technical University of Munich
  • Tuodao
  • Turin
  • University of British Columbia
  • UC Berkeley
  • UC San Diego
  • University of Illinois Chicago
  • University of Tennessee
  • University of Texas
  • Vanderbilt
  • Virtual Incision

The Dataset

  • Comprises 778 hours of CC-BY-4.0 healthcare robotics training data, primarily focused on surgical robotics, along with ultrasound and colonoscopy autonomy data.
  • Includes simulations, benchtop exercises (such as suturing), and actual clinical procedures.
  • Covers both commercial robots (from CMR Surgical, Rob Surgical, and Tuodao) and research robots (including dVRK, Franka, and Kuka).
  • Released alongside two new, permissively licensed open-source models trained on this dataset.

2. GR00T-H: Vision Language Action Model for Surgical Robotics

One notable result of this initiative is GR00T-H, a derivative of NVIDIA’s Isaac GR00T N series of Vision-Language-Action (VLA) models. Trained on approximately 600 hours of Open-H-Embodiment data, GR00T-H is the first policy model tailored for surgical robotics tasks.

Building on NVIDIA’s open-source ecosystem, GR00T-H uses Cosmos Reason 2 2B as its Vision-Language Model (VLM) backbone.

Architectural Design Choices

Surgical robotics demands extreme precision, and specialized hardware such as cable-driven systems complicates imitation learning (IL). To address this, GR00T-H makes four key design choices:

  • Unique Embodiment Projectors: A distinct, learnable MLP maps each robot’s specific kinematics to a uniform, normalized action space.
  • State Dropout (100%): Proprioceptive input is dropped during inference, generating a learned bias term for every system, which enhances real-world results.
  • Relative EEF Actions: Training employs a common relative End-Effector (EEF) action space to mitigate kinematic inconsistencies.
  • Metadata in Task Prompts: Directly injects instrument names and control index mapping into the VLM task prompt.
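Two of these choices can be illustrated with a minimal sketch. It is not GR00T-H's actual implementation: the `EEFPosition` type, both function names, and the prompt format are hypothetical, and the relative action is simplified to a translation in the base frame (rotation and gripper state are left out).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EEFPosition:
    """End-effector position in the robot base frame (hypothetical type)."""
    x: float
    y: float
    z: float


def relative_actions(trajectory):
    """Turn absolute EEF positions into per-step displacement actions.

    Each action is the offset from the current position to the next one,
    so the values do not depend on where a robot's base-frame origin sits.
    Rotation and gripper state are omitted to keep the sketch short.
    """
    return [
        (nxt.x - cur.x, nxt.y - cur.y, nxt.z - cur.z)
        for cur, nxt in zip(trajectory, trajectory[1:])
    ]


def build_task_prompt(task, instruments):
    """Append instrument metadata to a task prompt (assumed format).

    `instruments` maps a control index to the instrument it drives.
    """
    mapping = "; ".join(
        f"arm {index}: {name}" for index, name in sorted(instruments.items())
    )
    return f"{task} [instruments: {mapping}]"


path = [
    EEFPosition(0.0, 0.0, 0.0),
    EEFPosition(1.0, 0.0, 0.0),
    EEFPosition(1.0, 2.0, 0.0),
]
print(relative_actions(path))
# [(1.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
print(build_task_prompt("Pick up the needle", {1: "forceps", 0: "needle driver"}))
# Pick up the needle [instruments: arm 0: needle driver; arm 1: forceps]
```

Expressing motion as displacements means two robots with different base frames can share one action vocabulary, while the prompt metadata tells the model which control index drives which instrument.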

A prototype of GR00T-H has successfully executed a complete, end-to-end suture as demonstrated in the SutureBot benchmark, showcasing robust long-horizon dexterity.

GR00T-H performing end-to-end suturing.


3. Cosmos-H-Surgical-Simulator

The second model is Cosmos-H-Surgical-Simulator, a World Foundation Model (WFM) for action-conditioned surgical robotics. Traditional simulators struggle to reproduce real-world surgical conditions such as soft tissue, reflections, blood, and smoke.

Key Capabilities

  • Overcoming the Sim-to-Real Gap: Fine-tuned from NVIDIA Cosmos Predict 2.5 2B, it generates physically plausible surgical video directly from kinematic actions.
  • Efficiency Gains: Completing 600 rollouts took only 40 minutes in simulation compared to 2 days required for real-world benchtop methods.
  • WFM as a Physics Simulator: This model learns tissue deformation and tool interaction implicitly from data.
  • Synthetic Data Generation: Capable of generating realistic synthetic video-action pairs to enhance underrepresented datasets.
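To put the efficiency figures above in perspective, the short calculation below derives the implied speedup and per-rollout time. It assumes that "2 days" means 48 hours of continuous wall-clock time, which the article does not state; if the benchtop work only ran during working hours, the real ratio would be smaller.

```python
ROLLOUTS = 600
SIM_MINUTES = 40
REAL_DAYS = 2  # assumption: 48 hours of continuous wall-clock time


def seconds_per_rollout(total_seconds, rollouts):
    """Average wall-clock seconds spent on a single rollout."""
    return total_seconds / rollouts


sim_seconds = SIM_MINUTES * 60            # 2,400 s
real_seconds = REAL_DAYS * 24 * 60 * 60   # 172,800 s

sim_per_rollout = seconds_per_rollout(sim_seconds, ROLLOUTS)
real_per_rollout = seconds_per_rollout(real_seconds, ROLLOUTS)
speedup = real_seconds / sim_seconds

print(sim_per_rollout)   # 4.0
print(real_per_rollout)  # 288.0
print(speedup)           # 72.0
```

Under that assumption, the simulator averages about 4 seconds per rollout against roughly 4.8 minutes on the bench, a 72x reduction in wall-clock time.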

Fine-Tuning Details

The model was fine-tuned on the Open-H-Embodiment dataset (9 robot embodiments across 32 datasets) using 64 A100 GPUs for approximately 10,000 GPU-hours, with a unified 44-dimensional action space.
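As a rough guide to what this compute budget means in wall-clock terms, the figures can be converted as follows. The calculation assumes all 64 GPUs ran concurrently for the whole job, which the article does not confirm, so treat the result as a lower bound on elapsed time.

```python
GPU_HOURS = 10_000  # approximate total from the article
NUM_GPUS = 64


def wall_clock_hours(gpu_hours, num_gpus):
    """Elapsed hours if every GPU is busy for the entire run (assumption)."""
    return gpu_hours / num_gpus


hours = wall_clock_hours(GPU_HOURS, NUM_GPUS)
days = hours / 24

print(hours)           # 156.25
print(round(days, 2))  # 6.51
```

That works out to about six and a half days of continuous training on the full cluster.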


4. What is Next: Towards Reasoning For Surgical Robotics

Looking ahead, the goal for version 2 of the Open-H-Embodiment initiative is to move from perceptual control to reasoning-capable autonomy, a “ChatGPT moment” for surgical robotics in which systems can explain, plan, and adapt throughout long procedures. Reaching that goal means extending Open-H-Embodiment into reasoning-ready data, enriched with annotated task traces that capture intents, outcomes, and failure modes. This effort depends on community engagement, and we invite you to participate. For more details, visit our Open-H GitHub Repo to help shape the future of healthcare robotics.


5. Get started today

Ready to dive in? Access the following resources to start working with the Open-H-Embodiment dataset and models:

Inspired by: Source
