Unveiling Holo3: The Future of the Autonomous Enterprise
We are thrilled to introduce Holo3, the latest step toward our vision of the Autonomous Enterprise. Holo3-122B-A10B scores 78.85% on the OSWorld-Verified benchmark, setting a new industry standard for desktop computer use.
Beyond a Benchmark Leader
Holo3 is more than a benchmark leader; it is engineered for real-world production. Using our agentic flywheel, we trained the model to execute real-world workflows within simulated enterprise environments. As a result, Holo3 not only performs well in current business scenarios but also lays the groundwork for future agents capable of autonomously navigating virtually any digital environment.
Furthermore, Holo3 achieves this with only 10B active parameters (out of 122B total), showing that you don’t need massive models like GPT 5.4 or Opus 4.6 to achieve industry-leading results. All models are available via our Inference API. The Holo3-35B-A3B weights are openly available on Hugging Face under the Apache 2.0 license, and that model is also accessible through a free tier of our Inference API.
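To make the active-parameter figure concrete, the short sketch below computes the share of weights used per forward pass. The 10B-of-122B figure comes from the text above. Reading `35B-A3B` as 35B total and 3B active is an assumption based on the naming pattern, not a number stated in this post.

```python
def active_fraction(active_billions: float, total_billions: float) -> float:
    """Return the share of parameters that are active per forward pass."""
    if total_billions <= 0:
        raise ValueError("total parameter count must be positive")
    return active_billions / total_billions


# 10B active out of 122B total is stated in the post.
holo3_large = active_fraction(10, 122)
# 3B active out of 35B total is inferred from the model name (assumption).
holo3_small = active_fraction(3, 35)

print(f"Holo3-122B-A10B: {holo3_large:.1%} of parameters active")
print(f"Holo3-35B-A3B:   {holo3_small:.1%} of parameters active")
```

Under these assumptions, both models activate roughly 8% of their weights for any given token.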
The Agentic Learning Flywheel
What truly differentiates Holo3 is its specialized training pipeline: a continuous feedback loop that sharpens two core capabilities, perception and decision-making.
Our training flywheel focuses on teaching the model specific tasks using annotated examples, all while developing generalist skills across a virtually infinite range of user interfaces. Here’s a breakdown of our method for building world-class computer-use models:
- Synthetic Navigation Data: We generate scenario-specific navigation examples from a mix of human-written and machine-generated instructions.
- Out-of-Domain Augmentation: Scenarios are programmatically extended and augmented, ensuring that Holo3 is prepared for unexpected situations.
- Curated Reinforcement Learning: Every data sample is curated and fed through a pipeline that combines data filtering with reinforcement learning to maximize performance.
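As a rough illustration of how the first two stages and the filtering part of the third might fit together, here is a toy sketch. It is not our production pipeline: the `Sample` type, the function names, the variant strings, and the length-based filter are all hypothetical stand-ins, and the reinforcement learning step itself is not shown.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Sample:
    """One navigation example (hypothetical structure)."""
    instruction: str
    origin: str  # "human", "generated", or "augmented"


def synthesize(human: list[str], generated: list[str]) -> list[Sample]:
    """Stage 1: pool human-written and machine-generated instructions."""
    return ([Sample(text, "human") for text in human]
            + [Sample(text, "generated") for text in generated])


def augment(samples: list[Sample], variants: list[str]) -> list[Sample]:
    """Stage 2: programmatically extend each scenario with extra conditions."""
    extra = [Sample(f"{s.instruction} ({v})", "augmented")
             for s in samples for v in variants]
    return samples + extra


def curate(samples: list[Sample], keep: Callable[[Sample], bool]) -> list[Sample]:
    """Stage 3 (filtering only): keep samples that pass a quality check."""
    return [s for s in samples if keep(s)]


pool = synthesize(["Approve the pending invoice"],
                  ["Export last month's orders as CSV"])
pool = augment(pool, ["with a cookie banner open", "in a narrow window"])
# The length check is a placeholder for a real quality filter.
train_set = curate(pool, keep=lambda s: len(s.instruction) <= 60)
```

In a real system the `keep` predicate would be replaced by a learned or rule-based quality filter, and the curated set would feed a reinforcement learning loop rather than being used directly.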
The OSWorld results serve as a robust proof of concept for our learning flywheel and reaffirm its transferability to real-world business applications through our Synthetic Environment Factory.
Exploring the Synthetic Environment Factory & H Corporate Benchmarks
Our Synthetic Environment Factory recreates the complexity of enterprise systems and serves as one of the main training grounds that shaped Holo3. The environments are constructed automatically by coding agents that program websites from scratch according to scenario specifications. This process produces verifiable tasks of varying difficulty, each validated end-to-end with verification scripts.
To assess real-world readiness, we designed the H Corporate Benchmarks, a dedicated evaluation suite comprising 486 realistic multi-step tasks across four categories: E-commerce, Business software, Collaboration, and Multi-App setups. The benchmark spans the full complexity spectrum, from focused single-application tasks to intricate, long-horizon, multi-application workflows that simulate how work truly gets done.
For instance, the more challenging tasks in the Multi-App category require the agent to coordinate information across several systems at once. One example involves retrieving equipment prices from a PDF, cross-referencing them against each employee’s remaining budget, and sending personalized approval or rejection emails. Accomplishing such tasks requires not only precise calculations and document parsing but also sustained multi-step reasoning across applications without losing state or intent.
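To give a feel for what a verification script for this kind of task could look like, here is a minimal sketch. Everything in it is illustrative: the `Request` type, the function names, the sample data, and the approval rule (approve when the price fits the remaining budget, one request per employee) are assumptions, not the actual benchmark code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    """One equipment request (hypothetical structure)."""
    employee: str
    item: str


def expected_decisions(prices: dict[str, float],
                       budgets: dict[str, float],
                       requests: list[Request]) -> dict[str, str]:
    """Compute the ground-truth decision for each employee.

    Assumed rule: approve when the item price fits the remaining budget.
    """
    return {
        req.employee: ("approved"
                       if prices[req.item] <= budgets[req.employee]
                       else "rejected")
        for req in requests
    }


def verify(sent: dict[str, str], expected: dict[str, str]) -> bool:
    """Pass only if every employee received exactly the expected decision."""
    return sent == expected


# Hypothetical task data: prices as parsed from the PDF, budgets from HR.
prices = {"laptop": 1200.0, "monitor": 300.0}
budgets = {"alice": 1500.0, "bob": 250.0}
requests = [Request("alice", "laptop"), Request("bob", "monitor")]

expected = expected_decisions(prices, budgets, requests)
agent_emails = {"alice": "approved", "bob": "rejected"}
print(verify(agent_emails, expected))
```

Comparing the agent's final emails against an independently computed ground truth is what makes such a task verifiable end-to-end: a missing email, an extra one, or a single wrong decision fails the check.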
Figure: Examples of synthetic environments created for training Holo3.
In our performance results, Holo3 outperforms its competitors on single-application benchmarks. The gap between Holo3 and the base Qwen3.5 models underscores the value of our agentic learning approach. Holo3 achieves higher success rates than models with considerably more parameters while maintaining equivalent localization and grounding performance.
Holistic Progress Towards Universal Agency
Although Holo3 marks a significant milestone, it is merely the beginning of a much larger journey. By crafting a system that can see, reason, and act within our clients’ digital platforms, we are bringing the vision of the Autonomous Enterprise closer to reality.
As our Synthetic Environment Factory continues to evolve, our agents are learning to tackle progressively more complex tasks. While Holo3 excels at existing interfaces, we are already laying the groundwork for the next frontier: Adaptive Agency. This future phase aims to enable our models to autonomously learn and navigate entirely new, bespoke enterprise software in real time.
With innovations like Holo3, we are not just reshaping how businesses operate; we are pioneering a future where automation and AI harmoniously integrate to drive productivity and efficiency.


