Google Cloud’s Hierarchical Namespace: A Game Changer for AI and Machine Learning Workloads
On March 17, 2025, Google Cloud made a significant announcement that could revolutionize the way businesses handle their data in the cloud. The introduction of the Hierarchical Namespace (HNS) feature in Cloud Storage is specifically designed to optimize artificial intelligence (AI) and machine learning (ML) workloads. This new feature aims to enhance data organization, performance, and reliability, addressing some of the longstanding challenges faced by data scientists and engineers.
Streamlining Checkpointing with Hierarchical Namespace
In AI and ML processes, particularly during model training, checkpointing plays a critical role. Checkpointing involves saving the model’s state at various intervals to ensure that progress is not lost in case of interruptions. Flat namespace buckets handle this poorly: folders there are only simulated by shared object-name prefixes, so renaming a folder means copying and then deleting every object under it individually. That process is slow and not atomic, so an interruption partway through can leave a checkpoint half-moved.
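To make the checkpoint pattern concrete, here is a minimal sketch in Python that runs against a local directory. It assumes a common write-then-rename convention: shards are written into a staging folder, and a single folder rename publishes the checkpoint. The function name save_checkpoint, the ".tmp" suffix, and the JSON shard format are illustrative choices, not part of any Google API.

```python
import json
import os
import tempfile


def save_checkpoint(root: str, step: int, state: dict) -> str:
    """Write a checkpoint into a staging folder, then publish it with one rename."""
    final_dir = os.path.join(root, f"step_{step:08d}")
    staging_dir = final_dir + ".tmp"  # hypothetical naming convention
    os.makedirs(staging_dir, exist_ok=True)

    # Write every shard into the staging folder first, so readers never
    # see a partially written checkpoint under the final name.
    for shard_name, shard in state.items():
        with open(os.path.join(staging_dir, f"{shard_name}.json"), "w") as f:
            json.dump(shard, f)

    # Publish: one folder rename makes the whole checkpoint visible.
    os.rename(staging_dir, final_dir)
    return final_dir


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        path = save_checkpoint(
            root, 100, {"weights": [0.1, 0.2], "optimizer": {"lr": 0.001}}
        )
        print(os.path.basename(path), sorted(os.listdir(path)))
```

The final rename is the step that matters here: on a local filesystem it is a single operation, whereas a store without real folders has to emulate it object by object.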
With the introduction of HNS, Google Cloud Storage now supports atomic folder-level operations that make checkpointing faster and more reliable. The new RenameFolder API performs a rename as a metadata-only operation, so no object data is copied and the rename completes significantly quicker than in flat namespace buckets. According to Google’s benchmarks, HNS can accelerate checkpoint writes by up to 20 times compared with flat namespace buckets.
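The difference between the two rename strategies can be illustrated with a toy in-memory model. This is only a sketch of the idea, not how Cloud Storage is implemented: the classes FlatBucket and HierarchicalBucket, and the way operations are counted, are assumptions made for illustration.

```python
class FlatBucket:
    """Toy flat namespace: object names are plain strings containing '/'."""

    def __init__(self):
        self.objects = {}  # full object name -> bytes
        self.ops = 0       # number of per-object operations performed

    def rename_prefix(self, src: str, dst: str) -> None:
        # No real folders exist, so each matching object is copied, then deleted.
        for name in [n for n in self.objects if n.startswith(src)]:
            self.objects[dst + name[len(src):]] = self.objects[name]  # copy
            del self.objects[name]                                     # delete
            self.ops += 2


class HierarchicalBucket:
    """Toy hierarchical namespace: folders are real entries that own their objects."""

    def __init__(self):
        self.folders = {}  # folder name -> {object name -> bytes}
        self.ops = 0

    def rename_folder(self, src: str, dst: str) -> None:
        # Metadata-only: re-point one folder entry; object data is untouched.
        self.folders[dst] = self.folders.pop(src)
        self.ops += 1


if __name__ == "__main__":
    shards = {f"shard_{i:04d}": b"..." for i in range(1000)}

    flat = FlatBucket()
    flat.objects = {f"ckpt.tmp/{k}": v for k, v in shards.items()}
    flat.rename_prefix("ckpt.tmp/", "ckpt/")

    hns = HierarchicalBucket()
    hns.folders["ckpt.tmp"] = dict(shards)
    hns.rename_folder("ckpt.tmp", "ckpt")

    print("flat ops:", flat.ops)         # 2000 (one copy + one delete per shard)
    print("hierarchical ops:", hns.ops)  # 1
```

In this model the flat rename cost grows with the number of objects in the folder, while the hierarchical rename is a single step regardless of folder size, which is also why it can be atomic.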
Real-World Benefits of HNS
Companies already using the Hierarchical Namespace feature have reported impressive results. For instance, AssemblyAI, a leading provider of AI-driven speech recognition, noted a remarkable 10x increase in throughput when utilizing HNS with Cloud Storage FUSE. This improvement translated into a staggering 15x enhancement in training speed, demonstrating the tangible benefits of adopting this innovative feature.
Enhanced Performance for AI/ML Workloads
The advantages of HNS extend beyond just checkpointing. It also optimizes the overall storage layout, allowing for higher queries per second (QPS) for both read and write operations. This optimization is particularly crucial for AI and ML workloads that run on large clusters. In scenarios where synchronized I/O operations can lead to bottlenecks, hierarchical namespace buckets offer up to 8 times higher initial object read and write QPS compared to their flat counterparts. This facilitates quicker ramp-up times and better utilization of compute resources, which is vital for maximizing efficiency in complex machine learning tasks.
Insights from Google’s Engineering Team
Jason Stevens, a Senior Director of Engineering at Google, emphasized the transformative impact of the Hierarchical Namespace feature. He stated, “Google Cloud Storage (GCS) Hierarchical Namespace (HNS) accelerates storage workloads that rely on filesystem semantics—like folder renames—boosting efficiency for AI workloads. With up to 20x faster checkpointing and 8x higher QPS, HNS helps maximize GPU and TPU utilization for AI/ML pipelines.” This endorsement underscores the strategic importance of adopting HNS for organizations invested in AI and ML.
Enabling Hierarchical Namespace in Google Cloud Storage
To take advantage of the Hierarchical Namespace feature, users must configure it at the time of bucket creation, as it cannot be retroactively enabled on existing buckets. For those familiar with the command line, this can be accomplished using the gcloud CLI. The command gcloud storage buckets create gs://BUCKET_NAME --location=LOCATION --uniform-bucket-level-access --enable-hierarchical-namespace creates a new bucket with HNS enabled, where BUCKET_NAME and LOCATION stand for the desired bucket name and location. Uniform bucket-level access must be enabled on hierarchical namespace buckets, which is why that flag is included.
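For teams that script bucket creation, the same gcloud invocation can be assembled from Python. The sketch below only builds and prints the command by default; the helper name create_hns_bucket, the dry_run parameter, and the example bucket name and location are hypothetical. Actually running it assumes an installed and authenticated gcloud CLI.

```python
import shlex
import subprocess


def create_hns_bucket(bucket_name: str, location: str, dry_run: bool = True) -> list:
    """Build (and optionally run) the gcloud command for an HNS-enabled bucket."""
    cmd = [
        "gcloud", "storage", "buckets", "create", f"gs://{bucket_name}",
        f"--location={location}",
        "--uniform-bucket-level-access",    # required for hierarchical namespace
        "--enable-hierarchical-namespace",  # can only be set at creation time
    ]
    if not dry_run:
        # Raises CalledProcessError if gcloud exits with a non-zero status.
        subprocess.run(cmd, check=True)
    return cmd


if __name__ == "__main__":
    # Placeholder values; replace with a real bucket name and location.
    command = create_hns_bucket("my-training-checkpoints", "us-central1")
    print(shlex.join(command))
```

Keeping the default as a dry run makes it easy to review the exact command before creating anything, which matters because the hierarchical namespace setting cannot be changed afterward.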
Alternatively, for those who prefer a graphical interface, the process is straightforward in the Google Cloud Console. Users simply navigate to the Cloud Storage section, select “Create bucket,” and in the Advanced settings, check the option to enable the hierarchical namespace before completing the setup process. Once activated, the bucket becomes optimized for AI and ML use cases, supporting filesystem-like folder structures, atomic renames, and enhanced throughput for both read and write operations.
Conclusion
The introduction of Google Cloud’s Hierarchical Namespace marks a pivotal shift for organizations leveraging AI and machine learning. By significantly improving the efficiency of data operations and optimizing performance, HNS stands to empower data scientists and engineers to work more effectively. As businesses increasingly rely on AI and ML technologies, adopting innovative solutions like HNS is essential for staying competitive in the evolving digital landscape.