Google’s Breakthrough Interoperability Features for Apache Iceberg in BigQuery
At the recent Apache Iceberg Summit, Google announced new interoperability features for Apache Iceberg in BigQuery. The centerpiece is a serverless Iceberg REST catalog: teams can now create, update, and query Apache Iceberg tables from BigQuery and from other engines such as Spark, Flink, and Trino, all without duplicating their data.
Seamless Data Management
A key benefit of the new preview is that multiple tools can access the same datasets without data copying or proprietary formats. That flexibility matters to companies aiming for efficiency and agility. As Yuriy Zhovtobryukh, a senior product manager at Google, emphasizes, “If you’re building a lakehouse today, you’re probably using Apache Iceberg.” The table format has gained traction among teams that need several compute engines to work on the same data for different workloads.
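To make the "one copy of the data, many engines" idea concrete, here is a minimal toy sketch of the role a catalog plays: it holds a single pointer to each table's current metadata, and writers advance that pointer with a compare-and-swap. This is an illustration only, not the Iceberg REST catalog API or anything BigQuery-specific; `ToyCatalog`, `CommitConflict`, the table name, and the metadata paths are all hypothetical.

```python
from dataclasses import dataclass, field


class CommitConflict(Exception):
    """Raised when a writer's view of the table is stale (hypothetical)."""


@dataclass
class ToyCatalog:
    # Maps table name -> location of the table's current metadata file.
    pointers: dict = field(default_factory=dict)

    def load(self, table: str) -> str:
        """Return the metadata location every engine should read from."""
        return self.pointers[table]

    def commit(self, table: str, expected: str, new: str) -> None:
        """Move the pointer only if it still equals what the writer read."""
        if self.pointers.get(table) != expected:
            raise CommitConflict(f"{table} changed since it was read")
        self.pointers[table] = new


catalog = ToyCatalog({"sales.orders": "metadata/v1.json"})

# Two hypothetical engines read the same table state; no data is copied.
seen_by_engine_a = catalog.load("sales.orders")
seen_by_engine_b = catalog.load("sales.orders")

# Engine A commits first and advances the shared pointer.
catalog.commit("sales.orders", seen_by_engine_a, "metadata/v2.json")

# Engine B's commit is based on a stale pointer, so it is rejected;
# a real engine would reload the table state and retry.
try:
    catalog.commit("sales.orders", seen_by_engine_b, "metadata/v3.json")
    engine_b_committed = True
except CommitConflict:
    engine_b_committed = False
```

Because every engine resolves the table through the same pointer, they all see the same files, and conflicting writes are detected at commit time rather than silently overwriting each other.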
Cross-Cloud Lakehouse Support
Data lakes appear to be heading toward a more integrated, multi-cloud ecosystem. At Next ’26, Google expanded Iceberg interoperability into a cross-cloud lakehouse framework that allows querying of Iceberg catalogs across AWS, Azure, Databricks, and Snowflake. As Angela Soares, senior product marketing manager at Google, explains, the goal is to keep data in open formats while letting different processing and analytics tools work on the same datasets.
Reducing Operational Complexity
Despite Apache Iceberg’s growing popularity, many organizations still struggle with high costs and operational complexity, especially around streaming data and replication pipelines. Google aims to reduce that burden by extending its BigQuery infrastructure to provide managed services covering metadata support, automatic table maintenance, and transactions. Zhovtobryukh points out that customers previously had to choose between Iceberg tables in the Google-managed Iceberg REST catalog and those managed by BigQuery, depending on their primary ETL engine.
Enhanced Access Control
In addition to improved data management, the preview adds centralized table access controls, which let organizations manage permissions consistently across query engines. Combined with the cross-cloud support, this means Google Cloud can now query Iceberg data not only within its own ecosystem but also across AWS and Azure, and can interoperate with external platforms such as Databricks and Snowflake.
Integration with AI and Unstructured Data
BigQuery ObjectRefs are now generally available, enabling teams to combine structured Iceberg data with unstructured files stored in Cloud Storage. This combination suits multimodal analysis and AI workflows. In addition, the Knowledge Catalog, previously known as Dataplex, serves as a governance layer that handles metadata management, lineage tracking, and access controls across systems.
Addressing Adoption Challenges
Industry practitioners are optimistic that these advancements will alleviate some of the “hidden taxes” associated with adopting Iceberg. According to David Colbert, overcoming challenges related to compaction, metadata management, and orchestration is essential. He emphasizes the importance of the catalog as a critical component. “Open formats solve storage portability, but control plane choices determine long-term optionality,” he notes.
The Expanding Role of Google Cloud
Experts observe that Google is making a strategic bet that enterprise AI value will come not just from storage but from the intelligence layered over data. According to Precious Pendo, this vision differentiates Google Cloud from competitors like AWS and Azure, which largely charge based on compute and storage usage.
Industry-Wide Adoption of Apache Iceberg
Google isn’t alone in its focus on Iceberg workloads; AWS, for example, offers native support through analytics services such as EMR, Glue, Athena, and Redshift. Shashank Muthuraj of Red Oak Strategic notes that Apache Iceberg has moved from a Netflix engineering project to a standard for open data lakehouse architecture within just a few years. He attributes its rapid adoption to features such as ACID transactions, hidden partitioning, and time travel, along with broad industry alignment.
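Two of the features named above, hidden partitioning and time travel, can be illustrated with a small toy model. This is a sketch under simplifying assumptions, not how Iceberg is implemented: real Iceberg tracks partition values and snapshots in metadata and manifest files rather than copying rows, and `ToyTable` and `day_transform` are hypothetical names.

```python
from collections import defaultdict
from datetime import datetime


def day_transform(ts: datetime) -> str:
    """Derive the partition value from a timestamp column.

    Writers and readers only deal with `ts`; the partition is computed
    for them, which is the idea behind hidden partitioning.
    """
    return ts.date().isoformat()


class ToyTable:
    def __init__(self):
        # Each snapshot is an immutable tuple holding the table's full contents.
        self.snapshots = []

    def append(self, rows):
        """Add rows and return the id of the snapshot this creates."""
        previous = self.snapshots[-1] if self.snapshots else ()
        self.snapshots.append(previous + tuple(rows))
        return len(self.snapshots) - 1

    def scan(self, snapshot_id=None):
        """Read the latest snapshot, or an older one for time travel."""
        if snapshot_id is None:
            snapshot_id = len(self.snapshots) - 1
        return self.snapshots[snapshot_id]

    def partitions(self, snapshot_id=None):
        """Group rows by the derived day partition."""
        groups = defaultdict(list)
        for row in self.scan(snapshot_id):
            groups[day_transform(row["ts"])].append(row)
        return dict(groups)


table = ToyTable()
first = table.append([{"id": 1, "ts": datetime(2024, 1, 1, 9, 30)}])
second = table.append([
    {"id": 2, "ts": datetime(2024, 1, 1, 17, 0)},
    {"id": 3, "ts": datetime(2024, 1, 2, 8, 0)},
])

current_rows = table.scan()          # latest state of the table
rows_at_first = table.scan(first)    # time travel to the first snapshot
current_partitions = table.partitions()
```

Here the caller never supplies a partition column, yet rows end up grouped by day, and the earlier snapshot remains readable after later writes.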
Future Outlook
While the core managed Iceberg table support in BigQuery has become generally available, the expanded interoperability and REST catalog capabilities announced at the Iceberg Summit are still in preview. As they mature, they could shape how organizations work with their data across cloud environments.
By fostering these advancements, Google is paving the way for a more interconnected and manageable data landscape, making it easier for teams to harness the power of their datasets while keeping complexity and costs in check.