The Evolution of Google’s Codebase: Migrating from 32-bit to 64-bit IDs with AI Assistance
As data volumes grow, the systems behind large platforms such as Google Ads have to grow with them. One recent example is the migration of integer IDs in Google Ads from 32 bits to 64 bits. The change is more than a routine upgrade: the number of IDs has grown fast enough that the original type was at risk of running out of values. This article covers the challenges of the migration, the AI tools used to carry it out, and the effect on software engineering workflows.
The Challenge of Overflowing IDs
Google Ads has long relied on a variety of unique numerical IDs that serve as handles for users, merchants, campaigns, and more. These IDs were originally defined as 32-bit integers, and rapid growth in the number of IDs raised concerns that they would reach the overflow limit. An overflow could cause data integrity issues and malfunctioning systems if not addressed in time.
The prospect of overflow prompted a comprehensive effort to move these IDs to 64-bit integers. The task is large: the IDs are used in tens of thousands of locations across thousands of code files.
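To make the overflow limit concrete, here is a minimal Python sketch of the arithmetic. The helper names `wrap_int32` and `fits_int32` are illustrative and not part of any Google tooling. The wraparound shown is the two's-complement behavior of a fixed-width signed field, as with Java's `int`.

```python
INT32_MAX = 2**31 - 1   # 2,147,483,647: largest signed 32-bit value
INT64_MAX = 2**63 - 1   # 9,223,372,036,854,775,807: largest signed 64-bit value


def wrap_int32(value: int) -> int:
    """Return the value a signed 32-bit two's-complement field would hold."""
    return (value + 2**31) % 2**32 - 2**31


def fits_int32(value: int) -> bool:
    """Return True if value can be stored in a signed 32-bit field without loss."""
    return -2**31 <= value <= INT32_MAX


# Hypothetical: the first ID allocated past the 32-bit limit.
next_id = INT32_MAX + 1
print(fits_int32(next_id))    # False
print(wrap_int32(next_id))    # -2147483648
print(next_id <= INT64_MAX)   # True
```

A 64-bit field has room for roughly four billion times as many positive values, which is why widening the type removes the near-term overflow risk.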
Complexities of the Migration Process
The challenges associated with this migration go beyond mere code changes. Here are some key complexities involved:
- Widespread Usage: The IDs are scattered across numerous files, which complicates tracking changes and keeping the codebase consistent.
- Team Coordination: With multiple teams involved, the migration could lead to communication bottlenecks and misalignment if each team handled it independently.
- Generic ID Definitions: Many IDs are defined as generic number types (like int32_t in C++ or Integer in Java), making it difficult to locate them using static tools.
- Class Interface Changes: Alterations in class interfaces necessitate updates across multiple files, which increases the workload.
- Testing Requirements: After migration, extensive tests must be run to confirm that the new 64-bit IDs work correctly and introduce no bugs.
Given these complexities, a manual migration would have required many software engineering years, which was not feasible.
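The difficulty with generic ID definitions can be illustrated with a small Python sketch. It assumes a made-up C++ fragment and a naive name-based heuristic; neither comes from Google's codebase or tooling. A text search finds every `int32_t` field, but nothing in the type says which ones are IDs.

```python
import re

# Hypothetical C++ fragment: two fields hold IDs, two are ordinary counters.
CPP_SOURCE = """
struct Campaign {
  int32_t campaign_id;
  int32_t owner;        // holds a customer ID, but the name does not say so
  int32_t click_count;
  int32_t retries;
};
"""

# Matches simple field declarations of the form `int32_t name;`.
DECLARATION = re.compile(r"\bint32_t\s+(\w+)\s*;")


def int32_fields(source: str) -> list[str]:
    """Return the names of all int32_t fields declared in the source text."""
    return DECLARATION.findall(source)


def looks_like_id(name: str) -> bool:
    """Naive heuristic: treat a field as an ID only if its name says so."""
    return name == "id" or name.endswith("_id")


fields = int32_fields(CPP_SOURCE)
flagged = [name for name in fields if looks_like_id(name)]
print(fields)   # ['campaign_id', 'owner', 'click_count', 'retries']
print(flagged)  # ['campaign_id']
```

The type-based search returns too much (counters as well as IDs), while the name-based filter misses `owner`. Closing that gap requires understanding how each value is used, which is the part that is hard to automate with static tools alone.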
Embracing AI for Efficient Migration
To tackle this significant challenge, Google turned to AI migration tools, which streamlined the process and reduced the overall workload. Here’s how the AI-driven workflow operates:
- Identifying the Migration Scope: An experienced engineer selects the ID to migrate and, with the help of tools like Code Search and Kythe, pinpoints a focused set of files and locations that require changes.
- Autonomous Migration Toolkit: The migration toolkit runs autonomously and produces verified changes that contain only code that passes unit tests. Some tests are also updated to match the new ID type.
- Review and Refinement: After the initial migration, the engineer reviews the changes and corrects any errors made by the AI. The changes are then split up and sent to the reviewers responsible for the affected parts of the code.
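The last two steps above can be sketched in Python as a filter followed by a grouping. This is a simplified illustration under stated assumptions, not the toolkit's actual design: `ProposedEdit`, `keep_verified`, `split_by_owner`, the file paths, and the stubbed test runner are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class ProposedEdit:
    path: str    # file the edit touches (hypothetical path)
    owner: str   # team that reviews this file (hypothetical field)
    patch: str   # short description of the change


def keep_verified(
    edits: list[ProposedEdit],
    passes_tests: Callable[[ProposedEdit], bool],
) -> list[ProposedEdit]:
    """Keep only the edits whose unit tests pass."""
    return [edit for edit in edits if passes_tests(edit)]


def split_by_owner(edits: list[ProposedEdit]) -> dict[str, list[ProposedEdit]]:
    """Group verified edits so each owning team reviews its own files."""
    groups: dict[str, list[ProposedEdit]] = defaultdict(list)
    for edit in edits:
        groups[edit.owner].append(edit)
    return dict(groups)


edits = [
    ProposedEdit("ads/campaign.h", "campaigns", "int32_t campaign_id -> int64_t"),
    ProposedEdit("ads/campaign_test.cc", "campaigns", "update expected ID type"),
    ProposedEdit("billing/invoice.cc", "billing", "int32_t campaign_id -> int64_t"),
]


def fake_test_run(edit: ProposedEdit) -> bool:
    """Stand-in for a real test runner: pretend one edit breaks its tests."""
    return edit.path != "billing/invoice.cc"


verified = keep_verified(edits, fake_test_run)
batches = split_by_owner(verified)
print(sorted(batches))            # ['campaigns']
print(len(batches["campaigns"]))  # 2
```

In this sketch the failing edit is simply dropped from the verified set; in the workflow described above, such cases are what the engineer reviews and fixes by hand.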
Maintaining Privacy and Security
Throughout this process, Google ensures that privacy protections remain intact. The IDs used in the internal codebase are safeguarded, and even as the migration occurs, the AI model does not alter or expose these sensitive identifiers. This commitment to privacy is crucial in maintaining user trust and adhering to data protection regulations.
Impact and Results of the AI Migration
The results of employing AI in the migration have been impressive. Reports indicate that approximately 80% of the code modifications were authored by AI, and that engineering time spent on the project fell by an estimated 50%. Besides speeding up the migration, this reduces communication overhead, because a single engineer can oversee the entire process.
Moreover, the AI model demonstrated remarkable accuracy in predicting the need for file edits, achieving a 91% success rate in Java files. The toolkit has already facilitated the creation of hundreds of change lists in various migration efforts. On average, over 75% of AI-generated character changes successfully made their way into the monorepo, highlighting the reliability of the AI-assisted migration approach.
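As a worked example of the last figure, the sketch below assumes the metric is the number of AI-generated characters that landed divided by the number generated, summed over change lists. The source does not spell out the exact definition, and the counts used here are invented purely for illustration.

```python
def landed_share(generated_chars: int, landed_chars: int) -> float:
    """Fraction of AI-generated characters that ended up in submitted code."""
    if generated_chars <= 0:
        raise ValueError("generated_chars must be positive")
    if not 0 <= landed_chars <= generated_chars:
        raise ValueError("landed_chars must be between 0 and generated_chars")
    return landed_chars / generated_chars


# Made-up (generated, landed) character counts for three change lists.
change_lists = [(1000, 800), (500, 500), (500, 100)]

total_generated = sum(generated for generated, _ in change_lists)
total_landed = sum(landed for _, landed in change_lists)

share = landed_share(total_generated, total_landed)
print(f"{share:.1%}")  # 70.0%
```

Summing characters before dividing weights large change lists more heavily than small ones. Averaging the per-change-list ratios instead would give a different number for the same data.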
By leveraging advanced AI tools, Google is not just addressing immediate technical needs but is also setting a precedent for future migrations and codebase evolution. As technology continues to advance, the integration of AI in software engineering processes will likely become a standard practice, optimizing workflows and enhancing productivity.

