How Monzo Revolutionized Its Data Warehouse: A Deep Dive into Their Innovative “Meshy” Approach
Monzo, a UK digital bank, has redesigned its data warehouse to support more than 100 independent teams working on over 12,000 dbt models. The company reports early improvements in both cost and data delivery speed in the parts of the warehouse that have migrated so far.
The “Meshy” Data Architecture
Monzo introduced a “meshy” approach to its data architecture, which it credits with roughly a 40% reduction in warehouse costs and roughly 25% faster data landing times in some domains. The approach reorganizes data models into defined layers, giving the warehouse a clearer structure. Teams work on a shared platform, but each group owns and maintains its own data models, so ownership is distributed rather than centralized.
Empowered Teams and Cross-Functional Collaboration
Empowered and independent teams play a critical role in Monzo’s data strategy. Each team is responsible for its own data models, supported by automated guardrails and shared tooling that ensure data integrity. Antonia Badarau, Irina Mugford, and Massimo Frangiamore, analytics engineers at Monzo, shed light on the complexities that come with such distributed ownership. They emphasize the need for performance, consistency, and quality in a landscape where AI-assisted coding is becoming commonplace:
“At Monzo, over 100 independent, empowered teams contribute to our data warehouse of 12,000+ dbt models. The health of data is owned across all these teams. That kind of distributed ownership is powerful, but it’s also hard to get right at scale.”
The Structure of Monzo’s Data Models
The bank’s data models are categorized into four distinct layers, each serving a unique purpose:
- Automated Landing Models: These models flatten raw events and ensure quick access to raw data.
- Generated Normalized Models: They represent entities with complete historical context.
- Logical Models: These integrate business logic by combining different entities.
- Presentation Models: Tailored for specific downstream uses, these models ensure that data is delivered in the most relevant format.
By structuring the models into these layers, Monzo enhances the efficiency of data sharing and access across its teams.
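The layering can be sketched in a few lines of Python. This is only an illustration: the model names are hypothetical, and the rule that a model may read only from its own layer or an earlier one is an assumption, not something Monzo has stated.

```python
from enum import IntEnum


class Layer(IntEnum):
    """The four layers named above, ordered from raw to consumer-facing."""
    LANDING = 1
    NORMALIZED = 2
    LOGICAL = 3
    PRESENTATION = 4


def invalid_dependencies(model_layers, dependencies):
    """Return (model, upstream) pairs where a model reads from a later layer.

    Assumption: data flows from landing toward presentation, so a model
    may depend on its own layer or on an earlier one.
    """
    violations = []
    for model, upstreams in dependencies.items():
        for upstream in upstreams:
            if model_layers[upstream] > model_layers[model]:
                violations.append((model, upstream))
    return violations


# Hypothetical model names, for illustration only.
model_layers = {
    "landing_card_events": Layer.LANDING,
    "normalized_card": Layer.NORMALIZED,
    "logical_card_spend": Layer.LOGICAL,
    "presentation_spend_dashboard": Layer.PRESENTATION,
}
dependencies = {
    "normalized_card": ["landing_card_events"],
    "logical_card_spend": ["normalized_card"],
    "presentation_spend_dashboard": ["logical_card_spend"],
    "landing_card_events": ["logical_card_spend"],  # deliberately wrong direction
}

violations = invalid_dependencies(model_layers, dependencies)
print(violations)
```

Only the last entry, where a landing model reads from a logical model, is reported as a violation.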
Ensuring Consistency and Quality with Modelgen
To maintain consistency, Monzo employs a command-line tool known as Modelgen, which generates SQL and YAML models from object definitions. This level of automation is complemented by continuous integration (CI) checks that validate structure, naming conventions, and data access patterns.
“Scaling data in any fast-growing organization isn’t easy, never mind a bank,” says Luke Briscoe, Engineering Director at Monzo Bank. “I’m not aware of many companies that run tooling like this (or at least that publicly talk about it!).”
Additionally, Mateusz Ulas, founder of Expeditious Software, emphasizes the importance of treating data interfaces as first-class code, highlighting that most organizations still rely on documentation alone and hope for the best. Monzo’s approach of embedding standards into CI is what sets it apart.
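To make the idea of generating SQL and YAML from object definitions concrete, here is a minimal Python sketch. It is not Modelgen: the article does not describe Modelgen's input format or output, so the `ModelDefinition` and `Column` classes, their field names, and the rendered layout are all assumptions chosen for illustration. The YAML follows the general shape of a dbt properties file.

```python
from dataclasses import dataclass, field


@dataclass
class Column:
    name: str
    description: str


@dataclass
class ModelDefinition:
    # Hypothetical object definition; the real Modelgen format is not public here.
    name: str
    upstream: str  # name of the model this one selects from
    owner: str
    columns: list = field(default_factory=list)


def render_sql(model):
    """Render a simple dbt-style select over the upstream model."""
    column_list = ",\n    ".join(c.name for c in model.columns)
    # Quadruple braces in an f-string produce the literal {{ }} used by dbt's ref().
    return f"select\n    {column_list}\nfrom {{{{ ref('{model.upstream}') }}}}\n"


def render_yaml(model):
    """Render a matching properties file as plain text (no YAML library needed)."""
    lines = [
        "version: 2",
        "models:",
        f"  - name: {model.name}",
        "    meta:",
        f"      owner: {model.owner}",
        "    columns:",
    ]
    for c in model.columns:
        lines.append(f"      - name: {c.name}")
        lines.append(f'        description: "{c.description}"')
    return "\n".join(lines) + "\n"


# Hypothetical definition, for illustration only.
definition = ModelDefinition(
    name="normalized_card",
    upstream="landing_card_events",
    owner="cards-team",
    columns=[
        Column("card_id", "Unique identifier of the card"),
        Column("status", "Current card status"),
    ],
)

sql = render_sql(definition)
yaml_text = render_yaml(definition)
print(sql)
print(yaml_text)
```

Generating both files from one definition keeps the SQL and its metadata from drifting apart, which is the kind of consistency the tooling described above is meant to provide.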
Creating a Cohesive Data Environment
The clear layering of data models, combined with stable interfaces between datasets and automated CI checks, creates an organized environment. This structure allows teams to operate independently while effectively reducing costs and improving processing times.
Monzo also mandates quality checks for every model. Each model must:
- Define a unique key.
- Include freshness tests.
- Run incrementally by default.
- Declare an owning team.
- Provide definitive documentation.
- Follow strict naming and metadata conventions validated in CI.
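A CI check enforcing the requirements above could look roughly like the following Python sketch. The metadata keys (`unique_key`, `freshness_tests`, `materialized`, `incremental_exemption`, `owner`, `description`) and the layer-prefix naming pattern are hypothetical; the article does not say how Monzo encodes these rules.

```python
import re

# Hypothetical naming convention: a layer prefix followed by snake_case.
NAME_PATTERN = re.compile(r"^(landing|normalized|logical|presentation)_[a-z][a-z0-9_]*$")


def check_model(model):
    """Return a list of problems found in one model's metadata (empty if it passes)."""
    problems = []
    if not NAME_PATTERN.match(model.get("name", "")):
        problems.append("name does not follow the naming convention")
    if not model.get("unique_key"):
        problems.append("no unique key defined")
    if not model.get("freshness_tests"):
        problems.append("no freshness tests")
    # Incremental is the default; anything else needs a recorded exemption (assumed rule).
    materialized = model.get("materialized", "incremental")
    if materialized != "incremental" and not model.get("incremental_exemption"):
        problems.append("not incremental and no exemption recorded")
    if not model.get("owner"):
        problems.append("no owning team declared")
    if not model.get("description", "").strip():
        problems.append("no documentation")
    return problems


# Hypothetical model metadata, for illustration only.
good_model = {
    "name": "logical_card_spend",
    "unique_key": "card_spend_id",
    "freshness_tests": ["updated_within_24h"],
    "owner": "cards-team",
    "description": "Daily spend per card.",
}
bad_model = {"name": "CardSpend", "materialized": "table", "owner": ""}

good_problems = check_model(good_model)
bad_problems = check_model(bad_model)
print(good_problems)
print(bad_problems)
```

In a CI pipeline, a non-empty problem list for any changed model would fail the build, so the standard is enforced at review time rather than left to documentation.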
Positive Early Results and Future Aspirations
According to Badarau, Mugford, and Frangiamore, Monzo is about 30% of the way through a company-wide migration to the new architecture. Early results are promising, with roughly 40% lower costs and roughly 25% faster data landing times in some domains:
“We’ve seen ~40% cost reduction and ~25% faster landing times in some domains – but it’s early days still.”
In addition to its data management advancements, Monzo plans to further enhance its processes. For example, the engineering team is exploring the use of multi-task neural networks to detect fraud patterns, developing capabilities to identify rare and previously unseen behaviors beyond the reach of conventional models. Moreover, during this year’s QCon London event, Suhail Patel highlighted how Monzo has engineered a developer platform capable of executing hundreds of production changes daily.