Transforming Karrot’s Recommendation System: A Cloud-Powered Evolution
Karrot, a prominent platform fostering local communities in Korea, has unveiled significant enhancements to its recommendation system, aimed at providing users with personalized content on their home screens. The work replaced a legacy system with a scalable architecture built on several AWS services. The decision was driven by tight coupling, limited scalability, and reliability problems in the previous solution.
The Evolution of Karrot’s Recommendation System
Challenges with the Legacy System
The initial setup of Karrot’s recommendation system was closely intertwined with its flea market web application, which resulted in hard-coded, feature-specific components. While it utilized scalable data services like Amazon Aurora, Amazon ElastiCache, and Amazon S3, the fragmented approach to data storage and ingestion created inconsistencies. This hindered the introduction of new content types, such as local community posts, job listings, and advertisements.
The lack of a unified, flexible feature store became evident as engineers began to notice data quality issues and the complications arising from fragmented feature storage. Hyeonho Kim, Jinhyeong Seo, and Minjae Kwon from Karrot emphasized the critical role that high-quality input data, or "features," play in machine learning systems. They recognized the necessity for a comprehensive system to manage diverse data types efficiently and feed them into the recommendation models.
Implementing a New Feature Platform Architecture
Setting Ambitious Goals
With an eye towards future growth and product development, Karrot’s technical team embarked on creating a new feature platform. This required establishing technical specifications that addressed serving and ingestion traffic, total data volume, and maximum record sizes.
Three Key Architectural Components
The new architecture was built around three primary components: a feature serving layer, a stream ingestion pipeline, and a batch ingestion pipeline.
Feature Serving Layer:
- This layer was crucial for delivering the latest feature data to Karrot’s recommendation engine. The engineering team devised a multi-level caching strategy and dedicated serving methods tailored to the characteristics of the features.
- Small, frequently accessed datasets were served from in-memory caches on Amazon EKS pods, whereas medium-sized datasets were sourced from Amazon ElastiCache. Infrequently accessed large records were obtained directly from DynamoDB tables, unified under a common schema.
- To handle dynamically computed features or those constrained by compliance issues, a dedicated On-Demand Feature Server EKS service was implemented.
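The tiered read path described in this layer can be illustrated with a minimal sketch. It is not Karrot's implementation: plain dictionaries stand in for the pod-local cache, Amazon ElastiCache, and DynamoDB, and the `FeatureReader` class, the `CACHE_TIERS` routing table, and the size-class names are all hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List, Optional


class Tier(Enum):
    LOCAL = "local"    # in-process memory on the serving pod
    REMOTE = "remote"  # stand-in for a shared cache such as ElastiCache
    STORE = "store"    # stand-in for the source-of-truth table


# Hypothetical routing policy: which cache tier each size class may use.
CACHE_TIERS: Dict[str, List[Tier]] = {
    "small": [Tier.LOCAL],
    "medium": [Tier.REMOTE],
    "large": [],  # no cache, read straight from the store
}


@dataclass
class FeatureReader:
    store: Dict[str, dict]
    local: Dict[str, dict] = field(default_factory=dict)
    remote: Dict[str, dict] = field(default_factory=dict)
    trace: List[Tier] = field(default_factory=list)  # tier that answered each call

    def _cache(self, tier: Tier) -> Dict[str, dict]:
        return self.local if tier is Tier.LOCAL else self.remote

    def get(self, key: str, size_class: str) -> Optional[dict]:
        tiers = CACHE_TIERS[size_class]
        # Try the caches allowed for this size class first.
        for tier in tiers:
            value = self._cache(tier).get(key)
            if value is not None:
                self.trace.append(tier)
                return value
        # Fall back to the store and populate the allowed caches on a hit.
        value = self.store.get(key)
        self.trace.append(Tier.STORE)
        if value is not None:
            for tier in tiers:
                self._cache(tier)[key] = value
        return value
```

Keeping the routing policy in one table means a feature group can be moved between tiers without touching the read path, which is one way to read "serving methods tailored to the characteristics of the features".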
Addressing Caching Challenges:
- To address common caching problems, the engineers adopted the Probabilistic Early Expirations (PEE) technique, which refreshes popular entries shortly before they expire instead of letting many requests miss at the same moment. This reduces cache stampedes and improves latency.
- The use of soft and hard TTLs, along with jitter and write-through caching, alleviated consistency challenges, while negative caching minimized unnecessary database queries.
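Several of these caching techniques fit in a short sketch. The early-refresh check below follows the commonly published probabilistic early expiration formula (often called XFetch); whether Karrot uses this exact form is an assumption. `FeatureCache`, `Entry`, `jittered_ttl`, and all default values are hypothetical, and soft/hard TTLs and write-through updates are not shown.

```python
import math
import random
import time
from dataclasses import dataclass
from typing import Any, Callable, Dict

_MISSING = object()  # sentinel cached for keys the loader could not find


@dataclass
class Entry:
    value: Any
    expires_at: float  # expiry time in seconds
    delta: float       # how long the last load took, in seconds


def should_refresh_early(entry: Entry, now: float, beta: float = 1.0,
                         rand: Callable[[], float] = random.random) -> bool:
    # The closer `now` is to expiry (and the slower the load), the more
    # likely a single caller refreshes before everyone else misses.
    u = 1.0 - rand()  # in (0, 1], so log(u) is defined and <= 0
    return now - entry.delta * beta * math.log(u) >= entry.expires_at


def jittered_ttl(base_ttl: float, spread: float = 0.1,
                 rand: Callable[[], float] = random.random) -> float:
    # Spread expiries so entries written together do not expire together.
    return base_ttl * (1.0 + spread * (2.0 * rand() - 1.0))


class FeatureCache:
    def __init__(self, loader: Callable[[str], Any], ttl: float = 60.0,
                 negative_ttl: float = 5.0,
                 clock: Callable[[], float] = time.time,
                 rand: Callable[[], float] = random.random) -> None:
        self.loader, self.ttl, self.negative_ttl = loader, ttl, negative_ttl
        self.clock, self.rand = clock, rand
        self.entries: Dict[str, Entry] = {}

    def get(self, key: str) -> Any:
        now = self.clock()
        entry = self.entries.get(key)
        if (entry is not None and now < entry.expires_at
                and not should_refresh_early(entry, now, rand=self.rand)):
            return None if entry.value is _MISSING else entry.value
        started = self.clock()
        value = self.loader(key)  # the loader returns None for unknown keys
        delta = self.clock() - started
        if value is None:
            # Negative caching: remember the miss for a short time.
            stored, ttl = _MISSING, self.negative_ttl
        else:
            stored, ttl = value, jittered_ttl(self.ttl, rand=self.rand)
        self.entries[key] = Entry(stored, self.clock() + ttl, delta)
        return value
```

The `clock` and `rand` parameters are injected only so the behavior can be checked deterministically; in this sketch a larger `beta` makes early refreshes more eager.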
Stream and Batch Ingestion Pipelines:
- Karrot’s overhaul included a new ingestion architecture to handle real-time events alongside batch processing. By keeping ETL logic and validation simple in the primary stream-processing path, the company was able to manage more complex use cases efficiently, such as content embeddings and features enriched by large language models (LLMs).
- The ingestion architecture utilized an event dispatcher and aggregator services on EKS, drawing events from Amazon MSK to effectively address M:N relationships between events and features.
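The M:N relationship between events and features can be sketched as a small in-memory dispatcher and aggregator. This is illustrative only: consuming from Amazon MSK is left out, and the event types, feature names, and the `Dispatcher`, `aggregate`, and `build_dispatcher` helpers are hypothetical.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Tuple

Event = Dict[str, str]
Update = Tuple[str, str, int]  # (feature name, entity key, increment)
Handler = Callable[[Event], Update]


class Dispatcher:
    """Routes each event to every feature handler registered for its type."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Handler]] = defaultdict(list)

    def register(self, event_type: str, handler: Handler) -> None:
        self._handlers[event_type].append(handler)

    def dispatch(self, event: Event) -> List[Update]:
        # Unknown event types produce no updates.
        return [h(event) for h in self._handlers.get(event["type"], [])]


def aggregate(updates: Iterable[Update]) -> Dict[Tuple[str, str], int]:
    """Sums increments per (feature, entity) pair."""
    totals: Dict[Tuple[str, str], int] = defaultdict(int)
    for feature, key, increment in updates:
        totals[(feature, key)] += increment
    return dict(totals)


def build_dispatcher() -> Dispatcher:
    d = Dispatcher()
    # One event type feeds several features ...
    d.register("click", lambda e: ("user_click_count", e["user"], 1))
    d.register("click", lambda e: ("article_engagement", e["article"], 1))
    # ... and one feature is fed by several event types.
    d.register("like", lambda e: ("article_engagement", e["article"], 1))
    return d


def process(events: Iterable[Event]) -> Dict[Tuple[str, str], int]:
    dispatcher = build_dispatcher()
    updates = [u for event in events for u in dispatcher.dispatch(event)]
    return aggregate(updates)
```

For example, a click by `u1` on `a1` followed by a like from `u2` on the same article yields one `user_click_count` for `u1` and an `article_engagement` of 2 for `a1`. Separating dispatch from aggregation lets new features be added by registering handlers rather than changing the consumer loop.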
Choosing the Right Tools
Initially, the team considered using Apache Airflow but opted for AWS Batch on AWS Fargate because of its simplicity and cost-effectiveness for batch ingestion. As the project progressed, team members identified areas for improvement in this setup, including limited monitoring capabilities and the absence of directed acyclic graph (DAG) support for parallel processing.
Success Metrics and Feature Management
Karrot’s investment in the new feature platform has yielded remarkable results. Post-implementation, the platform has led to a 30% increase in click-through rates and a 70% improvement in conversion rates for article recommendations. The platform seamlessly operates across more than ten different spaces and services, managing over a thousand features related to various content types.
In summary, Karrot’s transformation of its recommendation system illustrates the power of modern architecture powered by cloud services. By addressing the shortcomings of its legacy system and implementing a flexible, scalable platform, Karrot is poised to support its growing community engagement and deliver increasingly personalized experiences to its users.

