At the end of March, the PyTorch Korea User Group hosted a special meetup that brought together prominent speakers for deep discussions on the PyTorch core and its broader ecosystem. With the event more than doubling in size compared to past gatherings, we were able to connect with even more developers and share insights. Huge thanks to goorm for sponsoring the fantastic venue! 😄
This recap is for those who couldn’t attend in person, as well as for participants who want to revisit the energy and insights of the day. The event featured experts in core PyTorch, AI accelerators, inference optimization, and large language model development. Below is a quick overview of the key sessions that anchored the conference.
1️⃣ Jerry Lee | PyTorch Foundation
Representing the PyTorch Foundation, part of the Linux Foundation, Jerry Lee provided a comprehensive overview of how PyTorch is advancing core open-source technologies. He shared insights into PyTorch’s impressive growth trajectory, highlighting the many global projects currently in motion and the ecosystem’s robust annual growth rate of over 20%. This session also delved into the foundation’s operational dynamics, member involvement, and future plans that can greatly benefit practitioners across the community.
2️⃣ Alban Desmaison | PyTorch Roadmap
Alban Desmaison took the stage to discuss the design philosophy that underpins PyTorch and outline Meta’s official contribution roadmap. His session provided a technical deep dive into the distinctions between Eager and Compiled modes, with a particular focus on the backend architecture that supports Eager execution on devices. Alban introduced practical tools and enhancements like memory profilers, improved custom operator support, and pinned memory optimizations, all crucial for developers aiming to maximize performance.
3️⃣ Hongseok Kim | PyTorch on Rebellions AI Accelerators: Status
Hongseok Kim introduced us to Rebellions, a company working on runtime integration for their proprietary NPU architecture, fully aligned with the latest advancements in PyTorch 2.0. This talk highlighted the performance and scalability of their upcoming chip, their integration strategy with the PyTorch runtime, and the challenges involved in supporting Eager Mode. Hongseok also gave a sneak peek into their roadmap for releasing these features within the year, exciting news for developers eager to leverage new hardware capabilities.
4️⃣ Kyujin Cho | Backend.AI: A Unified Platform for All AI Accelerators
Kyujin Cho presented Backend.AI, which abstracts and integrates various AI accelerators into a cohesive workflow. With the landscape of accelerator architectures becoming increasingly diverse, the need for portability and infrastructure unification is more critical than ever. This session showcased Backend.AI’s capabilities across development and operational aspects, including NPU scheduling, resource allocation, and monitoring, emphasizing its support for accelerators from major players like NVIDIA, Intel, Tenstorrent, and Rebellions.
5️⃣ Taeho Kim | Optimizing & Deploying Models Across Multiple Chipsets Using NetsPresso
Taeho Kim addressed the challenges of inference in real-world industrial applications of AI models. As new state-of-the-art models continue to emerge, the necessity for environments that can quickly validate device compatibility becomes paramount. NetsPresso is actively developing a static graph representation compatible with PyTorch, aimed at providing efficient support for model development, optimization, and testing, thereby streamlining workflows for data scientists and engineers.
6️⃣ Jungyeop Lee | The Journey to Reproduce Deepseek-R1
Jungyeop Lee shared his journey of reproducing Deepseek-R1, a large language model, an effort that involved 201 experiments. His presentation included real-world lessons learned from training with Korean data, modifications to tokenizers, and fine-tuning strategies. Jungyeop’s practical insights were particularly valuable for anyone looking to build or re-implement large models from scratch, providing an honest look at the complexities and challenges faced during development.
7️⃣ Sol Kim | A Journey from TCP Architecture to Production-Level LLMs
Sol Kim offered an integrated optimization approach to deploying large models using the TCP (Tensor Contraction Processor) architecture, which supports tensor contraction at the hardware level. His talk highlighted various optimization techniques built on hardware abstraction layers (HALs) and bottom-up integration strategies with PyTorch, providing a hybrid hardware-software perspective that can significantly enhance the efficiency of large model deployments.
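For readers unfamiliar with the term, a tensor contraction sums over an index shared by two tensors, and matrix multiplication is its simplest case. The sketch below is illustrative only and is not taken from the talk: the function name `contract` is hypothetical, and it uses plain Python lists rather than any accelerator or TCP API.

```python
def contract(a, b):
    """Contract the last index of `a` with the first index of `b`.

    For 2-D inputs this is ordinary matrix multiplication:
    c[i][j] = sum over k of a[i][k] * b[k][j].
    `contract` is a hypothetical helper written for illustration.
    """
    inner = len(b)  # size of the shared (contracted) index k
    if any(len(row) != inner for row in a):
        raise ValueError("shared index sizes do not match")
    cols = len(b[0])
    return [
        [sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for row in a
    ]


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
print(contract(a, b))
```

In PyTorch, the same contraction over tensors can be written as `torch.einsum("ik,kj->ij", a, b)`; hardware that supports contraction natively aims to execute this kind of index summation directly.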
💡 Panel Talk & Q&A 💡
The event wrapped up with an engaging panel discussion where attendees posed sharp questions, and the speakers provided insightful answers. This interactive session captured the community’s enthusiasm for PyTorch and their eagerness for deeper technical understanding, fostering a collaborative atmosphere that resonates with the spirit of the PyTorch ecosystem.
Final Thoughts
Since our first offline meetup in October 2022, the PyTorch Korea User Group has held five major technical conferences. Each event deepens our appreciation for the scale and depth of the PyTorch ecosystem. With perspectives from users, contributors, and ecosystem builders, the stories we share are only growing—and we’re committed to continuing this journey together.
See you at the next conference—with even more exciting talks to come! 🙌