Reassessing the Role of AGENTS.md in AI Coding Assistance
Introduction to AGENTS.md and AI Coding Agents
As artificial intelligence (AI) becomes increasingly integrated into software development, there is an ongoing debate about how best to support these intelligent coding agents. One central element in this discussion is the use of context files like AGENTS.md. Despite widespread encouragement from the industry, a recent study conducted by a team from ETH Zurich raises significant questions about the efficacy of such files. The paper argues that AGENTS.md files may actually hinder performance rather than enhance it.
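For readers unfamiliar with the format, an AGENTS.md file is simply a markdown document placed in a repository to tell coding agents how to build, test, and contribute to the project. The following is a minimal hypothetical example; every command and convention shown here is invented for illustration and is not taken from the study:

```markdown
# AGENTS.md

## Setup
- Install dependencies with `pip install -e ".[dev]"`.

## Testing
- Run the full test suite with `pytest` before committing.

## Conventions
- Public functions require type hints and docstrings.
- Do not introduce new third-party dependencies without discussion.
```

Agents that support the convention read this file at the start of a session and treat its contents as standing instructions, which is precisely the behavior the study set out to evaluate.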
The Study’s Rationale and Methodology
The research team, consisting of Thibaud Gloaguen, Niels Mündler, Mark Müller, Veselin Raychev, and Martin Vechev, notes that around 60,000 open-source repositories currently use context files such as AGENTS.md. These files are intended to give AI agents essential instructions and information about the coding environment. However, the authors point out that there is little rigorous empirical evidence for the claim that these files actually improve agent performance.
To address this gap, the researchers built AGENTbench, a dataset of 138 real-world Python tasks drawn from niche repositories. By avoiding popular benchmarks whose solutions AI models may have memorized, they sought to isolate the genuine impact of context files on task completion.
Experimental Setup and Findings
The study tested four AI coding agents (Claude 3.5 Sonnet, Codex GPT-5.2, GPT-5.1 mini, and Qwen Code) under three scenarios: no context file, an LLM-generated file, and a human-written file. The researchers measured task success rates alongside proxy indicators of effort: the number of steps each agent took and the overall inference cost.
The results cut against conventional wisdom. LLM-generated context files hurt performance, lowering task success rates by an average of 3% compared with no context file at all, while driving a more than 20% increase in inference costs through the additional steps agents took.
Human-written context files fared somewhat better, boosting task success rates by 4%, but they too led agents to take more steps, raising costs by around 19%. In other words, human-written files carry real value, but also clear trade-offs.
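The reported figures can be combined into a rough cost-effectiveness comparison. The sketch below assumes a 50% baseline success rate and a unit cost of 1.0 per task, both invented for illustration; only the percentage deltas come from the study, and the 3% and 4% changes are treated here as absolute shifts in the success rate:

```python
# Back-of-the-envelope comparison of the study's reported trade-offs.
# BASELINE_SUCCESS and BASELINE_COST are assumed values, not from the paper.
BASELINE_SUCCESS = 0.50  # assumed baseline task success rate
BASELINE_COST = 1.00     # assumed baseline inference cost per task

scenarios = {
    "no context file": {"success": BASELINE_SUCCESS,        "cost": BASELINE_COST},
    "LLM-generated":   {"success": BASELINE_SUCCESS - 0.03, "cost": BASELINE_COST * 1.20},
    "human-written":   {"success": BASELINE_SUCCESS + 0.04, "cost": BASELINE_COST * 1.19},
}

for name, s in scenarios.items():
    # Cost per successful task: higher means less efficient overall.
    cost_per_success = s["cost"] / s["success"]
    print(f"{name}: {cost_per_success:.2f} cost units per successful task")
```

Under these assumptions, both kinds of context file raise the cost of each successful task relative to the no-file baseline, with LLM-generated files faring worst; the human-written files' higher success rate only partially offsets their added inference cost.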
Insights from Trace Analysis
The team conducted a detailed trace analysis to unravel why the inclusion of AGENTS.md files was leading to poorer outcomes for coding agents. They found that AI agents tended to follow the instructions provided in the context files, resulting in more extensive operations—like running additional tests and executing more searches—even when such actions were unnecessary for completing the specific tasks. Essentially, agents appeared to be “overthinking,” which did not translate into better performance.
Developer Reactions
Developers have received the research with a mix of skepticism and interest. Many believe that the study highlights a need for developers to focus on the quality and utility of AGENTS.md files rather than dismissing them altogether. One developer pointed out that AGENTS.md files can serve as repositories of domain knowledge that evolve over time. The developer emphasized that understanding nuanced aspects of a project—like specific coding patterns or legacy constraints—can vastly improve the functionality of AI coding agents.
Another developer echoed this sentiment, sharing their experience of maintaining a CLAUDE.md file. They noted that the act of articulating thoughts about the codebase helped improve team communication and onboard new members, even if the direct token-level context provided to the AI wasn’t the primary benefit.
The Future of Context Files
The implications of this research suggest that context files like AGENTS.md may need significant reevaluation. The use of such files grew in prominence in late 2025 alongside a broader push from AI coding agent providers, making a clear understanding of their effectiveness increasingly important. The findings underscore the value of careful human authorship and the risks of generating context files automatically.
In conclusion, while this study challenges existing norms regarding context files, it opens the door to a nuanced conversation about the future of AI-assisted development. The path forward may involve not just refining how we create these files, but reevaluating their function altogether to better serve both AI coding agents and their human collaborators. The conversation around the utility and impact of context files, especially AGENTS.md, is just beginning, and it’s one that will undoubtedly shape the evolution of AI in software development.

