From the engine room.
Case studies, benchmarks, and observations from building the agent output pipeline.
May 5, 2026
Essay
CI Passed. The Merge Still Broke.
Vercel just open-sourced Open Agents. It shows where development is headed: multiple background agents, multiple branches, multiple PRs, all moving at once. That is powerful. It is also where the current workflow starts to crack, because a PR can be correct by itself and still be wrong in context.
Read →
April 10, 2026
Findings
76% of cross-branch conflicts come from AI agent branches.
We scanned eight open-source codebases this week. In the six normal repos, the engine found 850 cross-branch compatibility conflicts that no existing tool catches. Three out of four came from branches authored by AI coding agents. Here's the data, the false positives we caught before publishing, and how we validated the agent attribution.
Read →
March 11, 2026
Essay
Git merges text, not logic.
When multiple AI coding agents work on the same codebase simultaneously, they produce patches that are individually correct but collectively incompatible. This problem didn't exist when humans wrote code. Here's why it exists now, and what we're building to fix it.
Read →
March 11, 2026
Benchmark
18 conflicts, 5 branches, 0.97 seconds.
We ran 5 simulated AI agents on the same codebase: Cursor on Python, Copilot on Go, Codex on TypeScript, Claude Code on cross-language, Windsurf on Ruby. Git merged everything cleanly. Tests passed. Here's what Rosentic found.
Read →
March 11, 2026
Case Study
What Alibaba's SWE-CI tells us about the next 12 months.
Alibaba tested AI coding agents on 100 real codebases spanning 233 days. Three-quarters of the models broke previously working code during maintenance. The implications for production engineering teams are significant.
Read →
More posts on the way.
We're scanning public repos, running benchmarks, and writing about what we find. Subscribe below to get notified.