I tracked Claude Code and Codex pass rates for 95 days: what "getting dumber" actually looks like
Every few weeks a thread blows up: "Is Claude Code getting worse?" Someone swears Opus got lazy after an update; someone else says it's placebo. The arguments are always vibes — nobody posts numbers. So I built a tracker. For ~95 days it's logged the daily SWE-Bench-Pro pass rate for Claude Code and








