/builds/vibe-check
// validation signals for AI-assisted development
Fast AI output needs instrumentation. vibe-check turns git history into a reliability signal: what stuck, what churned, and when the session started spiraling.
It tells you whether an AI-assisted session produced durable progress or just confident motion.
// the 5 core metrics
How tight are feedback loops?
Building or debugging?
Does code stick?
How long stuck?
What % productive?
Trust Pass Rate is the key metric. It measures whether the trust level matched the task risk.
npx @boshu2/vibe-check
// why this matters
AI reliability varies by task type. Formatting is nearly always correct. Architecture needs line-by-line verification. The vibe levels answer the question: when can you trust AI output, and when do you need to verify every line?
Declaring the level upfront forces you to think about what kind of task you're doing. After the session, compare what actually happened to what you expected.
// the 40% rule
Gene Kim and Steve Yegge report a sharp threshold in their research. When context utilization stays under 40%, the success rate is 98%. Above 60%, it drops to 24%: the AI starts forgetting instructions and contradicting itself.
This is why spiral detection matters. When you're stuck in a fix loop, context fills up fast.
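The thresholds above can be turned into a quick self-check. This is a minimal sketch, not part of vibe-check itself: the function name, the zone labels, and the treatment of the 40% to 60% band as a "caution" zone are all assumptions made for illustration. Only the 40% and 60% figures come from the text.

```typescript
// Hypothetical helper: not a vibe-check API.
// Zone labels are illustrative; only the 0.4 and 0.6 cut-offs come from the text.
type ContextZone = "safe" | "caution" | "danger";

function contextZone(tokensUsed: number, contextWindow: number): ContextZone {
  if (contextWindow <= 0) {
    throw new RangeError("contextWindow must be positive");
  }
  const utilization = tokensUsed / contextWindow;
  if (utilization < 0.4) return "safe"; // under 40%: the high-success regime
  if (utilization <= 0.6) return "caution"; // assumption: the text does not describe this band
  return "danger"; // above 60%: the low-success regime
}
```

For example, a session that has used 70,000 tokens of a 100,000-token window lands in "danger", which is a reasonable moment to reset rather than push through another fix attempt.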
// the insight
Git history is the receipt. The commits tell the truth.
vibe-check analyzes your commit patterns to detect debug spirals before they consume your whole session. If you've been stuck on the same thing for 30 minutes, that's your cue to wipe, reset, do some research, and come back with a plan.
"Last week, the CLI flagged a spiral at 18 minutes. I realized I was arguing with the LLM about a circular dependency. I stepped away, drew the schema on paper, and fixed it in one commit. Without the alert, I would have wasted two hours."
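To make the idea of reading spirals out of commit history concrete, here is a rough sketch of one possible heuristic. It is not the CLI's actual algorithm, which is not described here. The `Commit` shape, the rule that a "fix" commit is one whose subject starts with `fix`, and the default thresholds (3 consecutive fix commits spanning 30 minutes) are all assumptions for illustration.

```typescript
// Hypothetical data model: vibe-check's real internals may differ.
interface Commit {
  timestamp: number; // Unix time in seconds
  subject: string; // first line of the commit message
}

interface Spiral {
  start: number; // timestamp of the first fix commit in the run
  end: number; // timestamp of the last fix commit in the run
  commits: number;
  minutes: number;
}

// Assumption: conventional-commit style subjects such as "fix: ..." or "fix(auth): ...".
const isFix = (c: Commit): boolean => /^fix\b/i.test(c.subject);

// Flags every unbroken run of fix commits that is long enough in both count and duration.
function detectSpirals(commits: Commit[], minCommits = 3, minMinutes = 30): Spiral[] {
  const sorted = [...commits].sort((a, b) => a.timestamp - b.timestamp);
  const spirals: Spiral[] = [];
  let start = 0;
  let end = 0;
  let count = 0;

  const flush = (): void => {
    const minutes = (end - start) / 60;
    if (count >= minCommits && minutes >= minMinutes) {
      spirals.push({ start, end, commits: count, minutes });
    }
    count = 0;
  };

  for (const c of sorted) {
    if (isFix(c)) {
      if (count === 0) start = c.timestamp;
      end = c.timestamp;
      count += 1;
    } else {
      flush(); // a non-fix commit breaks the run
    }
  }
  flush(); // handle a run that reaches the end of the history
  return spirals;
}
```

Both thresholds are parameters because the right values depend on how often you commit. Someone who commits every few minutes would likely want a shorter window than the 30-minute default used here.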
// results
I've been running this methodology since 2023. When I follow the discipline, it works. When I skip calibration because I'm in a hurry, I pay for it in rework.
// for autonomous agents
vibe-check measures human-AI collaboration sessions. For autonomous agent workflows, 12-Factor AgentOps applies DevOps and SRE discipline to the delivery system around the agents.