Building vibe-check
1,563 Lines of Math That Solved a Problem Nobody Had
Five hours into building vibe-check, an ML prediction system existed. Ordered Logistic Regression. Expected Calibration Error. The whole academic stack. It predicted which trust level to use for a given task.
The problem: you already know what task you're doing. Nobody needs a model to tell them that OAuth integration is riskier than formatting code.
git rm -rf src/recommend/    # Ordered logistic regression
git rm -rf src/calibration/  # ECE calculations
git rm src/commands/level.ts # The prediction command
1,563 lines deleted. One commit. Twenty-one hours after implementing it.
In the two hours after that deletion, more useful features shipped than in the entire previous day. The ML system wasn't just unnecessary -- it was occupying the mental space where real features needed to be.
| Metric | Value |
|---|---|
| Time building ML | 5h 18m |
| Time ML existed | 21h 31m |
| Lines deleted | 1,563 |
| Time from delete to next ship | 1h 58m |
The lesson was about trust calibration. At L4 (high trust), the instinct was to ask the AI to "fix the ML tests." It would have done it. The feature would have shipped and required maintenance forever. By dropping to L1 (verify every line), the real problem became visible: the feature itself was wrong.
That experience is why vibe-check exists. Not to predict anything -- just to surface patterns from your git history so you can catch yourself before sinking five hours into the wrong thing.
The Five Core Metrics
All five come from git history rather than code content. The tool never reads your source files, just commit metadata. Timestamps are hard to game in the normal course of work, and behavior reveals more than intentions.
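To make "just commit metadata" concrete, here is a minimal sketch of turning `git log` output into records the metrics could use. The `CommitMeta` shape and `parseGitLog` name are hypothetical, not vibe-check's actual internals; the only real pieces are git's `%H`, `%aI`, and `%s` format placeholders.

```typescript
// Hypothetical record shape; vibe-check's real types may differ.
interface CommitMeta {
  hash: string;
  date: Date;
  subject: string;
}

// Parses text produced by:
//   git log --pretty=format:"%H|%aI|%s"
// %H = full hash, %aI = author date (strict ISO 8601), %s = subject line.
function parseGitLog(raw: string): CommitMeta[] {
  return raw
    .split("\n")
    .filter((line) => line.trim().length > 0)
    .map((line) => {
      const [hash, isoDate, ...rest] = line.split("|");
      // A subject can itself contain "|", so rejoin whatever is left.
      return { hash, date: new Date(isoDate), subject: rest.join("|") };
    });
}
```

Nothing here touches file contents: a hash, a timestamp, and a subject line are enough input for everything that follows.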
1. Trust Pass Rate
The percentage of commits that don't require an immediate fix. When a commit lands and no fix follows within 10 minutes, that's a trust pass. A high rate means your calibration is accurate -- you're trusting AI on tasks where it's reliable. A low rate means you're over-trusting on complex work.
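One way to read that definition in code, as a sketch rather than the tool's actual implementation. The `Commit` shape and the `isFix` flag are assumptions; how a commit gets classified as a fix is left to the caller.

```typescript
// Hypothetical input shape: timestamp in milliseconds since the epoch.
interface Commit {
  timestamp: number;
  isFix: boolean;
}

const FIX_WINDOW_MS = 10 * 60 * 1000; // the 10-minute window from the definition

// Fraction of commits NOT followed by a fix commit within the window.
// Returning 0 for an empty history is an arbitrary choice made for this sketch.
function trustPassRate(commits: Commit[]): number {
  const sorted = [...commits].sort((a, b) => a.timestamp - b.timestamp);
  if (sorted.length === 0) return 0;

  let passes = 0;
  for (let i = 0; i < sorted.length; i++) {
    let followedByFix = false;
    for (let j = i + 1; j < sorted.length; j++) {
      // Later commits are sorted, so stop once we leave the window.
      if (sorted[j].timestamp - sorted[i].timestamp > FIX_WINDOW_MS) break;
      if (sorted[j].isFix) {
        followedByFix = true;
        break;
      }
    }
    if (!followedByFix) passes++;
  }
  return passes / sorted.length;
}
```

With four commits where only the first is followed by a fix five minutes later, this yields 0.75.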
2. Rework Ratio
Fix commits as a percentage of total commits. Some rework is healthy; zero fixes probably means you're over-verifying. But when rework climbs above 25%, more than one commit in four is correcting earlier work instead of building something new.
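A rough sketch of the ratio. It assumes fix commits can be recognized by a Conventional Commits style `fix:` or `fix(scope):` prefix; that detection rule is an assumption for illustration, and both function names are hypothetical.

```typescript
// Assumption: a "fix" commit has a subject starting with fix: or fix(scope):
function isFixCommit(subject: string): boolean {
  return /^fix(\([^)]*\))?!?:/i.test(subject);
}

// Share of commits that are fixes, between 0 and 1 (0 for an empty history).
function reworkRatio(subjects: string[]): number {
  if (subjects.length === 0) return 0;
  const fixes = subjects.filter(isFixCommit).length;
  return fixes / subjects.length;
}

const REWORK_WARNING_THRESHOLD = 0.25; // the 25% line mentioned above

function reworkIsHigh(subjects: string[]): boolean {
  return reworkRatio(subjects) > REWORK_WARNING_THRESHOLD;
}
```

For `["feat: a", "fix: b", "fix(auth): c", "docs: d"]` the ratio is 0.5, well past the warning line.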
3. Debug Spirals
Three or more consecutive fix commits on the same component. One fix is normal. Two happens. Three means you're patching symptoms while the AI keeps generating broken code. The count tells you how often you get stuck.
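Spiral detection can be sketched as a run-length scan over commits in time order. This version assumes the "component" is the scope in a `fix(scope):` subject, which is an assumption rather than something the tool is confirmed to do; `fixScope` and `detectSpirals` are hypothetical names.

```typescript
interface Commit {
  timestamp: number; // milliseconds since the epoch
  subject: string;
}

interface Spiral {
  component: string;
  commits: Commit[];
}

// Returns the scope of a fix(scope): commit, or null for anything else.
function fixScope(subject: string): string | null {
  const match = /^fix\(([^)]+)\)!?:/.exec(subject);
  return match ? match[1] : null;
}

// Groups consecutive fix commits on the same scope, keeping runs of minLength or more.
function detectSpirals(commits: Commit[], minLength = 3): Spiral[] {
  const sorted = [...commits].sort((a, b) => a.timestamp - b.timestamp);
  const runs: Spiral[] = [];
  let current: Spiral | null = null;

  for (const commit of sorted) {
    const scope = fixScope(commit.subject);
    if (scope === null) {
      current = null; // a non-fix commit ends the current run
      continue;
    }
    if (current === null || current.component !== scope) {
      current = { component: scope, commits: [] };
      runs.push(current);
    }
    current.commits.push(commit);
  }
  return runs.filter((run) => run.commits.length >= minLength);
}
```

Unscoped `fix:` commits are ignored here because there is no component to compare; a real tool would need some fallback for them.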
4. Spiral Duration
Total time spent inside those fix loops. Five minutes of debugging is fine. Forty-five minutes means you should have stepped back, switched approaches, or dropped to a lower trust level twenty minutes ago.
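As a sketch, spiral duration can be taken as the span from the first to the last commit in each fix loop, summed across loops. That measurement rule is an assumption (time spent before the first fix commit is invisible to git), and the `Spiral` shape is hypothetical.

```typescript
interface Spiral {
  component: string;
  timestamps: number[]; // commit times in milliseconds since the epoch
}

// Span between the earliest and latest commit in one spiral.
function spiralDurationMs(spiral: Spiral): number {
  if (spiral.timestamps.length === 0) return 0;
  return Math.max(...spiral.timestamps) - Math.min(...spiral.timestamps);
}

// Total minutes spent across all spirals.
function totalSpiralMinutes(spirals: Spiral[]): number {
  const totalMs = spirals.reduce((sum, s) => sum + spiralDurationMs(s), 0);
  return totalMs / 60_000;
}
```

Two spirals lasting 12 and 33 minutes add up to the 45-minute case described above.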
5. Flow Efficiency
The meta-metric: (Active time - Spiral duration) / Active time. Are you in a productive flow state, or stuck in the weeds? Active time comes from commit timestamps; spiral duration is subtracted to get productive building time.
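The formula translates directly into code. The sketch below also guesses at how active time might be derived from timestamps: it sums the gaps between consecutive commits and treats any gap longer than a cutoff as a break. The 30-minute cutoff and both function names are assumptions, not documented behavior.

```typescript
const DEFAULT_MAX_GAP_MS = 30 * 60 * 1000; // hypothetical "you stepped away" cutoff

// Sum of gaps between consecutive commits, ignoring gaps longer than maxGapMs.
function activeTimeMs(timestamps: number[], maxGapMs = DEFAULT_MAX_GAP_MS): number {
  const sorted = [...timestamps].sort((a, b) => a - b);
  let total = 0;
  for (let i = 1; i < sorted.length; i++) {
    const gap = sorted[i] - sorted[i - 1];
    if (gap <= maxGapMs) total += gap;
  }
  return total;
}

// (Active time - Spiral duration) / Active time, clamped to [0, 1].
function flowEfficiency(activeMs: number, spiralMs: number): number {
  if (activeMs <= 0) return 0;
  const ratio = (activeMs - spiralMs) / activeMs;
  return Math.min(1, Math.max(0, ratio));
}
```

With 30 minutes of active time and 6 minutes of it spent in a spiral, flow efficiency comes out to 0.8.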
Vibe Levels
The framework underneath these metrics is a trust scale you declare before starting work:
| Level | Trust | Verification | Example Tasks |
|---|---|---|---|
| L5 | 95% | Final only | Formatting, linting |
| L4 | 80% | Spot check | Boilerplate, config |
| L3 | 60% | Key outputs | CRUD, standard tests |
| L2 | 40% | Every change | Features, integrations |
| L1 | 20% | Every line | Architecture, security |
| L0 | 0% | N/A | Novel research |
Declaring the level upfront forces a decision about what kind of task you're doing. After the session, comparing what happened to what you expected sharpens your intuition over time.
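For anyone who wants the scale in code, the table maps naturally onto a typed lookup. This is only an illustrative encoding of the table above; `VibeLevel`, `VIBE_LEVELS`, and `describeLevel` are hypothetical names, not part of the tool's API.

```typescript
type VibeLevel = 0 | 1 | 2 | 3 | 4 | 5;

interface LevelSpec {
  trustPercent: number;
  verification: string;
  exampleTasks: string[];
}

// Values copied from the Vibe Levels table.
const VIBE_LEVELS: Record<VibeLevel, LevelSpec> = {
  5: { trustPercent: 95, verification: "Final only", exampleTasks: ["Formatting", "linting"] },
  4: { trustPercent: 80, verification: "Spot check", exampleTasks: ["Boilerplate", "config"] },
  3: { trustPercent: 60, verification: "Key outputs", exampleTasks: ["CRUD", "standard tests"] },
  2: { trustPercent: 40, verification: "Every change", exampleTasks: ["Features", "integrations"] },
  1: { trustPercent: 20, verification: "Every line", exampleTasks: ["Architecture", "security"] },
  0: { trustPercent: 0, verification: "N/A", exampleTasks: ["Novel research"] },
};

// One-line reminder of what a declared level commits you to.
function describeLevel(level: VibeLevel): string {
  const spec = VIBE_LEVELS[level];
  return `L${level}: ${spec.trustPercent}% trust, verification: ${spec.verification}`;
}
```

Typing the level as a union means a typo like `6` fails at compile time instead of silently returning `undefined`.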
The Tool
npm install -g @boshu2/vibe-check
Or run directly:
npx @boshu2/vibe-check
Sample output:
$ vc --since "1 week ago"
VIBE-CHECK Nov 21 - Nov 28
Trust: 94%
Rework: 18%
Spirals: 1 detected (12 min)
Flow: 87%
vibe-check is a tool for you, not your manager. It measures your own patterns so you can improve your AI collaboration. Don't use it to measure other people.
Try It
Install
npm install -g @boshu2/vibe-check
Run your first check
vc --since "1 week ago"
Or use npx
npx @boshu2/vibe-check