
Building vibe-check

November 29, 2025·4 min read
#vibe-coding #ai-development #developer-tools #open-source

1,563 Lines of Math That Solved a Problem Nobody Had

Five hours into building vibe-check, an ML prediction system existed. Ordered Logistic Regression. Expected Calibration Error. The whole academic stack. It predicted which trust level to use for a given task.

The problem: you already know what task you're doing. Nobody needs a model to tell them that OAuth integration is riskier than formatting code.

bash

git rm -rf src/recommend/     # Ordered logistic regression
git rm -rf src/calibration/   # ECE calculations
git rm src/commands/level.ts  # The prediction command

1,563 lines deleted. One commit. Twenty-one hours after implementing it.

In the two hours after that deletion, more useful features shipped than in the entire previous day. The ML system wasn't just unnecessary -- it was occupying the mental space where real features needed to be.

| Metric | Value |
| --- | --- |
| Time building ML | 5h 18m |
| Time ML existed | 21h 31m |
| Lines deleted | 1,563 |
| Time from delete to next ship | 1h 58m |

The lesson was about trust calibration. At L4 (high trust), the instinct was to ask the AI to "fix the ML tests." It would have done it. The feature would have shipped and required maintenance forever. By dropping to L1 (verify every line), the real problem became visible: the feature itself was wrong.

That experience is why vibe-check exists. Not to predict anything -- just to surface patterns from your git history so you can catch yourself before sinking five hours into the wrong thing.


The Five Core Metrics

All five come from git history rather than code content. The tool never reads your source files, just commit metadata. Commit timestamps are hard to game unless you deliberately rewrite history, and behavior reveals more than intentions.
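To make "commit metadata only" concrete, here is a minimal sketch that parses the output of `git log --pretty=format:'%H|%aI|%s'` (hash, ISO author date, subject). The `CommitMeta` type, the `parseGitLog` name, and the rule that a subject starting with `fix:` or `fix(scope):` marks a fix commit are assumptions made for illustration, not necessarily how vibe-check does it.

```typescript
// Hypothetical shape for the metadata kept per commit.
interface CommitMeta {
  hash: string;
  date: Date;
  subject: string;
  isFix: boolean; // assumption: conventional-commit style "fix:" subjects
}

// Parses output of: git log --pretty=format:'%H|%aI|%s'
function parseGitLog(output: string): CommitMeta[] {
  return output
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .map((line) => {
      // Split on the first two pipes only; the subject itself may contain "|".
      const first = line.indexOf("|");
      const second = line.indexOf("|", first + 1);
      const subject = line.slice(second + 1);
      return {
        hash: line.slice(0, first),
        date: new Date(line.slice(first + 1, second)),
        subject,
        isFix: /^fix(\(.+\))?!?:/i.test(subject),
      };
    });
}
```

Nothing here touches file contents: hash, date, and subject line are enough to feed the metrics below.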

1. Trust Pass Rate

The percentage of commits that don't require an immediate fix. When a commit lands and no fix follows within 10 minutes, that's a trust pass. A high rate means your calibration is accurate -- you're trusting AI on tasks where it's reliable. A low rate means you're over-trusting on complex work.
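A rough sketch of that calculation, under a few assumptions: commits carry a millisecond timestamp and an `isFix` flag, "a fix follows" means the very next commit is a fix landing within the 10-minute window, and every commit (fixes included) counts toward the total. The `Commit` type and `trustPassRate` name are hypothetical.

```typescript
interface Commit {
  timestamp: number; // milliseconds since epoch
  isFix: boolean;
}

const FIX_WINDOW_MS = 10 * 60 * 1000; // the 10-minute window from the text

// Fraction of commits NOT followed by a fix commit within the window.
function trustPassRate(commits: Commit[]): number {
  if (commits.length === 0) return 1;
  const sorted = [...commits].sort((a, b) => a.timestamp - b.timestamp);
  let passes = 0;
  sorted.forEach((commit, i) => {
    const next = sorted[i + 1];
    const needsFix =
      next !== undefined &&
      next.isFix &&
      next.timestamp - commit.timestamp <= FIX_WINDOW_MS;
    if (!needsFix) passes++;
  });
  return passes / sorted.length;
}
```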

2. Rework Ratio

Fix commits as a percentage of total work. Some rework is healthy; zero fixes probably means you're over-verifying. But when rework climbs above 25%, you're spending more time correcting than building.
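In code this is a one-liner plus a threshold check. The sketch below assumes each commit has already been classified as fix or not; the function names and the three signal labels are made up for illustration.

```typescript
// Assumption: each entry says whether a commit was classified as a fix.
function reworkRatio(isFixFlags: boolean[]): number {
  if (isFixFlags.length === 0) return 0;
  const fixes = isFixFlags.filter(Boolean).length;
  return fixes / isFixFlags.length;
}

const REWORK_WARNING_THRESHOLD = 0.25; // "above 25%" from the text

// Hypothetical labels: zero fixes hints at over-verifying, >25% at over-trusting.
function reworkSignal(ratio: number): "no-fixes" | "ok" | "high" {
  if (ratio === 0) return "no-fixes";
  return ratio > REWORK_WARNING_THRESHOLD ? "high" : "ok";
}
```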

3. Debug Spirals

Three or more consecutive fix commits on the same component. One fix is normal. Two happens. Three means you're patching symptoms while the AI keeps generating broken code. The count tells you how often you get stuck.
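One way to detect this, sketched under assumptions: each commit is tagged with a `component` (say, a conventional-commit scope or a top-level directory, which is a guess at how components might be derived), and a spiral is any run of three or more back-to-back fix commits on the same component. All names are hypothetical.

```typescript
interface Commit {
  timestamp: number; // milliseconds since epoch
  isFix: boolean;
  component: string; // assumption: e.g. commit scope or top-level directory
}

interface Spiral {
  component: string;
  commits: Commit[];
}

const SPIRAL_THRESHOLD = 3; // "three or more consecutive fix commits"

function detectSpirals(commits: Commit[]): Spiral[] {
  const sorted = [...commits].sort((a, b) => a.timestamp - b.timestamp);
  const spirals: Spiral[] = [];
  let current: Spiral | null = null;

  for (const commit of sorted) {
    // Extend the current run if this is another fix on the same component.
    if (commit.isFix && current !== null && current.component === commit.component) {
      current.commits.push(commit);
      continue;
    }
    // Otherwise the run ended: keep it only if it was long enough.
    if (current !== null && current.commits.length >= SPIRAL_THRESHOLD) {
      spirals.push(current);
    }
    current = commit.isFix ? { component: commit.component, commits: [commit] } : null;
  }
  if (current !== null && current.commits.length >= SPIRAL_THRESHOLD) {
    spirals.push(current);
  }
  return spirals;
}
```

A non-fix commit or a fix on a different component both end the run, so two fixes on `auth` followed by one on `ui` do not count.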

4. Spiral Duration

Total time spent inside those fix loops. Five minutes of debugging is fine. Forty-five minutes means you should have stepped back, switched approaches, or dropped to a lower trust level twenty minutes ago.
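A minimal sketch of the sum, assuming a spiral's duration runs from its first fix commit to its last. That is one possible convention; counting from the commit that introduced the bug would be another. The `Spiral` shape here is hypothetical.

```typescript
interface Spiral {
  timestamps: number[]; // ms since epoch of the fix commits in this spiral
}

// Total time across all spirals, measured first-fix to last-fix.
function totalSpiralDurationMs(spirals: Spiral[]): number {
  return spirals.reduce((total, spiral) => {
    if (spiral.timestamps.length === 0) return total;
    const start = Math.min(...spiral.timestamps);
    const end = Math.max(...spiral.timestamps);
    return total + (end - start);
  }, 0);
}

function toMinutes(ms: number): number {
  return Math.round(ms / 60000);
}
```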

5. Flow Efficiency

The meta-metric: (Active time - Spiral duration) / Active time. Are you in a productive flow state, or stuck in the weeds? Active time comes from commit timestamps; spiral duration is subtracted to get productive building time.
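The formula translates directly. This sketch takes both durations as inputs in the same unit and leaves open how active time is derived from timestamps; the function name and the zero-activity guard are assumptions.

```typescript
// (Active time - Spiral duration) / Active time, with both in the same unit.
function flowEfficiency(activeTime: number, spiralDuration: number): number {
  if (activeTime <= 0) return 0; // assumption: no activity means no flow to measure
  const productive = Math.max(0, activeTime - spiralDuration);
  return productive / activeTime;
}
```

For example, 120 active minutes with 12 minutes spent in spirals gives (120 - 12) / 120 = 0.9, or 90% flow.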


Vibe Levels

The framework underneath these metrics is a trust scale you declare before starting work:

| Level | Trust | Verification | Example Tasks |
| --- | --- | --- | --- |
| L5 | 95% | Final only | Formatting, linting |
| L4 | 80% | Spot check | Boilerplate, config |
| L3 | 60% | Key outputs | CRUD, standard tests |
| L2 | 40% | Every change | Features, integrations |
| L1 | 20% | Every line | Architecture, security |
| L0 | 0% | N/A | Novel research |

Declaring the level upfront forces a decision about what kind of task you're doing. After the session, comparing what happened to what you expected sharpens your intuition over time.
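If you wanted the scale in code, a plain lookup table is enough. The values below are copied from the level table; the `VibeLevel`, `LevelInfo`, and `describeLevel` names are hypothetical and not taken from vibe-check's source.

```typescript
type VibeLevel = 0 | 1 | 2 | 3 | 4 | 5;

interface LevelInfo {
  trustPercent: number;
  verification: string;
  exampleTasks: string;
}

const VIBE_LEVELS: Record<VibeLevel, LevelInfo> = {
  5: { trustPercent: 95, verification: "Final only", exampleTasks: "Formatting, linting" },
  4: { trustPercent: 80, verification: "Spot check", exampleTasks: "Boilerplate, config" },
  3: { trustPercent: 60, verification: "Key outputs", exampleTasks: "CRUD, standard tests" },
  2: { trustPercent: 40, verification: "Every change", exampleTasks: "Features, integrations" },
  1: { trustPercent: 20, verification: "Every line", exampleTasks: "Architecture, security" },
  0: { trustPercent: 0, verification: "N/A", exampleTasks: "Novel research" },
};

// One-line reminder of what you committed to before the session.
function describeLevel(level: VibeLevel): string {
  const info = VIBE_LEVELS[level];
  return `L${level} (${info.trustPercent}% trust): ${info.verification}`;
}
```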


The Tool

bash

npm install -g @boshu2/vibe-check

Or run directly:

bash

npx @boshu2/vibe-check

Sample output:

// terminal

$ vc --since "1 week ago"

VIBE-CHECK Nov 21 - Nov 28

Trust:   94%
Rework:  18%
Spirals: 1 detected (12 min)
Flow:    87%

> INFO:

vibe-check is a tool for you, not your manager. It measures your own patterns so you can improve your AI collaboration. Don't use it to measure other people.


Try It

bash

# Install
npm install -g @boshu2/vibe-check

# Run your first check
vc --since "1 week ago"

# Or use npx
npx @boshu2/vibe-check

Links: npm · GitHub