We scored 50k PRs with AI

I'm a CTO with a ~16-person engineering team. Last year I wanted real data on what was actually shipping, not guesswork or story point theater. So we built GitVelocity.

Every merged PR gets scored 0–100 by Claude across six dimensions: scope (0–20), architecture (0–20), implementation (0–20), risk (0–20), quality (0–15), perf/security (0–5). The six dimensions are summed, then scaled by change size: a 10-line fix scores lower than a 500-line refactor even at the same complexity. Full formula at gitvelocity.dev/scoring-guide.

After scoring 50,000+ PRs across TypeScript, Python, Rust, Go, Java, Elixir, and more, some things surprised us:

Big PRs don't automatically score high. An 800-line migration with low complexity scores worse than a 200-line architectural change. Size gets you the full multiplier, but the base score still has to earn it.

You can't score well without tests. The quality dimension (0–15) won't give you points without test coverage. At similar experience levels, this was the clearest separator between engineers.

Juniors started outscoring some seniors. They adopted AI tools faster and took on harder problems. Once they could see their own scores, they aimed higher.

We score AI-generated code the same as human-written code. Code is code. An engineer who uses AI to ship more complex work faster is more productive, and their scores reflect that.

Scoring consistency was the hardest technical problem. Without reference examples anchoring each dimension, Claude's scores drifted 15+ points between runs. With 18 calibrated anchors (three per dimension at low/mid/high), we got it down to 2–4 points on the same PR.

The thing we didn't expect was behavioral. We call it the Fitbit effect: the tool doesn't make you ship better code, but seeing the score does. Engineers started referencing their own scores in 1:1s unprompted, because the numbers matched what they already felt about their work. A junior who shipped a tricky concurrency fix could point to a score that proved it wasn't "just a small PR."

We recently added team benchmarks (gitvelocity.dev/demo/benchmarks). Once you're scoring PRs, you can see how your team compares to others across the dataset, about 1,000 engineers on 60 teams so far. Headline's team ships faster than roughly 95% of them, which was nice to confirm but also made us wonder who the other 5% are. The competitive angle surprised us: teams that were skeptical about individual scores got genuinely curious once they could measure themselves against the field.

Every score is fully visible to the engineer who wrote the PR, with per-dimension breakdowns and reasoning. There's no hidden dashboard that management sees and engineers don't.

Free, BYOK (your Anthropic API key). We default to Sonnet 4.6, which scores nearly as well as Opus 4.6 at a fraction of the cost, but you can switch models if you want. Pennies per PR either way. No source code stored; diffs are analyzed and discarded. Works with GitHub, GitLab, and Bitbucket.

Ask me anything about the scoring methodology, how we solved calibration, or what it was actually like rolling this out to a team.
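
The "sum six capped dimensions, then scale by change size" idea from the post can be sketched roughly as below. This is an illustration only, not GitVelocity's actual formula (which the post says is at gitvelocity.dev/scoring-guide). The dimension caps come from the post; the size_multiplier function, its logarithmic shape, the 500-line saturation point, and the 0.5 floor are all assumptions invented for the example.

```python
import math
from typing import Mapping

# Per-dimension caps as listed in the post (they sum to 100).
DIMENSION_CAPS = {
    "scope": 20,
    "architecture": 20,
    "implementation": 20,
    "risk": 20,
    "quality": 15,
    "perf_security": 5,
}


def size_multiplier(lines_changed: int, full_at: int = 500, floor: float = 0.5) -> float:
    """Hypothetical size scaling (an assumption, not the real formula).

    Grows with the log of lines changed and saturates at 1.0 once the
    change reaches `full_at` lines; tiny changes bottom out at `floor`.
    """
    if lines_changed <= 0:
        return floor
    ratio = math.log1p(lines_changed) / math.log1p(full_at)
    return min(1.0, floor + (1.0 - floor) * ratio)


def pr_score(dimensions: Mapping[str, float], lines_changed: int) -> float:
    """Sum the six dimensions (each clamped to its cap), then scale by size."""
    base = 0.0
    for name, cap in DIMENSION_CAPS.items():
        value = dimensions.get(name, 0.0)
        base += max(0.0, min(value, cap))
    return round(base * size_multiplier(lines_changed), 1)
```

Under these assumed parameters, a large change gets the full multiplier but still needs a high base score. An 800-line PR with a low base therefore lands below a 200-line PR with a high base, which matches the behavior the post describes.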

Developer Tools B2B · chuboy
N/A
Revenue data unavailable


Similar products

Developer Tools
Capgo

Instant updates for Capacitor apps. Ship fixes in minutes, not weeks. Push OTA updates to users without App Store delays.

$15.2K/mo
Developer Tools · Easy to clone
OpenAlternative

OpenAlternative is a directory of open-source alternatives to proprietary software. The site lists projects across categories, with information on features, tech stack, and GitHub metrics. The platform is monetized through paid listings and affiliate links.

$6.7K/mo
Developer Tools
Artemis.fyi

There are plenty of Artemis II trackers out there. I looked at a bunch and kept running into the same issues: some had data that didn't look right, some were hard to use on smaller screens, and others felt overly complicated for what I actually wanted to know: what's the crew doing, where is Orion, how fast is it going. The best one I found was issinfo.net/artemis, which inspired a lot of the design. So I built my own.

The part that was genuinely interesting to me was the data. Turns out anyone can query JPL's Horizons API for full ephemeris data on the Orion spacecraft (position, velocity, range) for free. I had no idea this existed. Even better: NASA's Deep Space Network publishes a live XML feed (eyes.nasa.gov/dsn/data/dsn.xml) that updates every 5 seconds showing exactly which ground antennas are talking to which spacecraft. Right now two dishes in Canberra are locked onto Orion, one sending commands, both receiving 6 Mbps of S-band telemetry at 296,000 km. You can see Juno at Jupiter, JWST, Mars Odyssey, all in the same feed. It's pretty amazing what's just sitting there in the open.

The app fetches trajectory from Horizons, crew activities from NASA's published flight plan, and live ground station status from DSN. I'll be honest, it's mostly vibe-coded with supervision. The data pipeline is the part that was more manual: figuring out what's publicly available, how to compute relative positions from raw vectors, how to cache and backfill. That was the fun part.

Code is open on GitHub. I built it for myself and as a fun exercise, but happy for any feedback, especially around data correctness and what other public data sources are out there that I might be missing.

Source: https://github.com/dmarchuk/artemis.fyi
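
Computing relative position from raw vectors, as mentioned in the post, might look something like this sketch. It assumes two state vectors (position in km, velocity in km/s) already expressed in the same reference frame, which is the kind of data a Horizons vector table provides. The StateVector type and the relative_state function are hypothetical names for illustration and are not taken from the artemis.fyi source.

```python
import math
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class StateVector:
    """Position (km) and velocity (km/s) in one shared reference frame."""
    x: float
    y: float
    z: float
    vx: float
    vy: float
    vz: float


def relative_state(target: StateVector, observer: StateVector) -> Tuple[float, float, float]:
    """Return (range_km, relative_speed_km_s, range_rate_km_s) of target w.r.t. observer.

    range_rate is the component of relative velocity along the line of sight:
    positive when the target is moving away, negative when approaching.
    """
    dx, dy, dz = target.x - observer.x, target.y - observer.y, target.z - observer.z
    dvx, dvy, dvz = target.vx - observer.vx, target.vy - observer.vy, target.vz - observer.vz

    rng = math.sqrt(dx * dx + dy * dy + dz * dz)
    speed = math.sqrt(dvx * dvx + dvy * dvy + dvz * dvz)
    # Guard against a zero-length line of sight (identical positions).
    range_rate = (dx * dvx + dy * dvy + dz * dvz) / rng if rng > 0 else 0.0
    return rng, speed, range_rate
```

With the spacecraft as target and Earth (or the Moon) as observer, the three values correspond to distance, speed, and whether the gap is opening or closing.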

Revenue: N/A
Developer Tools
sllm

Running DeepSeek V3 (685B) requires 8×H100 GPUs, which costs about $14k/month. Most developers only need 15-25 tok/s. sllm lets you join a cohort of developers sharing a dedicated node. You reserve a spot with your card, and nobody is charged until the cohort fills. Prices start at $5/mo for smaller models. The LLMs are completely private (we don't log any traffic). The API is OpenAI-compatible (we run vLLM), so you just swap the base URL. Currently offering a few models.

Revenue: N/A
Developer Tools
Ismcpdead.com

Built this to track the ongoing debate around Model Context Protocol - whether it's gaining real traction or just hype. Pulls live data from GitHub, HN, Reddit and a few other sources. Curious what the HN crowd thinks given how active the MCP discussion has been here.

Revenue: N/A

Key facts

Category
Developer Tools
Audience
B2B
Founder
chuboy
Revenue data
Unknown
