The Factory Floor

How AI agents and one engineer produce enterprise-grade software

THE PROCESS

The Production Line

Spec

Every feature starts as a spec file: a terminal brief, a design document, and acceptance criteria. The human defines the problem precisely so the agents can solve it without ambiguity.

spec.md
# Terminal Brief: Rate Limiting
## Goal: Add rate limiting to all 68 API endpoints
## Acceptance: 429 responses, per-tier limits, Redis-backed
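
To make the acceptance criteria concrete, here is a minimal sketch of a per-tier, fixed-window rate limiter that returns 429 once a client exceeds its limit. It is an illustration under stated assumptions, not the shipped middleware: the tier names, the limits, the 60-second window, and the `FixedWindowLimiter` class are all hypothetical, and an in-memory dict stands in for Redis (where `INCR` plus `EXPIRE` on the same key would play the same role).

```python
import time
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

# Hypothetical per-tier limits (requests per window); the spec does not state numbers.
TIER_LIMITS = {"free": 5, "pro": 50}
WINDOW_SECONDS = 60


@dataclass
class FixedWindowLimiter:
    """Fixed-window counter. The dict is a stand-in for a Redis-backed store."""

    counters: Dict[Tuple[str, int], int] = field(default_factory=dict)

    def check(self, client_id: str, tier: str, now: Optional[float] = None) -> int:
        """Return the HTTP status to send: 200 if allowed, 429 if over the limit."""
        now = time.time() if now is None else now
        window = int(now // WINDOW_SECONDS)  # index of the current window
        key = (client_id, window)
        self.counters[key] = self.counters.get(key, 0) + 1
        return 200 if self.counters[key] <= TIER_LIMITS[tier] else 429


limiter = FixedWindowLimiter()
# Six requests in the same window on the hypothetical "free" tier (limit 5).
statuses = [limiter.check("client-1", "free", now=0.0) for _ in range(6)]
print(statuses)
```

A fixed window is the simplest variant and allows short bursts at window boundaries; a sliding window or token bucket would smooth that out at the cost of more state per client.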

Agent Build

Claude Code reads the spec and implements the feature. It writes the code, creates the tests, handles edge cases, and formats the PR. A feature that takes a human developer a day takes an agent an hour.

claude-code
$ claude 'Implement rate limiting per spec'
Reading spec... Creating middleware... Writing tests...
✓ 14 files changed, 892 insertions, 12 deletions

Automated Testing

The CI/CD pipeline runs 660+ tests on every PR, along with linting, type checking, and security scans. There is no human in the loop: if a check fails, the agent fixes it.

github-actions
Running test suite...
✓ 660 tests passed (83% coverage)
✓ Lint: clean ✓ Types: clean ✓ Security: clean

Human Review

The engineer reviews the PR. Not every line — the tests handle correctness. The human checks architecture decisions, edge cases, and alignment with the product vision. This is the quality gate.

pr-review
$ gh pr view 47 --json title,additions,deletions
{"title":"feat: rate limiting","additions":892,"deletions":12}
$ gh pr merge 47 --squash
✓ Merged

Deploy

A merge to main triggers automatic deployment: backend to Railway, frontend to Vercel. Deploys are zero-downtime, and the code is live in minutes.

deploy.log
▶ Deploying to Railway... ✓ backend live
▶ Deploying to Vercel... ✓ frontend live
✓ Zero-downtime deployment complete

Monitor

Sentry catches errors in real time. Usage metrics flow through billing. AI costs are tracked to the penny. If something breaks, we know immediately.

monitoring
Sentry: 0 unresolved errors (last 24h)
Uptime: 99.9% (last 30d)
AI cost: $0.02/student/month

THE STACK

The Toolchain

Claude Code: Primary engineering agent
Claude Haiku 4.5: Fast AI question generation
Claude Sonnet 4.5: Complex AI recommendations
GitHub Actions: CI/CD pipeline
Railway: Backend hosting
Vercel: Frontend hosting
Sentry: Error monitoring
Stripe: Payment processing
PostgreSQL: Database
Cloudflare: DNS & security

THE NUMBERS

Every Token Counted

We don't just use AI — we measure it. Every AI feature includes cost-per-call tracking, model selection optimization, and budget alerts.
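
As a rough illustration of what cost-per-call tracking with a budget alert can look like, the sketch below accumulates spend from token counts and flags when a threshold is crossed. Everything here is an assumption made for the example: the `CostTracker` class, the per-million-token prices, the budget, and the 80% alert fraction are placeholders, not the production values.

```python
from dataclasses import dataclass


@dataclass
class CostTracker:
    """Accumulates AI spend and reports when an alert threshold is reached."""

    monthly_budget_usd: float
    alert_fraction: float = 0.8  # hypothetical: alert at 80% of budget
    spent_usd: float = 0.0
    calls: int = 0

    def record(self, input_tokens: int, output_tokens: int,
               usd_per_mtok_in: float, usd_per_mtok_out: float) -> float:
        """Record one call and return its cost in USD."""
        cost = (input_tokens * usd_per_mtok_in
                + output_tokens * usd_per_mtok_out) / 1_000_000
        self.spent_usd += cost
        self.calls += 1
        return cost

    @property
    def alert(self) -> bool:
        """True once spend reaches the alert fraction of the budget."""
        return self.spent_usd >= self.alert_fraction * self.monthly_budget_usd


# Placeholder prices and token counts, chosen only to keep the arithmetic round.
tracker = CostTracker(monthly_budget_usd=0.01)
for _ in range(3):
    tracker.record(1000, 200, usd_per_mtok_in=1.0, usd_per_mtok_out=5.0)
alert_after_three = tracker.alert
for _ in range(2):
    tracker.record(1000, 200, usd_per_mtok_in=1.0, usd_per_mtok_out=5.0)
alert_after_five = tracker.alert
print(tracker.calls, alert_after_three, alert_after_five)
```

Passing prices into `record` rather than hard-coding them keeps the same tracker usable when a feature moves between models.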

When we migrated our question generator from Sonnet to Haiku 4.5, we cut costs 56% without losing quality. That's not luck — it's engineering discipline applied to AI operations.

$0.02 per student/month
56% cost reduction
5s response time

cost-analysis.log
# AI Cost Comparison — Question Generator
Before (Sonnet 4.5):
Cost/call: $0.0045 | Latency: 12.5s
After (Haiku 4.5):
Cost/call: $0.0020 | Latency: 5.0s
Savings: 56% | Quality: equivalent
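
The savings figure follows directly from the per-call numbers in the log above. A short check, using only those logged values (the helper name `percent_reduction` is ours):

```python
def percent_reduction(before: float, after: float) -> float:
    """Relative reduction from `before` to `after`, as a percentage."""
    return (before - after) / before * 100


# Values taken from cost-analysis.log.
cost_saving = percent_reduction(0.0045, 0.0020)
latency_saving = percent_reduction(12.5, 5.0)

print(f"Cost reduction: {cost_saving:.1f}%")        # 55.6%, reported as 56%
print(f"Latency reduction: {latency_saving:.1f}%")  # 60.0%
```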