The Factory Floor

How AI agents and one engineer produce enterprise-grade software

THE PROCESS

The Production Line

Spec

Every feature starts as a spec file: a terminal brief, a design document, and acceptance criteria. The human defines the problem precisely so the agents can solve it without ambiguity.

spec.md

# Terminal Brief: Rate Limiting

## Goal: Add rate limiting to all 68 API endpoints

## Acceptance: 429 responses, per-tier limits, Redis-backed
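As a rough illustration of what that acceptance line asks for, here is a minimal Python sketch of a fixed-window limiter with per-tier limits that answers 429 once a client exceeds its quota. The tier names, the limits, the window length, and the `FixedWindowLimiter` class are assumptions made for illustration; the brief does not specify them. An in-memory dict stands in for Redis so the sketch runs on its own; a Redis-backed version would typically keep the same counter in Redis with an atomic increment and a key expiry.

```python
import math
import time
from typing import Callable, Dict, Tuple

# Hypothetical tiers and limits (requests per window); the brief gives no numbers.
TIER_LIMITS: Dict[str, int] = {"free": 3, "pro": 10}
WINDOW_SECONDS = 60


class FixedWindowLimiter:
    """Fixed-window request counter keyed by (client, window index).

    The dict is a stand-in for Redis. It never evicts old windows;
    in Redis a key expiry would take care of that.
    """

    def __init__(self, clock: Callable[[], float] = time.time) -> None:
        self._clock = clock  # injectable so the limiter can be tested without sleeping
        self._counters: Dict[Tuple[str, int], int] = {}

    def check(self, client_id: str, tier: str) -> Tuple[int, Dict[str, str]]:
        """Count one request and return (HTTP status, response headers)."""
        limit = TIER_LIMITS[tier]
        now = self._clock()
        window = int(now // WINDOW_SECONDS)
        key = (client_id, window)
        self._counters[key] = self._counters.get(key, 0) + 1
        if self._counters[key] > limit:
            # Tell the client how long until the current window rolls over.
            seconds_left = (window + 1) * WINDOW_SECONDS - now
            return 429, {"Retry-After": str(max(1, math.ceil(seconds_left)))}
        return 200, {}


if __name__ == "__main__":
    limiter = FixedWindowLimiter()
    for _ in range(4):
        print(limiter.check("client-1", "free"))
```

A fixed window is the simplest scheme to reason about, but it allows short bursts at window boundaries; a sliding window or token bucket would smooth those out at the cost of more state per client.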

Agent Build

Claude Code reads the spec and implements the feature. It writes the code, creates the tests, handles edge cases, and formats the PR. A feature that takes a human developer a day takes an agent an hour.

claude-code

$ claude 'Implement rate limiting per spec'

Reading spec... Creating middleware... Writing tests...

✓ 14 files changed, 892 insertions, 12 deletions

Automated Testing

The CI/CD pipeline runs 660+ tests on every PR, along with linting, type checking, and security scans. No human is in the loop. If tests fail, the agent fixes them.

github-actions

▶ Running test suite...

✓ 660 tests passed (83% coverage)

✓ Lint: clean ✓ Types: clean ✓ Security: clean

Human Review

The engineer reviews the PR, though not every line: the tests handle correctness. The human checks architecture decisions, edge cases, and alignment with the product vision. This is the quality gate.

pr-review

$ gh pr view 47 --json title,additions,deletions

{"title":"feat: rate limiting","additions":892,"deletions":12}

$ gh pr merge 47 --squash

✓ Merged

Deploy

Merge to main triggers automatic deployment. Backend to Railway, frontend to Vercel. Zero-downtime deploys. The code is live in minutes.

deploy.log

▶ Deploying to Railway... ✓ backend live

▶ Deploying to Vercel... ✓ frontend live

✓ Zero-downtime deployment complete

Monitor

Sentry catches errors in real time. Usage metrics flow through billing. AI costs are tracked to the penny. If something breaks, we know immediately.

monitoring

Sentry: 0 unresolved errors (last 24h)

Uptime: 99.9% (last 30d)

AI cost: $0.02/student/month

THE STACK

The Toolchain

Claude Code: Primary engineering agent

Claude Haiku 4.5: Fast AI question generation

Claude Sonnet 4.5: Complex AI recommendations

GitHub Actions: CI/CD pipeline

Railway: Backend hosting

Vercel: Frontend hosting

Sentry: Error monitoring

Stripe: Payment processing

PostgreSQL: Database

Cloudflare: DNS & security

THE NUMBERS

Every Token Counted

We don't just use AI — we measure it. Every AI feature includes cost-per-call tracking, model selection optimization, and budget alerts.

When we migrated our question generator from Sonnet 4.5 to Haiku 4.5, we cut cost per call by 56% (from $0.0045 to $0.0020) without losing quality. That's not luck. It's engineering discipline applied to AI operations.
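To make cost-per-call tracking and budget alerts concrete, here is a minimal Python sketch. The `CostTracker` class, the 80% alert threshold, and the feature name are hypothetical and chosen for illustration; they are not a description of the actual tracking code. `Decimal` is used so that sub-cent amounts add up exactly.

```python
from decimal import Decimal
from typing import Dict, List


class CostTracker:
    """Accumulates AI spend per feature and raises one alert near the budget."""

    def __init__(self, monthly_budget_usd: Decimal,
                 alert_fraction: Decimal = Decimal("0.8")) -> None:
        self.monthly_budget_usd = monthly_budget_usd
        self.alert_fraction = alert_fraction  # hypothetical threshold
        self.spend: Dict[str, Decimal] = {}
        self.calls: Dict[str, int] = {}
        self.alerts: List[str] = []
        self._alerted = False

    def total(self) -> Decimal:
        """Total spend across all features."""
        return sum(self.spend.values(), Decimal("0"))

    def cost_per_call(self, feature: str) -> Decimal:
        """Average cost of one call for the given feature."""
        return self.spend[feature] / self.calls[feature]

    def record(self, feature: str, cost_usd: Decimal) -> None:
        """Record one AI call and alert once when the threshold is crossed."""
        self.spend[feature] = self.spend.get(feature, Decimal("0")) + cost_usd
        self.calls[feature] = self.calls.get(feature, 0) + 1
        threshold = self.monthly_budget_usd * self.alert_fraction
        if not self._alerted and self.total() >= threshold:
            self._alerted = True
            self.alerts.append(
                f"AI spend ${self.total()} reached alert threshold ${threshold}"
            )


if __name__ == "__main__":
    tracker = CostTracker(monthly_budget_usd=Decimal("0.01"))
    for _ in range(5):
        tracker.record("question_generator", Decimal("0.0020"))
    print(tracker.cost_per_call("question_generator"), tracker.alerts)
```

In practice the per-call cost would be derived from the token counts each model call reports, multiplied by the provider's current price per token, rather than passed in as a fixed amount.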

$0.02 per student/month

56% cost reduction

Response time

cost-analysis.log

# AI Cost Comparison — Question Generator

Before (Sonnet 4.5):

Cost/call: $0.0045 | Latency: 12.5s

After (Haiku 4.5):

Cost/call: $0.0020 | Latency: 5.0s

Savings: 56% | Quality: equivalent
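The savings figure can be checked directly from the numbers in the log. The short Python sketch below recomputes the cost reduction and, as an extra that the log does not state, the latency change implied by the same two rows; the helper name `percent_reduction` is hypothetical.

```python
def percent_reduction(before: float, after: float) -> float:
    """Relative reduction from `before` to `after`, as a percentage."""
    return (before - after) / before * 100


# Figures taken from the cost-analysis log above.
SONNET_COST, HAIKU_COST = 0.0045, 0.0020    # USD per call
SONNET_LATENCY, HAIKU_LATENCY = 12.5, 5.0   # seconds

cost_saving = percent_reduction(SONNET_COST, HAIKU_COST)
latency_saving = percent_reduction(SONNET_LATENCY, HAIKU_LATENCY)
speedup = SONNET_LATENCY / HAIKU_LATENCY

print(f"Cost reduction: {cost_saving:.1f}%")        # 55.6%, reported as 56%
print(f"Latency reduction: {latency_saving:.0f}%")  # 60%
print(f"Speedup: {speedup:.1f}x")                   # 2.5x
```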
