Part 12: Comparison

Solo, startup, and enterprise AI work, side-by-side.

In one line: Same primitives — different practices. Whether a prompt change ships in 90 seconds or 90 days depends entirely on which column you're in.

In plain English

Chapters 9, 10, and 11 each built out one column in detail. This chapter is the reference card that puts those three columns next to each other so you can see the shape of the differences at a glance.

The big takeaway in advance: there is no single "correct" way to build with LLMs. Promptfoo-in-CI is the right eval strategy for a solo builder and a disaster for a regulated bank. A prompt-registry-with-approval-gates is right for the bank and an extinction-level event for the solo builder.

If you're choosing a workflow: find your column, copy its entry in each row, and ignore the other columns until your scale catches up.

If you only remember one thing: the patterns (RAG, tool use, agents, evals) are scale-invariant. The process around them changes dramatically. Don't copy a FAANG AI org's MLOps stack for your weekend project, and don't bring a weekend stack into a bank.

At a glance

Dimension	Solo / Indie	Startup AI Team	Enterprise AI
Team	1 person	1–10 AI engineers	Platform + feature teams + AI CoE
Models	Frontier API direct	2 providers via gateway	Private endpoint / hybrid / self-hosted
Eval discipline	Promptfoo script	Hosted platform, CI-gated	Platform + prompt registry + audit trail
Observability	Free tier of one tool	Dedicated tool + dashboards	Centralized obs + corporate SIEM
Vector DB	pgvector	pgvector → Pinecone / Qdrant	Pinecone enterprise / Vespa / OpenSearch
Cost discipline	Hard cap in dashboard	Per-tenant dashboards	Forecast + chargeback by team
Governance	None	Light prompt review	Full model + prompt + risk registry
Compliance	None	SOC 2 once asked	SOC 2 + HIPAA + EU AI Act + sector regs
Deployment	Vercel / Modal	Vercel + Modal + gateway	Bedrock / Azure / Vertex + on-prem
Time to ship a prompt change	90 seconds	30 minutes	1–6 weeks
Time to ship a new feature	Hours to days	2-week cycle	1–3 months
Time to procure a new model vendor	Sign up with card	Hours to days	3–9 months

Where you sit

The transitions are gradual. You don't wake up one morning and suddenly need a model registry, an internal LLM gateway, or a prompt review board. Adopt each practice when its cost is justified by your scale — not before, and not after a near-miss forces it on you in panic mode.

How this chapter is organized

Each page below isolates one dimension and runs it across all three columns.

Team and Process — Team shape, decision style, time from idea to live, stakeholder count per change.
Stack comparison — Layer-by-layer tooling at each scale.
Ops — On-call, kill switches, observability stack, incident severity.
Workflow comparison — How an AI change moves from idea to production.
Economics comparison — Order-of-magnitude costs at each scale.
Tradeoffs — The explicit tradeoffs and when to pick which workflow.
Checkpoint — Self-test on the differences.

What stays the same / what changes

Across all three columns:

Stays the same:

The core patterns: streaming, tool use, RAG, agent loops, structured output.
The discipline of evals before shipping anything to users.
The fact that prompts are code and belong in version control.
The reality that LLMs are non-deterministic and drift over time.
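Two of the points above, evals before shipping and prompts as versioned code, can be sketched together as a small gate. This is a minimal illustration, not a reference to any particular tool: `EvalCase`, `pass_rate`, `gate`, `fake_model`, the prompt text, and the 0.9 threshold are all assumptions made for the example, and the stub stands in for a real model call.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class EvalCase:
    question: str
    must_contain: str  # substring the answer is expected to include


def pass_rate(model: Callable[[str], str], prompt_template: str,
              cases: list[EvalCase]) -> float:
    """Return the fraction of cases whose output contains the expected text."""
    if not cases:
        raise ValueError("need at least one eval case")
    passed = 0
    for case in cases:
        output = model(prompt_template.format(question=case.question))
        if case.must_contain.lower() in output.lower():
            passed += 1
    return passed / len(cases)


def gate(rate: float, threshold: float = 0.9) -> bool:
    """Allow shipping only when the pass rate meets the (assumed) threshold."""
    return rate >= threshold


# Hypothetical stand-in for a real model call, so the sketch runs offline.
def fake_model(prompt: str) -> str:
    return "Paris" if "France" in prompt else "I don't know"


# The prompt lives in the repository like any other code (illustrative text).
PROMPT_V2 = "Answer briefly: {question}"

CASES = [
    EvalCase("What is the capital of France?", "paris"),
    EvalCase("What is the capital of Peru?", "lima"),
]

rate = pass_rate(fake_model, PROMPT_V2, CASES)
print(f"pass rate {rate:.0%}, ship: {gate(rate)}")
```

The same gate applies in every column; what differs is who runs it and what else has to happen after it passes.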

Changes dramatically:

Who is allowed to change a prompt and how many people sign off.
Where the model call physically runs and which provider's terms apply.
How fast a change reaches users — minutes vs. months.
How much paperwork exists between "I had an idea" and "users see it."

Wrapping up in advance

The biggest mistake at every scale is applying the wrong column's playbook:

Solo with enterprise process → nothing ships, the side project dies.
Startup with solo practices → first real outage exposes the team and they retrofit governance in panic mode.
Startup with enterprise practices → the 8-person AI team that requires a risk-tier review for every prompt tweak gets out-shipped by a 2-person competitor in six months.
Enterprise with startup practices → first regulatory or reputational hit removes someone's job.

The corollary is that column-jumping is the most expensive moment in an AI team's life. When you cross into the next column (the first hire after working solo, or the first six-figure enterprise customer at a startup), the temptation is either to import the next column's entire playbook at once, an over-correction that kills velocity, or to pretend you're still in the previous column, an under-correction that sets up the first big incident. The right move is selective import, ordered by risk reduction per dollar; the rest of this chapter is the menu you pick from.
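One way to make "risk reduction per dollar" concrete is a simple greedy ranking. The sketch below is an assumption about how such a ranking might be done, not a method the chapter prescribes; the `Practice` type, the `adoption_order` helper, and every dollar figure are hypothetical placeholders you would replace with your own estimates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Practice:
    name: str
    monthly_cost_usd: float           # tooling plus engineer time (made-up)
    expected_loss_avoided_usd: float  # rough monthly estimate (made-up)

    def __post_init__(self) -> None:
        if self.monthly_cost_usd <= 0:
            raise ValueError("monthly_cost_usd must be positive")

    @property
    def ratio(self) -> float:
        """Risk reduction per dollar spent."""
        return self.expected_loss_avoided_usd / self.monthly_cost_usd


def adoption_order(practices: list[Practice], budget_usd: float) -> list[Practice]:
    """Greedily pick practices by descending ratio until the budget runs out."""
    chosen: list[Practice] = []
    remaining = budget_usd
    for practice in sorted(practices, key=lambda p: p.ratio, reverse=True):
        if practice.monthly_cost_usd <= remaining:
            chosen.append(practice)
            remaining -= practice.monthly_cost_usd
    return chosen


# Illustrative numbers only; none of these figures come from the chapter.
CANDIDATES = [
    Practice("CI-gated evals", 500, 5000),
    Practice("LLM gateway", 1000, 4000),
    Practice("Prompt review board", 3000, 3000),
    Practice("Per-tenant cost dashboards", 400, 2000),
]

plan = adoption_order(CANDIDATES, budget_usd=2000)
for practice in plan:
    print(f"{practice.name}: {practice.ratio:.1f}x")
```

With these placeholder numbers the review board is deferred: it costs more than the remaining budget and has the lowest ratio, which is the kind of item you leave in the next column until your scale justifies it.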

How to read this chapter

Each page below is a focused side-by-side on one dimension; pages are short, opinionated, and meant to be skim-able in 5 minutes each.
The 3-column table is the workhorse. Look at the row for your column, then glance at the others to calibrate what "too little" and "too much" look like for that dimension.
The "In plain English" callouts are the TL;DR if you're in a hurry.
The "common mistakes" sections at the bottom of each page are the highest-density part — that's where the chapter's opinion sits.

Skim this chapter when you need a quick mental model of how a specific dimension differs across scales. Read Chapter 13: Decision Frameworks when you need to actually pick.

→ Continue with Team and Process.
