Skip to main content

Part I — From Zero

Ten stages. Each one ends with an artifact — build it before moving on, or none of this sticks.

The whole arc is 3–6 months of part-time effort. If a stage feels easy, skim it and do the artifact to verify. If a stage feels hard, slow down — there is no shortcut around understanding what chat.completions.create returns.

The compounding order

Each stage assumes everything before it. Structured output before tool calling. Tool calling before RAG-with-tools. Evals before agents. Skipping is the most common way new AI engineers stall — they grind on agents they're not ready to debug and conclude "AI is unreliable." (It is. The point is to build the discipline that makes it reliable enough to ship.)

The ten stages

StageTopicTimeArtifact
0Setup~1 dayAPI key, working SDK, three verification scripts
1First API call~half a dayA CLI tool that asks the model anything and prints the response (with token counts)
2Streaming chatbot~1 weekA web chat UI that streams tokens with conversation history
3Structured output~3–5 daysA typed extractor — paste an email, get back {category, priority, summary} validated by schema
4Tool calling~1 weekA 2–3 tool assistant where the model picks the right function
5RAG over your docs~2 weeksA RAG system over your own notes/PDFs, with citations
6Your first eval set~1 week50–100 case eval suite for the RAG, with deterministic + LLM-judge checks
7Observability~3–5 daysEvery LLM call logged, traced, cost-attributed
8A simple agent~1–2 weeksA single-agent loop with iteration caps, tool budgets, and human-in-the-loop on writes
9Ship it~1 weekOne project deployed to a URL, with auth, rate limits, cost caps, and a status page

Start with Stage 0