Papers Worth Reading

In one line: Most AI papers don't matter for shipping AI. A short list of foundational ones gives you the conceptual vocabulary; the rest you skim only when they intersect a problem you're hitting.

In plain English

Here's a relieving secret: you can ship production AI for years without reading a single research paper. Most of the job is engineering — prompts, evals, retrieval, deployment — not research. But about ten foundational papers gave the field its shared vocabulary, and knowing them makes everything newer dramatically easier to skim. This page lists those ten, then shows you how to triage the endless rest in minutes instead of hours.

1. The honest claim

You can ship production AI for years without reading a single research paper. Most AI engineering is engineering — prompts, evals, retrieval, observability, deployment. The papers are interesting but rarely actionable.

That said, ~10 foundational papers give you the conceptual vocabulary that every newer paper references. Knowing them makes everything else easier to skim.

2. The foundational ten (read these, in this order)

Transformer architecture

Attention Is All You Need (Vaswani et al., 2017): the original transformer paper, and the architecture nearly every current LLM is built on.

Scale and capability

Language Models are Few-Shot Learners (Brown et al., 2020 — the GPT-3 paper) — why scale alone produces emergent capabilities; the "in-context learning" concept.
Scaling Laws for Neural Language Models (Kaplan et al., 2020) — how loss scales with model size, dataset size, compute.

Alignment and instruction-following

Training Language Models to Follow Instructions with Human Feedback (Ouyang et al., 2022, the InstructGPT paper): RLHF, and why instruction-tuned models such as GPT-3.5 / ChatGPT behave so differently from a raw base model.

Tools, agents, retrieval

Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks (Lewis et al., 2020) — the original RAG paper.
ReAct: Synergizing Reasoning and Acting in Language Models (Yao et al., 2022) — the tool-use loop pattern.
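Chain-of-Thought Prompting Elicits Reasoning in Large Language Models (Wei et al., 2022): why putting worked reasoning steps in the prompt improves multi-step answers. The zero-shot "let's think step by step" phrasing comes from a follow-up paper (Kojima et al., 2022).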

Modern surveys and frames

A Survey of Large Language Models (Zhao et al., 2023+ updates) — a periodically-updated landscape view; skim the latest version.
The Bitter Lesson (Sutton, 2019 — an essay, not a paper) — why general methods that leverage compute beat domain-specific cleverness.
Building Effective Agents (Anthropic, 2024 — an engineering essay, not academic) — current best primer on agent patterns from a major lab.

Read these and you have the vocabulary to follow any newer paper. ~30 hours of reading total.

3. How to actually read a paper

You don't have to read papers like a textbook. The triage method:

About 90% of papers should stop at "bookmark and move on," and that is the right outcome. The remaining 10%, the ones you actually intend to use, you test before adopting.

4. The categories worth tracking

You don't need every paper, but knowing the categories helps:

Category	Cadence	Why
Frontier model technical reports	Per release	What's new in capability
RAG / retrieval methods	Monthly	This area is moving fast
Agent / planning architectures	Monthly	Also moving fast
Eval methodology	Quarterly	Slowly improving
Prompting techniques	Quarterly	Mostly diminishing returns
Long-context tricks	Quarterly	When you hit context limits
Safety / alignment	Quarterly	Slow but important
Quantization / efficient inference	As needed	Only relevant if you self-host

5. Where to find the worthwhile ones

arXiv cs.CL and cs.LG — the firehose. Use it via a curated filter, not directly.
Papers with Code — adds the "reproducible?" signal.
Latent Space podcast — weekly summary by people who read more than you can.
The Sequence / Import AI / The Batch — newsletters with paper digests.
AlphaSignal — daily AI papers + a brief, decent signal-to-noise.
Twitter / X lists (see Part IV-1) — researchers post their own papers; aggregators retweet the important ones.
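
The "curated filter" idea for the arXiv firehose can be sketched in a few lines of Python. This is a minimal illustration, not an arXiv client: the `PaperEntry` fields, the keyword list, and the function names are all assumptions made up for the example, and it expects you to have already pulled entries from whatever feed or newsletter you use.

```python
from dataclasses import dataclass


@dataclass
class PaperEntry:
    """One feed item. Field names are illustrative, not an official arXiv schema."""
    title: str
    abstract: str
    categories: tuple[str, ...]


# Hypothetical interests: replace with terms tied to what you are building now.
KEYWORDS = ("retrieval", "rerank", "long-context")
CATEGORIES = {"cs.CL", "cs.LG"}


def is_relevant(entry: PaperEntry) -> bool:
    """Keep an entry only if it is in a tracked category and mentions a keyword."""
    if not CATEGORIES.intersection(entry.categories):
        return False
    text = f"{entry.title} {entry.abstract}".lower()
    return any(keyword in text for keyword in KEYWORDS)


def curate(entries: list[PaperEntry]) -> list[PaperEntry]:
    """Return the subset of entries worth a closer look, preserving feed order."""
    return [entry for entry in entries if is_relevant(entry)]
```

A keyword match is a crude filter, but it keeps the reading list tied to problems you are actually hitting, which is the point of filtering at all.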

6. The lab blog posts are often better than the papers

For practical engineering, the major labs' engineering blog posts are often higher signal than their research papers:

Anthropic news / research — usually framed for engineers.
OpenAI cookbook — runnable patterns.
Google DeepMind blog — research + engineering posts.
Mistral docs / cookbook — concise, practical.

Engineering posts tell you "here's how to use this in your app." Papers tell you "here's why this exists." For shipping, the first is usually what you need.

7. The papers worth re-reading

A few papers reward re-reading at different career stages:

Attention Is All You Need: at year 0 (architecture overview), at year 2 (multi-head attention details), at year 4 (positional encoding choices).
Scaling Laws — at year 0 (the existence of the laws), at year 2 (why frontier-tier costs what it does).
The Bitter Lesson — annually. It's short, and the lesson is unintuitive enough that re-reading recalibrates your priors.
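
To make the Scaling Laws entry concrete, here is a small sketch of the parameter-count power law from Kaplan et al. (2020), L(N) = (N_c / N)^alpha_N. The constants are the approximate fitted values reported in that paper and should be treated as illustrative; the function names are made up for this example, and the formula only applies when data and compute are not the bottleneck.

```python
# Approximate fitted constants reported by Kaplan et al. (2020) for the
# parameter-count law L(N) = (N_c / N) ** alpha_N. Illustrative only.
ALPHA_N = 0.076
N_C = 8.8e13  # non-embedding parameters


def loss_from_params(n_params: float) -> float:
    """Predicted loss for a model with n_params non-embedding parameters."""
    return (N_C / n_params) ** ALPHA_N


def loss_ratio(scale_factor: float) -> float:
    """Multiplicative change in loss when parameter count grows by scale_factor."""
    return scale_factor ** -ALPHA_N


if __name__ == "__main__":
    for factor in (2, 10, 100):
        print(f"{factor:>4}x parameters -> loss multiplied by {loss_ratio(factor):.3f}")
```

Under these constants, doubling the parameter count lowers the predicted loss by only about 5%, and a 10x increase by about 16%. That small exponent is a large part of why each step up in frontier capability costs so much more than the last.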

8. When NOT to read papers

Specifically, don't:

Read a paper to "stay current" if it doesn't address something you're building. The cost of context-switching to academic prose outweighs the benefit.
Read 12 RAG papers before building your first RAG. Build first. Read after.
Read a paper to refute someone on Twitter. The expected ROI is negative.

9. The "what would I cite?" test

Useful self-check: in a technical discussion with another AI engineer, would you actually cite this paper to make a point? If not, the paper wasn't worth your time. If yes, you have remembered it and internalized it.

Most papers fail this test. The foundational ten pass it constantly.

10. The bibliography habit

Keep a simple ~/notes/papers-read.md — title, link, one-sentence takeaway, date.

After two years you have:

A scannable list of what you've read.
A reference when you need to cite something.
A growth artifact — early entries look unsophisticated; that's progress.
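
The log itself can be a one-function script. The sketch below assumes the `~/notes/papers-read.md` path from above and a one-line-per-paper Markdown layout; the layout and the function names are choices made for this example, not a required format.

```python
from datetime import date
from pathlib import Path
from typing import Optional

# Path taken from the text above; adjust if your notes live elsewhere.
LOG_PATH = Path.home() / "notes" / "papers-read.md"


def format_entry(title: str, link: str, takeaway: str, read_on: date) -> str:
    """Render one log line: date, linked title, one-sentence takeaway."""
    return f"- {read_on.isoformat()} [{title}]({link}): {takeaway}\n"


def log_paper(
    title: str,
    link: str,
    takeaway: str,
    read_on: Optional[date] = None,
    path: Path = LOG_PATH,
) -> None:
    """Append one entry to the log, creating the notes directory if needed."""
    read_on = read_on or date.today()
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as handle:
        handle.write(format_entry(title, link, takeaway, read_on))
```

Calling `log_paper("Attention Is All You Need", "https://arxiv.org/abs/1706.03762", "Introduces the transformer architecture.")` appends a dated line. Because the file is append-only, the early entries stay visible, which is what makes it useful as a growth artifact.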

Common mistakes

Where people commonly trip up

Trying to read everything new on arXiv. The volume is unsurvivable, and the large majority of papers are noise for whatever you are building. Filter ruthlessly.
Reading papers without building. You "understand" RAG from a paper but have never built one. Building is the test of understanding.
Treating engineering blog posts as inferior. For practical AI engineering, lab blog posts often beat papers — they're written for engineers, not reviewers.
Skipping the foundational ten. Reading newer papers without the foundations is like reading a sequel without having read the original.
Reading to "look smart." If the only audience for your reading is yourself-pretending-to-be-impressive, skip it. Read what you'll use.

→ Next: Communities and conferences — where production AI engineers actually congregate.
