Vector search
Finding the k pieces of text most semantically similar to a query by comparing embedding vectors. The "find nearest neighbors in, say, 1,536-dimensional space" primitive.
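A minimal sketch of the nearest-neighbor primitive, assuming cosine similarity as the comparison and a brute-force scan. The three-dimensional vectors and the names `DOCS` and `top_k` are made up for illustration; real embeddings have hundreds or thousands of dimensions and usually sit behind an approximate index.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec, doc_vecs, k=2):
    """Return the ids of the k documents most similar to the query."""
    scored = [(cosine_similarity(query_vec, vec), doc_id)
              for doc_id, vec in doc_vecs.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [doc_id for _, doc_id in scored[:k]]

# Toy "embeddings" (hypothetical values, 3 dimensions for readability).
DOCS = {
    "cats": [0.9, 0.1, 0.0],
    "dogs": [0.8, 0.3, 0.1],
    "tax":  [0.0, 0.2, 0.9],
}

nearest = top_k([1.0, 0.2, 0.0], DOCS, k=2)
```

With these toy vectors the query sits closest to "cats" and "dogs", and "tax" is excluded.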
Hybrid search
BM25 (keyword) plus vector (semantic) search, blended. Each catches what the other misses. The 2026 production default.
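One common way to blend the two result lists is reciprocal rank fusion (RRF). The sketch below assumes RRF as the blending method, which the entry above does not specify; the function name and the constant `k=60` are illustrative choices, and the input lists stand in for real BM25 and vector search results.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of doc ids into one ranking.

    Each document earns 1 / (k + rank) from every list it appears in,
    so documents ranked well by both retrievers rise to the top.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Sort by fused score, breaking ties by id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))

bm25_results = ["a", "b", "c"]    # keyword ranking (hypothetical)
vector_results = ["b", "d", "a"]  # semantic ranking (hypothetical)

fused = reciprocal_rank_fusion([bm25_results, vector_results])
```

Here "b" and "a" appear in both lists and lead the fused ranking, while "d" and "c", each found by only one retriever, are still kept.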
Chunking strategies
Fixed-token vs semantic vs layout-aware vs hierarchical. Overlap, units, and why chunking affects RAG quality more than any other knob.
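A rough sketch of the simplest strategy, fixed-size chunks with overlap. It assumes whitespace-separated words as the unit, a stand-in for model tokens; `chunk_words` and its parameters are hypothetical names, not from any library.

```python
def chunk_words(text, size=4, overlap=1):
    """Split text into chunks of `size` words, each sharing
    `overlap` words with the previous chunk."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be >= 0 and smaller than size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks

chunks = chunk_words("a b c d e f g h", size=4, overlap=1)
```

The overlap means a sentence that straddles a boundary still appears whole in at least one chunk, at the cost of some duplicated text in the index.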
Reranking
Cross-encoder rerankers (Cohere Rerank, BGE reranker, Voyage rerank models). The "cheap retrieval -> expensive rerank" pattern that wins production RAG.
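The two-stage shape can be sketched without any model. In the example below both scorers are toy stand-ins: `cheap_score` counts query-word hits, and `expensive_score` checks for the exact phrase in place of a real cross-encoder. All names are hypothetical and no vendor API is being shown.

```python
def cheap_score(query, doc):
    """Stage 1 stand-in: count occurrences of query words in the doc."""
    doc_words = doc.lower().split()
    return sum(doc_words.count(w) for w in query.lower().split())

def expensive_score(query, doc):
    """Stage 2 stand-in for a cross-encoder: reward an exact phrase match."""
    return 1.0 if query.lower() in doc.lower() else 0.0

def retrieve_then_rerank(query, docs, n_candidates=3, k=1):
    # Stage 1: score every document cheaply, keep a small candidate set.
    candidates = sorted(docs, key=lambda d: cheap_score(query, d),
                        reverse=True)[:n_candidates]
    # Stage 2: run the costly scorer only on those candidates.
    return sorted(candidates, key=lambda d: expensive_score(query, d),
                  reverse=True)[:k]

DOCS = [
    "reset the router then reset the admin password",
    "how to reset password for your account",
    "password policy overview",
    "billing faq",
]

best = retrieve_then_rerank("reset password", DOCS)
```

The cheap stage alone would rank the router document first because it repeats "reset"; the second stage promotes the document that actually matches the query's intent.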
RAG basics
Retrieval-Augmented Generation — handing the model relevant documents at query time so it can answer from real data instead of guessing.
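A minimal sketch of the "hand the model documents at query time" step, assuming retrieval has already returned some passages. `build_rag_prompt` and the prompt wording are illustrative assumptions; the call to the model itself is left out because it depends on the provider.

```python
def build_rag_prompt(question, passages):
    """Assemble a prompt that grounds the answer in retrieved passages."""
    # Number the passages so the model (and the reader) can cite them.
    context = "\n\n".join(
        f"[{i}] {passage}" for i, passage in enumerate(passages, start=1)
    )
    return (
        "Answer the question using only the context below. "
        "If the context is insufficient, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

prompt = build_rag_prompt(
    "What is the refund window?",
    ["Refunds are accepted within 30 days.", "Shipping takes 5 days."],
)
```

The resulting string would then be sent to whichever model is in use; the instruction to admit insufficient context is one simple guard against guessing.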
Memory
Giving an LLM continuity across conversations — short-term, long-term, episodic, and the patterns that actually work in production.
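One possible shape for the short-term and long-term split, sketched under simplifying assumptions: short-term memory is a sliding window over recent turns, and long-term memory is a plain list of facts. The class and method names are hypothetical, and a production system would typically persist and retrieve long-term items rather than include all of them.

```python
from collections import deque

class ConversationMemory:
    def __init__(self, short_term_turns=3):
        # Short-term: only the most recent turns; older ones fall off.
        self.short_term = deque(maxlen=short_term_turns)
        # Long-term: facts kept across conversations (in memory here).
        self.long_term = []

    def add_turn(self, role, text):
        self.short_term.append((role, text))

    def remember(self, fact):
        if fact not in self.long_term:  # avoid storing duplicates
            self.long_term.append(fact)

    def context(self):
        """Text to prepend to the next model call."""
        facts = [f"fact: {fact}" for fact in self.long_term]
        turns = [f"{role}: {text}" for role, text in self.short_term]
        return "\n".join(facts + turns)

memory = ConversationMemory(short_term_turns=2)
memory.remember("User prefers metric units")
memory.add_turn("user", "hi")
memory.add_turn("assistant", "hello")
memory.add_turn("user", "distance to Paris?")
context = memory.context()
```

After three turns with a window of two, the first turn has dropped out of the context while the remembered fact is still present.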