The model

How an LLM works under the hood: tokens, embeddings, the transformer, and inference.