---
title: "LLM From Scratch"
---

Build a language model from first principles. This tutorial takes you from basic tensor operations to a working code-generating model.

## Modules

| Module | Topic | Description |
|--------|-------|-------------|
| [00](modules/m00_intro/lesson.qmd) | Introduction | What is a language model? |
| [01](modules/m01_tensors/lesson.qmd) | Tensors | Shapes, broadcasting, operations |
| [02](modules/m02_autograd/lesson.qmd) | Autograd | Gradients, chain rule, backprop |
| [03](modules/m03_tokenization/lesson.qmd) | Tokenization | BPE algorithm, vocabulary |
| [04](modules/m04_embeddings/lesson.qmd) | Embeddings | Vector representations |
| [05](modules/m05_attention/lesson.qmd) | Attention | Self-attention, multi-head |
| [06](modules/m06_transformer/lesson.qmd) | Transformer | Decoder blocks, layer norm |
| [07](modules/m07_training/lesson.qmd) | Training | Loss, optimizers, batching |
| [08](modules/m08_generation/lesson.qmd) | Generation | Sampling strategies |

## Quick Start

```bash
# Install Quarto (macOS, via Homebrew)
brew install quarto

# Preview with live reload
quarto preview

# Or generate Jupyter notebooks
quarto render --to ipynb
```