<aside> 📌

Notes from implementing GPT from scratch following Karpathy's tutorial, with extra annotations on shapes and intuitions.

</aside>

Notebook + Code

Data Loading

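The Issue: PyTorch tensors (e.g., tensor(5)) are not Python integers (5). Iterating over a tensor yields 0-d tensors, which plain Python code (such as a dict lookup keyed by int) does not treat as integers.
The Fix: Call .tolist() to convert the tensor to plain Python values before passing it to standard Python functions (like string decoders).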

# Convert to a list of Python ints first; don't loop over the tensor directly
decode(out.tolist())
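
A minimal sketch of the difference, assuming a character-level decoder built from a lookup dict. The names text, stoi, itos, encode, and decode here are illustrative stand-ins, not necessarily the notebook's exact definitions:

```python
import torch

# Hypothetical character-level vocabulary, for illustration only.
text = "hello world"
chars = sorted(set(text))
stoi = {ch: i for i, ch in enumerate(chars)}   # char -> int
itos = {i: ch for ch, i in stoi.items()}       # int -> char

def encode(s):
    """Map a string to a list of integer token ids."""
    return [stoi[c] for c in s]

def decode(ids):
    """Map a list of Python ints back to a string."""
    return "".join(itos[i] for i in ids)

# Stand-in for model output: a 1-D tensor of token ids.
out = torch.tensor(encode("hello"), dtype=torch.long)

# Iterating over the tensor yields 0-d tensors, not Python ints.
first = next(iter(out))
print(type(first))

# .tolist() converts the whole tensor to plain Python ints.
ids = out.tolist()
print(type(ids[0]))

decoded = decode(ids)
print(decoded)
```

For a single-element tensor, .item() does the same job as .tolist() and returns one Python number.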

In mechanistic interpretability, always track the shapes: