<aside> 📌

Notes from implementing GPT from scratch following Karpathy's tutorial, with extra annotations on shapes and intuitions.

</aside>

Notebook + Code


Google Colab

https://github.com/archit-manek/gpt_scratch

Data Loading


GPT Data Batching & Shapes


1. Tensors vs. Python Lists

# decode() expects a list of Python ints, so convert the tensor with .tolist() instead of looping over it directly
decode(out.tolist())
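A minimal sketch of why the conversion matters, assuming a character-level `decode` that looks up each id in a dict (as in the tutorial). The toy vocabulary and the `stoi`/`itos` names below are illustrative assumptions, not taken from the notebook. Iterating over a tensor yields 0-dim tensors rather than Python ints, and those do not match the integer keys of the lookup table.

```python
import torch

# Hypothetical toy vocabulary standing in for the tutorial's character-level one.
chars = sorted(set("hello world"))
stoi = {ch: i for i, ch in enumerate(chars)}
itos = {i: ch for ch, i in stoi.items()}

def encode(s):
    """Map a string to a list of integer token ids."""
    return [stoi[c] for c in s]

def decode(ids):
    """Map an iterable of Python ints back to a string."""
    return "".join(itos[i] for i in ids)

out = torch.tensor(encode("hello world"), dtype=torch.long)

# Looping over the tensor yields 0-dim tensors, which are not found
# among the int keys of `itos`, so the lookup fails.
try:
    decode(out)
    lookup_failed = False
except (KeyError, TypeError):
    lookup_failed = True

# .tolist() converts the whole tensor to plain Python ints in one call.
text = decode(out.tolist())
print(lookup_failed, text)
```

For a batched output of shape (B, T), index a single row first (for example `out[0].tolist()`), since `decode` works on one sequence at a time.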

2. Key Dimensions (B, T, C)

In mechanistic interpretability, always track the shapes: