A learning implementation that trains GPT-2 from scratch on WikiText-103 (inspired by NanoGPT). This project demonstrates the ability to implement transformer architectures and set up modern ML ...