A learning implementation that trains GPT-2 from scratch on WikiText-103, inspired by NanoGPT. This project demonstrates the ability to implement transformer architectures and set up modern ML ...