Shipped this project!
A complete repository containing my implementations of miniGPT, nanoGPT and nano-MGPT, coupled with usable examples and my trained models (trained on corbt/all-recipes) in the releases! It tested my knowledge of Transformer-architecture ML models and was a good refresher for my brain. Recipe data is good for basic training because of its predictable structure, and it's easy to check for hallucinations with a larger LLM, so I suggest that y’all who want to learn start with corbt/all-recipes or roneneldan/TinyStories (a more generic one). I am satisfied with how it turned out; my only regret is that my local computer couldn’t train a nano-MGPT, so optimising that is on my to-do list. Next time I will train more complex LLMs, e.g. dedicated RPG-type LLMs, and make my setup more efficient, potentially moving away from a generic GPT to a more compact or efficient Transformer-based architecture. <3 from 0681691
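If you do start with a recipe-style dataset, the first step in a nanoGPT-style pipeline is tokenizing the raw text. Here’s a minimal char-level sketch of that step (the recipe string is a made-up stand-in for corbt/all-recipes, and this is not the repo’s actual code):

```python
# Char-level tokenization, nanoGPT-style. "recipe_text" is a tiny
# hypothetical stand-in corpus, not real corbt/all-recipes data.
recipe_text = (
    "Title: Pancakes\n"
    "Ingredients: 1 cup flour, 1 egg, 1 cup milk\n"
    "Directions: Mix and fry until golden.\n"
)

# Build the vocabulary from every distinct character in the corpus.
chars = sorted(set(recipe_text))
stoi = {ch: i for i, ch in enumerate(chars)}  # char -> int id
itos = {i: ch for ch, i in stoi.items()}      # int id -> char

def encode(s: str) -> list[int]:
    """Map a string to a list of integer token ids."""
    return [stoi[c] for c in s]

def decode(ids: list[int]) -> str:
    """Map token ids back to the original string."""
    return "".join(itos[i] for i in ids)

ids = encode(recipe_text)
assert decode(ids) == recipe_text  # round-trip is lossless
print(f"vocab size: {len(chars)}, sequence length: {len(ids)}")
```

The predictable Title/Ingredients/Directions structure is exactly why this data is friendly for a small model: the same tokens recur in the same positions, so even a tiny Transformer picks up the format quickly.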