Shipped this project!
This was my first time building and training an LLM. It was definitely fun to see it improve day by day, but I ran into performance issues due to its small size. I also wanted to give it other capabilities, but I found TensorFlow somewhat limiting for what I wanted to do. I learned a lot from this project.
When I have more time, I want to redo this project with a custom NN library and a better device.