Re-implementing transformers from scratch (and failing miserably at it 😢)
AI was not used to generate the code itself; I only used it to learn concepts and clarify my doubts.
Other than that, 100% of the code IS WRITTEN BY ME.
Finally coded up the attention part from the "Attention Is All You Need" paper.
How was the experience? Awful.
How many times did it fail? Near infinity.
But finally it works, as the matplotlib graphs were showing up correctly.
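For reference, the attention part of the paper boils down to scaled dot-product attention. Here is a minimal NumPy sketch of that formula, Attention(Q, K, V) = softmax(QKᵀ/√d_k)V; all names and shapes below are illustrative assumptions, not the post's actual code:

```python
import numpy as np

def softmax(x, axis=-1):
    # subtract the row max for numerical stability before exponentiating
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    # scores has shape (seq_len, seq_len); scale by sqrt(d_k) as in the paper
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)
    weights = softmax(scores, axis=-1)  # each row sums to 1
    return weights @ V

rng = np.random.default_rng(0)
Q = rng.normal(size=(4, 8))  # 4 tokens, d_k = 8
K = rng.normal(size=(4, 8))
V = rng.normal(size=(4, 8))
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (4, 8): one output vector per token
```

The attention-weight matrix here is exactly the kind of thing you can `plt.imshow` to sanity-check the implementation visually.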
Though I haven't uploaded anything yet, I'll make all of this public soon! 😭
But since I'm done with this, I'm thinking of building something using only this library.
I'm thinking of building an SDK of sorts, so others can just use it. I also want them to be able to train the models themselves without that much compute; ideas are invited.
I'm converting this from a scratch notebook into an actual library!
I'm implementing the encoder and decoder blocks, and it's so much harder to actually implement when you're not following the guides 😭.
Either way, I can't wait to turn this into a full-blown package. I'm thinking of implementing the perceptron next.
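To give a sense of what an encoder block involves, here is a hedged single-head sketch of one encoder layer's forward pass in NumPy: self-attention, a residual connection with layer norm, then a two-layer feed-forward, again with residual and norm. Every name and dimension here is an illustrative assumption, not the library's actual code (multi-head attention and masking are omitted for brevity):

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    # normalize each token's features to zero mean, unit variance
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def encoder_layer(x, Wq, Wk, Wv, Wo, W1, W2):
    # single-head self-attention over the sequence
    Q, K, V = x @ Wq, x @ Wk, x @ Wv
    attn = softmax(Q @ K.T / np.sqrt(Q.shape[-1])) @ V
    x = layer_norm(x + attn @ Wo)      # residual + layer norm
    ff = np.maximum(0, x @ W1) @ W2    # position-wise ReLU feed-forward
    return layer_norm(x + ff)          # residual + layer norm

rng = np.random.default_rng(1)
d, d_ff, n = 16, 32, 5                 # model dim, hidden dim, 5 tokens
x = rng.normal(size=(n, d))
params = [rng.normal(size=s) * 0.1 for s in
          [(d, d), (d, d), (d, d), (d, d), (d, d_ff), (d_ff, d)]]
y = encoder_layer(x, *params)
print(y.shape)  # (5, 16): same shape in, same shape out
```

The shape-preserving property (input and output are both `(n, d)`) is what lets you stack these layers, and it's a cheap invariant to assert while debugging.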
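Since the perceptron is the next planned architecture, here is a hedged NumPy sketch of the classic Rosenblatt perceptron learning rule on a toy AND-gate dataset; the function names and data are made up for illustration and aren't from the package:

```python
import numpy as np

def train_perceptron(X, y, lr=0.1, epochs=20):
    # y must be in {-1, +1}; a bias is folded in via an appended column of 1s
    Xb = np.hstack([X, np.ones((len(X), 1))])
    w = np.zeros(Xb.shape[1])
    for _ in range(epochs):
        for xi, yi in zip(Xb, y):
            if yi * (xi @ w) <= 0:   # misclassified point -> nudge weights
                w += lr * yi * xi
    return w

def predict(X, w):
    Xb = np.hstack([X, np.ones((len(X), 1))])
    return np.sign(Xb @ w)

# linearly separable toy data: an AND gate with labels mapped to {-1, +1}
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([-1, -1, -1, 1])
w = train_perceptron(X, y)
print(predict(X, w))  # matches y: [-1. -1. -1.  1.]
```

Because the data is linearly separable, the perceptron convergence theorem guarantees this loop terminates with a perfect separator, which makes it a nice first architecture for a from-scratch library.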
Previously, I implemented transformers from scratch with the help of Andrej Karpathy and the "Attention Is All You Need" paper.
Although I learnt a lot from this, I haven't necessarily made anything useful.
Hence I want to make a Python library that anyone can import and use to experiment with LLMs.
This will mainly be a library of different architectures that I want to implement manually, to actually get experience.