
Transformers

3 devlogs
14h 58m 58s

Re-implementing transformers from scratch (and failing miserably at it 😢)

This project uses AI

AI was not used to generate the code itself, but i used it to learn concepts and clarify my doubts.
Other than that, 100% of the code IS WRITTEN BY ME.

Demo Repository


Pragnyan Ramtha Adapa

finally coded up the attention part from the “Attention Is All You Need” paper

how was the experience? awful

how many times did it fail? near infinity,

but finally it worked, as the matplotlib graphs were showing up correctly.
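for reference, the core equation i implemented, Attention(Q, K, V) = softmax(QKᵀ/√d_k)V, fits in a few lines of numpy. this is a generic sketch of the paper's formula, not the actual code from my repo:

```python
import numpy as np

def softmax(x, axis=-1):
    # subtract the row max for numerical stability
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    # Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)
    attn_w = softmax(scores, axis=-1)   # each row sums to 1
    return attn_w @ V, attn_w

rng = np.random.default_rng(0)
Q = rng.normal(size=(4, 8))   # 4 queries, d_k = 8
K = rng.normal(size=(6, 8))   # 6 keys
V = rng.normal(size=(6, 8))   # 6 values
out, attn_w = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (4, 8)
```

a quick sanity check like the one above (output shape right, attention rows summing to 1) is also a cheap way to verify the implementation before reaching for matplotlib.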

tho i haven’t uploaded anything yet, ill make all of this public soon! 😭

but since i’m done with this, i’m thinking of building something using only this library,

i’m thinking of building an sdk of sorts, so others can just use it. i also want people to be able to train the models themselves without that much compute. ideas are invited.
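on the low-compute training idea: if the models stay tiny, a plain numpy gradient-descent loop already runs fine on a laptop. a toy sketch of that kind of loop (entirely hypothetical, fitting y = 2x + 1, nothing from the repo):

```python
import numpy as np

# toy low-compute training loop: fit y = 2x + 1 by gradient descent on MSE
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=100)
y = 2 * x + 1 + rng.normal(scale=0.01, size=100)

w, b, lr = 0.0, 0.0, 0.1
for _ in range(500):
    pred = w * x + b
    grad_w = 2 * np.mean((pred - y) * x)   # dMSE/dw
    grad_b = 2 * np.mean(pred - y)         # dMSE/db
    w -= lr * grad_w
    b -= lr * grad_b

print(w, b)  # should land close to 2.0 and 1.0
```

the same structure (forward pass, gradients, parameter update) scales up to the transformer blocks; only the compute budget changes.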

Pragnyan Ramtha Adapa

i’m converting this from a scratch notebook to an actual library!

i’m implementing the encoder and decoder blocks and it’s soo much harder to actually implement when not following the guides 😭.
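to pin down what an encoder block even has to do, here’s a minimal single-head version in numpy. this is a sketch under my own simplifications (one head, no masking, no dropout), not the repo’s code:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def layer_norm(x, eps=1e-5):
    # normalize each position's features to mean 0, variance 1
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def encoder_block(x, Wq, Wk, Wv, Wo, W1, b1, W2, b2):
    # sublayer 1: self-attention with residual connection + layer norm
    Q, K, V = x @ Wq, x @ Wk, x @ Wv
    d_k = Q.shape[-1]
    attn = softmax(Q @ K.T / np.sqrt(d_k), axis=-1) @ V
    x = layer_norm(x + attn @ Wo)
    # sublayer 2: position-wise feed-forward (ReLU) with residual + layer norm
    ff = np.maximum(0, x @ W1 + b1) @ W2 + b2
    return layer_norm(x + ff)

rng = np.random.default_rng(0)
d, d_ff, seq = 16, 32, 5
x = rng.normal(size=(seq, d))
Wq, Wk, Wv, Wo = (rng.normal(size=(d, d)) * 0.1 for _ in range(4))
W1 = rng.normal(size=(d, d_ff)) * 0.1
b1 = np.zeros(d_ff)
W2 = rng.normal(size=(d_ff, d)) * 0.1
b2 = np.zeros(d)
y = encoder_block(x, Wq, Wk, Wv, Wo, W1, b1, W2, b2)
print(y.shape)  # (5, 16)
```

the decoder block is the same shape plus a masked self-attention sublayer and a cross-attention sublayer over the encoder output.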

either way, i can’t wait to turn this into a full-blown package. i’m thinking of implementing the perceptron next.
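the perceptron is a nice warm-up since the whole training rule is one line. a sketch of the classic Rosenblatt update, with labels in {-1, +1} and toy data i made up (not repo code):

```python
import numpy as np

def train_perceptron(X, y, epochs=20, lr=1.0):
    # classic perceptron rule: update only on misclassified points
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            if yi * (xi @ w + b) <= 0:   # wrong side (or on) the boundary
                w += lr * yi * xi
                b += lr * yi
    return w, b

# linearly separable toy data: an AND gate with {-1, +1} labels
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([-1, -1, -1, 1])
w, b = train_perceptron(X, y)
preds = np.sign(X @ w + b)
print(preds)  # matches y on this separable data
```

on linearly separable data like this, the update loop is guaranteed to converge in a finite number of mistakes.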

Pragnyan Ramtha Adapa

previously, i implemented transformers from scratch with the help of Andrej Karpathy and the “Attention Is All You Need” paper.

although i learnt a lot from this, i haven’t necessarily made anything useful.

hence i wanna make a python library that anyone can import and use to experiment around with LLMs.

this will mainly be a library of different architectures i wanna implement manually to actually get experience.
