
ML bunch

8 devlogs
14h 32m 28s

A mix of projects that I have done across all this time. Some projects may have started before December 2025 (and are still ongoing).
A bunch of ML projects that I worked on and published. These include data from the Indian government (thanks to UIDAI), Mathematics Olympiads (thanks to the Artificial Intelligence Mathematical Olympiad), and work from context-window research.

Why group them all together?

Because even though a lot of time went into creating them, I didn't necessarily code them, so it's not possible to measure them with Hackatime. Most of the training, inference, and fine-tuning happens on Kaggle and Colab; only recently did I learn that I can hook them up to local VS Code instances.

This project uses AI

AI is mostly useless around these parts, but I did use Perplexity to find lots of research papers that I needed to read.

Demo Repository


Pragnyan Ramtha Adapa

Oh, this is a good one, and a recent one too.

The Artificial Intelligence Mathematical Olympiad is a giant rat race of people trying to make a model that is really good at solving math problems. The first team whose AI model solves 47 out of 50 problems in a set wins around $250k, which is crazy, considering some people are just vibecoding over here.

I have made over 50 submissions across two months, experimented a lot, and learnt a lot.

I'll keep it short. First, I hoped to fine-tune a model so that it could learn mathematical problem solving and apply it in the competition, but that didn't work: it kept hallucinating stuff.

Then I tried using an already sophisticated model and giving it tools, and that failed too, because apparently you need to give these models a specific type of tool in a specific way, and only then does it work. So I went ahead and trained the model on how to actually use these tools, so that the tool instructions don't eat into its context window. A context window is basically the limit on how much an AI model can remember at once; if you make it remember a lot at once, it will hallucinate and eat up a lot of VRAM.
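To give an idea of what "a specific type of tool in a specific way" can look like, here is a toy dispatcher. The `<tool>` tag format, the single `python` calculator tool, and the JSON shape are all assumptions for illustration, not the competition's actual harness:

```python
import json
import re

# Made-up harness: the model must wrap exactly one JSON object in <tool>
# tags; anything else is treated as a direct answer.
TOOL_CALL = re.compile(r"<tool>(.*?)</tool>", re.DOTALL)

def run_python_tool(expr):
    # Toy calculator: evaluate a bare arithmetic expression.
    return str(eval(expr, {"__builtins__": {}}, {}))

TOOLS = {"python": run_python_tool}

def dispatch(model_output):
    """Extract the tool call, if any, and run the named tool."""
    m = TOOL_CALL.search(model_output)
    if m is None:
        return None                     # model answered directly
    call = json.loads(m.group(1))
    return TOOLS[call["tool"]](call["args"])

reply = 'Let me compute. <tool>{"tool": "python", "args": "47 * 50"}</tool>'
print(dispatch(reply))  # → 2350
```

If the model hasn't been trained to emit exactly this format, the call never parses, which is roughly the failure I kept hitting.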

Currently a lot of people are using something known as GPT-OSS-120B, a really good language model that can solve a lot of mathematical problems. I would say that I'm definitely heading in the right direction. The competition isn't over yet, and there is still a chance to do more.

Either way, I can't officially participate because I am under 18. That's why I'm blogging over here. If I could participate next year, I would definitely go for the win.

Attachment

Comments

Pragnyan Ramtha Adapa about 4 hours ago

this is still ongoing

Pragnyan Ramtha Adapa

This was one of the biggest projects I've worked on. It took me two months just to make it run. I had to switch between three platforms to build it, but finally, on the bleeding edge, it was possible: I had cloned myself into an AI model.

This is going to be a mega-thread, so only read if you’re interested!

I got the idea while learning about LoRA (Low-Rank Adaptation), which allows people to fine-tune AI models without needing a massive data center. You can use simple, small GPUs to fine-tune a decent model on any data you like. I wondered, “What if I make an LLM that clones my personality?” I wanted it to talk and feel like me. To get the data, I used an automation scraper to pull my WhatsApp and Instagram chats and converted them into JSONL format. This allowed me to train in epochs and batches, using CUDA cores to parallelize the process.
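To see why LoRA fits on small GPUs, here is back-of-the-envelope arithmetic for a single square weight matrix. The hidden size and rank below are assumed typical values, not the exact ones from my runs:

```python
# Back-of-the-envelope arithmetic for LoRA on one d x d weight matrix.
d = 4096          # hidden size of a 7B-class model (assumed)
r = 16            # LoRA rank (assumed; common values are 8-64)

full_params = d * d          # what full fine-tuning has to train
lora_params = 2 * d * r      # A is (r x d), B is (d x r)

print(full_params, lora_params, full_params // lora_params)
```

For this one layer, the low-rank factors need 128x fewer trainable parameters than the full matrix, which is what brings the optimizer state down to small-GPU territory.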

I started fine-tuning with normal PyTorch, but it was a mistake: it required 80 GB of VRAM. Even with DeepSpeed, I couldn't get it under 40 GB. I almost gave up, but then I researched QLoRA. By quantizing the model (reducing the precision of the numbers), I could run it on lower-grade hardware. After 20 days of fiddling with scripts, I finally got it to work.

Then came the hardware struggle. Kaggle’s 20 GB of space wasn’t enough to fuse the 100 MB adapters back into the 16 GB model because the process inflates the file size. My 2014 laptop couldn’t handle it either. Finally, I found a platform with enough storage to fuse the files and run them on Colab.

Does it run? Yes. Did it work at first? No. It sounded like a traumatized student repeating rote-learned phrases. I realized the AI didn't understand my conversational flow. To fix this, I used sequential learning, organizing the data by time of day (morning, afternoon, night) to give the model context.

I redid the process using Llama 3 8B, and it’s finally stable. If you’re interested in the details, DM me on Slack: @pragnyanramtha.

Attachment

Comments

Pragnyan Ramtha Adapa about 5 hours ago

Full version, because Flavortown won't allow me to add everything here.

This was one of the biggest projects I've worked on. It took me two months just to make it run. I had to switch between three platforms to build it, but finally, on the bleeding edge, it was possible: I had cloned myself into an AI model.

This is going to be a mega-thread, so only read if you’re interested!

I got the idea while learning about LoRA (Low-Rank Adaptation), which allows people to fine-tune AI models without needing a massive data center. You can use simple, small GPUs to fine-tune a decent model on any data you like. I wondered, “What if I make an LLM that clones my personality?” I wanted it to talk and feel like me. To get the data, I used an automation scraper to pull my WhatsApp and Instagram chats and converted them into JSONL format for the training stack. This allowed me to train in epochs and batches, using CUDA cores to parallelize the process for efficiency.
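A minimal sketch of the chat-to-JSONL step. The sender names and message contents here are made up; the real scraper output was messier:

```python
import json

# Hypothetical scraped turns: (sender, text) pairs from a chat export.
chat = [
    ("friend", "yo did you finish the model"),
    ("me", "almost, the adapter fusing keeps crashing kaggle lol"),
]

def to_jsonl(messages):
    """One JSON object per line; my own messages become the assistant turns."""
    lines = []
    for sender, text in messages:
        role = "assistant" if sender == "me" else "user"
        lines.append(json.dumps({"role": role, "content": text}))
    return "\n".join(lines)

print(to_jsonl(chat))
```

One-object-per-line is what makes it easy to stream the dataset in batches during training instead of loading everything at once.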

I started my first round of fine-tuning with normal PyTorch, but that was a huge mistake. It required 70–80 GB of VRAM, while the free limits are usually 16 or 32 GB. I turned to DeepSpeed, but even then, it needed 40–50 GB. I almost gave up, but I came back the next day and researched how to reduce VRAM usage. I read about LoRA again and tried training special sub-matrices, but the gradient activations still took 16 GB.

Eventually, I discovered quantization. By removing some precision from the decimal values in the model, it runs much better on lower-grade hardware. I used QLoRA, which quantizes the model before training. After 20 days of fiddling with scripts and overcoming my lack of knowledge regarding the Torch ecosystem, I finally got it to run.
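A toy version of the idea, snapping each weight to one of 16 evenly spaced levels. (Real QLoRA uses the NF4 scheme, which spaces the levels according to a normal distribution; this is just the intuition.)

```python
# Toy 4-bit quantization: map each weight onto one of 16 uniform levels
# between the tensor's min and max.
def quantize_4bit(weights):
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / 15                     # 16 levels -> 15 steps
    codes = [round((w - lo) / scale) for w in weights]
    return codes, lo, scale

def dequantize(codes, lo, scale):
    return [lo + c * scale for c in codes]

w = [-0.31, 0.02, 0.27, 0.11, -0.05]
codes, lo, scale = quantize_4bit(w)
restored = dequantize(codes, lo, scale)
# Each 4-bit code replaces a 16- or 32-bit float, so memory drops 4-8x,
# at the cost of a small rounding error in every weight.
print(max(abs(a - b) for a, b in zip(w, restored)))
```

The rounding error is bounded by half a step, which is why the quantized base model still behaves well enough to fine-tune adapters on top of it.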

Then came the hardware struggle. I found the LoRA adapters were only 100 MB, but to fuse them back into the main 16 GB model, Kaggle’s 20 GB of working space wasn’t enough because you essentially have to copy the model weights during the process. My 2014 laptop couldn’t handle it either. I tried Lightning Studio but burned through my credits instantly. Finally, I found Model Compute, which gave me the storage I needed to fuse the files and run them on Colab.
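The size asymmetry shows up even with tiny matrices: the adapter stores only the small factors A and B, but fusing computes W + B @ A, which materialises a full-size copy of the base weights (the part that blew past Kaggle's disk space). Toy sizes, plain Python instead of torch:

```python
# LoRA merge on toy matrices: the adapter is just A and B, but the
# merged result is as big as the base weight W itself.
def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

d, r = 4, 1                                   # real models: d ~ 4096
W = [[1.0] * d for _ in range(d)]             # frozen base weight (d x d)
B = [[0.5] for _ in range(d)]                 # learned factor (d x r)
A = [[0.1] * d]                               # learned factor (r x d)

delta = matmul(B, A)                          # rank-1 update, full d x d
W_merged = [[W[i][j] + delta[i][j] for j in range(d)] for i in range(d)]
print(W_merged[0])
```

On disk, A and B together scale with 2·d·r values while the merged matrix scales with d², which is the 100 MB adapter vs. 16 GB model gap.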

Does it run? Yes. Did it work at first? Absolutely not. It sounded like a traumatized student just repeating rote-learned phrases. I realized the AI was seeing random paragraphs and failing to predict my "smooth" conversational style. To fix this, I used sequential learning. I organized the data chronologically and divided batches into conversation pieces based on time of day (morning, afternoon, night) so the model had context.
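The bucketing step itself is simple; here is a sketch with made-up timestamps standing in for the scraped history:

```python
from datetime import datetime

def time_bucket(ts):
    """Map a message timestamp to a coarse part-of-day bucket."""
    hour = datetime.fromisoformat(ts).hour
    if 5 <= hour < 12:
        return "morning"
    if 12 <= hour < 18:
        return "afternoon"
    return "night"

# Made-up timestamps standing in for the real chat export.
messages = [
    ("2025-11-02T08:14:00", "gm"),
    ("2025-11-02T14:30:00", "lunch?"),
    ("2025-11-02T23:45:00", "one more epoch then sleep"),
]
buckets = {}
for ts, text in messages:
    buckets.setdefault(time_bucket(ts), []).append(text)
print(buckets)
```

Batching within a bucket keeps each training sequence stylistically coherent, instead of jumping between a 8 a.m. "gm" and a midnight rant.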

I went through the process again using Llama 3 8B, which was much more stable than Mistral 7B. After a massive amount of troubleshooting, I finally made it work. If you’re interested in the technical details, DM me on Slack: @pragnyanramtha.

Pragnyan Ramtha Adapa

This is a bit old, but Shell, the petroleum company, hosted a machine learning hackathon (which, against all odds, they named an AI hackathon?).

This code achieved 93% accuracy in it!

Basically, the data was really messy and it consisted of like ten different mixtures. We had to figure out which mixture would give us the highest efficiency, and the only way of knowing that was through burning them. They had an actual validation data set that they got recorded, and we had to guess, so we burned through many API credits and then finally found it. I was placed like 46th in that hackathon, I guess.
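I don't have the original code here, but the selection step can be sketched like this. The per-component scores and blend fractions below are invented stand-ins for the model's learned predictions:

```python
# Hypothetical per-fuel scores standing in for model predictions;
# blend efficiency is modeled as a simple weighted average.
component_scores = {"A": 0.62, "B": 0.85, "C": 0.41}

def predict_efficiency(blend):
    """Weighted average of component scores by blend fraction."""
    return sum(frac * component_scores[c] for c, frac in blend.items())

candidates = {
    "mix1": {"A": 0.5, "B": 0.5},
    "mix2": {"B": 0.8, "C": 0.2},
    "mix3": {"A": 0.3, "C": 0.7},
}
best = max(candidates, key=lambda name: predict_efficiency(candidates[name]))
print(best)  # the blend with the highest predicted efficiency
```

The real competition then compared the argmax against the recorded validation burns, which is where the guessing (and the API credits) came in.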

Attachment
Pragnyan Ramtha Adapa

Who likes space?

This was one of the most interesting projects I had done.
It was for a NeurIPS program; NeurIPS is one of the biggest ML publishing venues out there. Some people told me not to participate because of the age restrictions, though.

So I had to drop out in the middle, even though I had reached one of the top 50 places there.

Attachment
Pragnyan Ramtha Adapa

Predicting how many people have bank accounts in Africa and finding ways to increase that number

This was a big ride, and it was the first time I visited Zindi and checked out many of the projects there.

This was also the first time I got into competitive machine learning, which is a different field from classical machine learning, where everything is slow and systematic; here, you need to rush everything to get ahead of the competition. Most of this space is ruled by tree-based algorithms and tabular frameworks that give you immediate results.
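The building block those tree ensembles stack thousands of is a decision stump: one feature, one threshold, found by brute force. The feature and labels below are invented for illustration:

```python
# A decision stump: pick the single threshold on one feature that
# minimises classification errors.
def fit_stump(xs, ys):
    best_t, best_errs = None, len(ys) + 1
    for t in sorted(set(xs)):
        errs = sum((x >= t) != y for x, y in zip(xs, ys))
        if errs < best_errs:
            best_t, best_errs = t, errs
    return best_t

# Invented feature: monthly mobile-money transactions vs. has-bank-account.
xs = [0, 1, 2, 5, 7, 9]
ys = [False, False, False, True, True, True]
print(fit_stump(xs, ys))  # → 5, the split that separates the toy labels
```

Gradient-boosted libraries fit thousands of slightly deeper versions of this in seconds on tabular data, which is why they give the "immediate results" competitive ML runs on.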

Efficiency is the thing here.

Attachment
Pragnyan Ramtha Adapa

Some of the other mini projects I've done that failed miserably

Gambling with machine learning?

Basically, this was a project I did to predict the stock market; however, it failed horribly, and the noise was terrible.
Humans are truly random. No AI model can ever predict them.
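You can see why in a few lines: on a pure random walk, the naive "momentum" predictor (tomorrow moves the same way as today) is no better than a coin flip. This is a simulation, not market data:

```python
import random

random.seed(0)

# Simulate a pure random walk and score the naive strategy
# "tomorrow moves the same way as today".
steps = [random.choice([-1, 1]) for _ in range(10_000)]
hits = sum(a == b for a, b in zip(steps, steps[1:]))
accuracy = hits / (len(steps) - 1)
print(round(accuracy, 3))
```

The accuracy hovers around 0.5; if price moves were truly independent noise, no model, ML or otherwise, could beat that.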

Attachment
Pragnyan Ramtha Adapa

Unlocking Societal Trends in Aadhaar Enrolment and Updates: Anomaly Detection & Fraud Risk Prediction, a Data-Driven Approach to Identify Suspicious Patterns

research paper

I got some data from the Indian government, and they told us we could do anything we want with it, so I went through the data and got this.

I highly suggest that you go through the paper. It took me many weeks to make it. 😭

Attachment
Pragnyan Ramtha Adapa

Manga Colorization (still fetching data)

Link: here is the GitHub project

ML pipeline for colorizing black-and-white manga panels using reference-based style transfer. Provide a B&W panel and a colored reference image, and the model colorizes it.

Stack: Stable Diffusion 1.5 + ControlNet (lineart) + IP-Adapter (color transfer). Requirements: NVIDIA GPU with 8 GB+ VRAM (tested on RTX 4060), Python 3.11+.

Attachment