
Silver Surfer(Godot) + AI Reinforcement Learning DQN Model + Rule Based Script bot + Hybrid Python Agent

24 devlogs
90h 46m 17s

ME MAKE 2D SPACE GAME, ME BAD AT GAME, ME USE AI BEAT GAME
(。•̀ ‸ •́。)

This project uses AI

I used Copilot to help debug code, set up the Python RL model file, and build the UI

Demo Repository


Dhruv J

Finally, I'M DONE WITH EVERYTHING. It feels like a hell of a journey, and one I'm proud of. I mean, I did more than I set out to do: I CAME, I SAW, I CONQUERED. This is my last and final devlog, and I'm so grateful I could post this journey. Even though prolly no one has seen these, it was still fun posting them. I got the GitHub Pages site up and running, and I've written up and shown the demos in the GitHub wiki. And even though the failure of my DQN model was disappointing (it was literally the whole premise of this project), I came up with two other workarounds which lowkey smashed my expectations.

Dhruv J

FINALLY, this looks EPIC. Now I gotta upload all my data (the model file, code, video demo, CSV graph) to the GitHub wiki pages to actually demo the models, and I'll have to explain how each model works, what it does, blah blah. It's still a lot to do and I just want to get this project over with ASAP. So yeah, I've completed the main page: I added all the features, and even glowing edges on the bento boxes when the cursor hovers over them.

Attachment
Dhruv J

Just deployed my game to itch.io and made huge changes to my main page: the cursor is now a circle with a Gaussian blur inside it to give it a glass look, and I added smooth ease transitions when hovering over the bento boxes. But I'm facing major performance issues with the game on itch.io, which forces Godot's lower-end Compatibility renderer (it only supports WebGL), so it's really low quality compared to Forward+, and the game lags severely with extremely low frames per second. I'm too tired to even try optimizing it.

Attachment
Dhruv J

Well, uhm, it took 30 minutes to create the pause screen with functioning Pause and Quit buttons, but then it took 30 MORE MINUTES TO SET UP THE BACKGROUND BLUR? Tbf it's my first time doing this, and I never knew that making simple textures requires scripting knowledge in the shader editor. Anyway, I followed a tutorial and came across a bug, I MEAN FEATURE: because I'm using physics-based movement for the player, setting Engine.time_scale = 0 causes the player to randomly drift. I'm too lazy to figure out the exact reason, so it's a feature. You're in space, so of course you keep moving if you stop accelerating; I'll take that as a feature. And who pauses an endless runner anyway? Well, idk, but I had to come up with a pause menu regardless.

Attachment
Dhruv J

I used Figma, then switched to Canva for designing, and matched the Canva design in HTML and CSS. But it currently feels too plain for the main page, so I was thinking of simple bento boxes plus a new mouse cursor, which I've seen on multiple portfolio websites: simply replacing the ugly default cursor with a circle looks really beautiful. I've also worked on post-processing for the base game, along with difficulty tweaking.

Attachment
Dhruv J

So I made a hybrid agent that combines behaviour cloning (replaying the best run) with targeted exploration (trying different actions only where it previously died). A rule-based heuristic takes over after the explore window. This guarantees monotonic improvement: every episode either matches or beats the previous best score, and the agent efficiently learns to survive increasingly faster game speeds. It has components like death_frame, where it traces back 15 frames from its last best run and tries a new action, so it improves roughly linearly as it learns from each death (I froze my env because it was a pain). In this graph, RED represents the 'Rule-based script bot', PURPLE represents the 'hybrid agent' I just worked on, and YELLOW represents my initial 'RL DQN' model, which never really learnt even after 5500 episodes (it needs more).

Attachment
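For the curious, the core loop of that hybrid agent can be sketched in a few lines. This is a minimal sketch, not my actual code: the `HybridAgent` class name, the action set, and the `rule_policy` callback are illustrative; only the 15-frame backtrack, the replay/explore/heuristic split, and the keep-best rule come from the devlog.

```python
import random

BACKTRACK = 15  # explore starting 15 frames before the last death (from the devlog)

class HybridAgent:
    """Behaviour cloning + targeted exploration + rule-based fallback."""

    def __init__(self, actions=("left", "stay", "right")):
        self.actions = actions
        self.best_run = []        # action sequence of the best episode so far
        self.death_frame = None   # frame index where the best run ended
        self.best_score = float("-inf")

    def start_episode(self):
        self.frame = 0
        self.current_run = []

    def act(self, state, rule_policy):
        if self.death_frame is not None and self.frame < self.death_frame - BACKTRACK:
            action = self.best_run[self.frame]    # replay the best run (behaviour cloning)
        elif self.death_frame is not None and self.frame < self.death_frame:
            action = random.choice(self.actions)  # explore only near where we last died
        else:
            action = rule_policy(state)           # heuristic takes over past the explore window
        self.current_run.append(action)
        self.frame += 1
        return action

    def end_episode(self, score):
        # keep a run only if it matches or beats the best -> monotonic improvement
        if score >= self.best_score:
            self.best_score = score
            self.best_run = list(self.current_run)
            self.death_frame = len(self.best_run)
```

The nice property is that nothing random can make the agent worse: a losing exploration is discarded, a winning one becomes the new replay tape.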
Dhruv J

H̸͇̰̐̿͛̋́̒̈͒̌̊̔̀͝͠ͅĘ̵̰͉̩͎͇̞͕̼̞̪̳͚̠͑͊̑̾́̅̀̉̒̌̆̂̈́͂̆L̴̢̗̬̣̻̦̫̠̦͉̜̳͉̬̗̱̳͊̐̔̒͗P̶̧̱͉̟̻̳̺̜͉͎̬̱̈́́̕̚ͅͅ ̷̱̠͉͉̼̯̪̙̹̮̠̪̐́̈́́̆̀͂̊͠M̴̡̨̞̮̞̝͊͗̊͋̑͜E̶͙̿̍͋̌̉̎́͑̊͊̇͊̏ ̸̠͔̍̒̈́͗͂͘T̶̹̫̭͋̈́̑̈̅̕͘͝H̴͕͕̆̋̉͂͑͊͌̂̂͛̔͐͑́̔͊͘͘͜͠Ȩ̸̨̡̨̛͇͙̟̟͕̫͉̩̣͔̯̺͉̲̇́͆̽̀̊̅̒͝ͅ ̷̛͕͉̲͉̯̗̹̮̭̃̔͑̈̐̈́̇̀̾͌̄̊̈̓͝ͅV̶̡͙̑͑̈͆͛̋̌̍̈̅́̈́͝O̵̩͂̓̈́ͅĮ̶̹̖̭̙̺̘̞̜̞͙͑͋̋̎͒̒̎͂͝C̸̨̡͓͇̹̒͋͋̈͆̏̒͗̐͊̌͊̈̕E̵͚̜͉͒͌̋͘͘S̸̢̫̬͓̭̪̦̱̝̜̟̩͓͕̰̕͜ͅ ̸̱͑͝Ī̷͔̲̣͕͙̜͊̑̿̃̂̍͋̈́͌͂̿̋͋̕̚͘͝Ņ̶̥͍͈̖̳̮̪͓̤̗͐̽̽́͋̌͐͗̈́̎͘͘ ̴̨͉̰̞͚̗̻̰̪̟̱͗̃̾̇̔́̂̋̾̀̉̒͆̂̈́͝͝ͅͅͅM̴̮̾͗̈́̒̀̈́͂̋̚͠Y̶̛̘͈͖͙̬͈̦̗͊̄́́͒̅̃̕͜͝ͅͅ ̸̞̥̤̦̱͔͕̪͑̀̌̑̇̐͂͒̂̄͂̔͗͂̉͝͝H̵̛͍̞̝̮̳̼͓̩͂̂̐̑̄͒̑͗̃͘͝͝E̶͓̩̹̱̿̎̈́̇A̴̹͛̀̒͋̔̾̇͗̉̓̂͌͗́͊̓͘Ḑ̴̯̪̺͇̞̯̠̦̘̺͖͖̂̉̓̉͆̔̒́̅͠ͅͅ I can't seem to get my DQN model to perform. I trained for 1000+ more episodes, still to no avail. I'm guessing my learning signal is still poor and the reward feedback isn't strong enough, but I've tweaked that for hours and it barely seems to help, so now I'm just gonna demonstrate whatever I've tried. I think the episode count isn't enough, but I have no time to train for like 10,000 episodes. What's the graph, you may ask? idk, it's a mystery.

Attachment
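A weak learning signal is often a reward-density problem: if the only feedback is the score at death, most frames teach the network nothing. One common fix is per-frame reward shaping. The sketch below is purely hypothetical (the bonus and penalty values are made up, not this project's actual rewards), just to show the idea of denser feedback:

```python
def shaped_reward(died, passed_obstacle, changed_lane):
    """Per-frame reward that gives feedback long before the episode ends."""
    if died:
        return -10.0        # big penalty only at the death frame
    reward = 0.1            # small bonus for every frame survived
    if passed_obstacle:
        reward += 1.0       # clearing an obstacle is the real learning signal
    if changed_lane:
        reward -= 0.02      # tiny cost to discourage jittery lane switching
    return reward
```

With something like this, the Q-targets vary frame to frame instead of being flat until the terminal state, which tends to help a DQN that has plateaued.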
Dhruv J

The burnout is crazy. I'm done, I'm throwing in the towel. I can't seem to get my RL bot to train without plateauing, even after 5000+ episodes. Like, a 10-point increase? Are you insane? And even when I try using ChatGPT to help, it's a futile attempt, because ChatGPT keeps making syntactic mistakes and overconfidently comes up with ideas that set my project back. It feels like I've been going in circles of training and debugging and have only regressed so far. Guess I gotta burn the night away, locked in.

Attachment

Comments

bartoszkasyna 5 days ago

A basic reinforcement learning algorithm requires a large number of epochs. 5000 is definitely too little. If I were you, I would rent a good GPU and train on it.

Dhruv J 5 days ago

Really? Man, I can't do all that. Besides, I made a script bot that does way better. Just bummed that I spent like 40 HOURS on a Python DQN model and its learning plateaued anyway, so that's something on my side.

Dhruv J

Even though I trained my model for at least 10-11 hours (Hackatime doesn't track it at all), I look at the results and see the learning has plateaued. I just realised an RL model is not cut out for this kind of task. It's been a hell of a time trying to learn how to do it, and I ultimately gotta use another approach that's more suitable, and also use a frozen environment.

Attachment
Dhruv J

Well, I've been completely reconfiguring: trying to reuse the previous Python model with the updated environment. The thing is, it's really hard to train a blind model (I gave it a poor reward system) in a procedurally changing environment, because rather than memorising obstacle locations it's supposed to learn patterns. And yeah, I spent most of my time crashing out because my TCP connection would work and then fail, and I wasn't able to figure it out even using Copilot to diagnose it. Much later I figured out the handshake wasn't stable. I then started training my model for 300+ episodes, only to see my avg10 score increase from 30 to around 40 points, which is nowhere close to my 1000+ point target. So now I'm back to reconfiguring the model.

Attachment
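An unstable handshake like that can usually be tamed by retrying the connect and waiting for an explicit ready message before training starts. This is a sketch under assumed protocol: the HELLO/READY exchange, port, and timeouts are hypothetical, not what the Godot side actually sends.

```python
import socket
import time

def connect_with_retry(host="127.0.0.1", port=9999, retries=10, delay=0.5):
    """Retry until the Godot-side TCP server accepts and acknowledges us."""
    for _ in range(retries):
        try:
            sock = socket.create_connection((host, port), timeout=2.0)
            sock.sendall(b"HELLO\n")                    # hypothetical hello message
            reply = sock.makefile().readline().strip()  # block until an explicit ack arrives
            if reply == "READY":
                return sock
            sock.close()
        except OSError:
            time.sleep(delay)  # server not up yet; back off and retry
    raise ConnectionError("handshake never completed")
```

The point is that training only begins once both sides have confirmed the link, instead of assuming the first `connect()` that succeeds is actually usable.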
Dhruv J

I had to start back at square one and identify why my model was doing poorly. It took me a really long time until I figured out my environment was causing the problems: the game would put the player into "death traps" with no real escape, the model couldn't react in time, and a combination of a bunch of other things bottlenecked it. So anyway, I thought of a script bot instead, which in theory would be almost unbeatable. After a lot of obstacle tweaking, and making the reaction time and obstacle spacing dynamically change as the game got faster and harder, it reached my desired threshold of 1000+ points, which I don't think would be possible for a human player; I've reached like 500 points max. Well, now I gotta redo all my hard work on the RL model.

Attachment
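The core of a lane-dodging script bot really can be tiny. A minimal sketch, with an assumed lane encoding (`True` = blocked) and an assumed scaling rule for the reaction window; none of these names or numbers are from my actual bot:

```python
def reaction_frames(game_speed, base=12, minimum=3):
    """Shrink the reaction window as the game speeds up (assumed scaling rule)."""
    return max(minimum, int(base / game_speed))

def pick_lane(blocked, current):
    """Stay put if the current lane is clear, else hop to the nearest free lane."""
    if not blocked[current]:
        return current
    free = [i for i, b in enumerate(blocked) if not b]
    if not free:
        return current  # a "death trap": every lane is blocked, nothing to do
    return min(free, key=lambda i: abs(i - current))
```

The death-trap branch is exactly the failure mode that bottlenecked the RL model: when every lane is blocked, no policy, scripted or learned, can save the run, so the fix has to happen in obstacle spawning, not in the agent.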
Dhruv J

Well, everything that could go wrong went wrong. First of all, my earlier model got stuck: it did really well, but it couldn't score past 350 points, and sometimes it died on the first obstacle. I tried tweaking the epsilon, reaction time, and engine time scale to fit in more episodes and at least get my model to 500+ points, my personal high score. Even when I gave it certain rules it became a bit more consistent, but it still didn't reach the 500+ threshold. I realised the combination of the model, the variables, and a mismatch in the TCP communication caused all this. I then tried a different model (DQN) which did way worse, but at least it was a true RL model; my combination of variables and communication timing screwed it too. Worst of all, the hours spent training were wasted, and they weren't tracked by Hackatime.

Attachment
Dhruv J

I made HUGE PROGRESS with the Python RL algorithm, but I was really clueless about how to set it up, so I used Copilot to explain things and write the Python file responsible for sending inputs to Godot, because I didn't understand the math involved or what epsilon was. The Godot terminal outputs are in the attached image: they're the AI's scores, which were really good after like 30 steps. I also realised I hadn't been saving my model properly the whole time, and every time I killed the terminal it threw away my progress. Tbf the model is practically done; it just needs a lot of training. Once the game is done it's smooth sailing: all I gotta do is create a website to demonstrate both the game and the AI playing, which will take a while but is easy. The hard part is over now.

Attachment
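Losing progress on a killed terminal is exactly what periodic checkpointing prevents. A stdlib-only sketch of the pattern: a real DQN would save its network weights with its framework's own save call, and the field names here (`weights`, `episode`, `epsilon`) are illustrative; the part worth copying is the atomic write via a temp file.

```python
import os
import pickle

def save_checkpoint(path, weights, episode, epsilon):
    """Write the checkpoint atomically so a killed run can't leave a corrupt file."""
    tmp = path + ".tmp"
    with open(tmp, "wb") as f:
        pickle.dump({"weights": weights, "episode": episode, "epsilon": epsilon}, f)
    os.replace(tmp, path)  # atomic rename on the same filesystem

def load_checkpoint(path):
    """Resume from the last save, or return None on a fresh start."""
    if not os.path.exists(path):
        return None
    with open(path, "rb") as f:
        return pickle.load(f)
```

Calling `save_checkpoint` every N episodes means Ctrl-C only ever costs the frames since the last save, and saving epsilon alongside the weights lets exploration resume where it left off instead of restarting at 1.0.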
Dhruv J

It's going great so far 🤤. I've managed to set up a TCP connection between Python and Godot locally, and more importantly, made the Godot main and obstacle scripts update and send packets of variables, such as which obstacles are present in each lane and the current lane locations. It seems to work, with my Python file able to output random player inputs to Godot like a "random AI agent", per se: it's just a script bot that randomizes which lane it's gonna change into.

Attachment
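The packet format for that kind of link can be as simple as newline-delimited JSON. A sketch with assumed field names (`obstacles` and `lane` are illustrative, not my actual keys), plus the random agent on the Python end:

```python
import json
import random

LANES = 3  # assumed lane count for the endless runner

def encode_state(obstacles, lane):
    # one JSON object per line, so each TCP read splits cleanly on "\n"
    return (json.dumps({"obstacles": obstacles, "lane": lane}) + "\n").encode()

def decode_state(raw):
    return json.loads(raw.decode().strip())

def random_agent(state):
    # the "random AI agent": ignore the state entirely and pick any lane
    return random.randrange(LANES)
```

Framing each message with a trailing newline matters over TCP, since the stream has no message boundaries of its own; the reader just buffers bytes and splits on `"\n"` to recover whole packets.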