
Assetto Corsa Reinforcement Learning

15 devlogs
59h 20m 28s


A reinforcement learning agent (using the SAC algorithm) learns to drive a Formula 1 car around the Monaco GP circuit in Assetto Corsa. The agent controls steering, acceleration, and braking by interacting with the track, receiving feedback, and improving over time.

This project uses AI

GitHub Copilot for code completions, algorithm implementation checks (unfortunately, I do not have a degree in mathematics), and documentation generation (including docstrings).

Demo Repository


ved patel

Ok, now it’s working. At least, somewhat.

Finally, I implemented Torch multiprocessing. Essentially, “learning” is now asynchronous from “collecting”: the CPU is always running the model and playing actions in the AC environment, while the GPU is constantly updating the weights based on environment observations. This was SUCH A PAIN to set up!!!!! (Look at the diagram for a better visualization; in the diagram’s case, sampling would be done at the same time as training.)
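A minimal sketch of that producer/consumer split, using plain threads and a queue for brevity. The real setup uses torch.multiprocessing with shared model weights; everything named here (collector, learner, the transition dict) is illustrative, not the project's actual code.

```python
import queue
import threading

def collector(replay: queue.Queue, n_steps: int) -> None:
    """CPU side: roll out the policy and push transitions into the buffer."""
    for step in range(n_steps):
        # stand-in for model.act(obs) and env.step(action)
        transition = {"obs": step, "action": step % 3, "reward": 1.0}
        replay.put(transition)
    replay.put(None)  # sentinel: collection finished

def learner(replay: queue.Queue) -> int:
    """GPU side: pop transitions and run gradient updates concurrently."""
    updates = 0
    while True:
        batch = replay.get()
        if batch is None:
            break
        updates += 1  # stand-in for loss.backward(); optimizer.step()
    return updates

replay_buffer: queue.Queue = queue.Queue(maxsize=64)
t = threading.Thread(target=collector, args=(replay_buffer, 100))
t.start()
total_updates = learner(replay_buffer)
t.join()
print(total_updates)  # 100
```

The bounded queue is the important part: it applies backpressure so neither side races arbitrarily far ahead of the other.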

TORCHRL NEEDS BETTER SUPPORT FOR MULTIPROCESSING!!!!

Anyway, rewards are going up. So, there’s that. Hopefully, this is the run. Please, please, please.


Comments

ved patel 16 days ago

UPDATE: rewards have reached 399! It’ll hopefully only be a few weeks until this gets up and running!

ved patel

Ok, ok, I know the 10h looks crazy.

I REALLY, REALLY wanted a functional AI for this devlog, but I was unable to make that happen 😔.

Right now, it looks like the fundamental training script (train_core.py) is broken. It worked for car-racing, but it seems to have stopped working for AC (probably because of a wack previous commit).

The fundamental problem I’m having is that it’s just not learning. I tried SAC-BC, but it seems subtle differences between recording, training, and testing (SIM2REAL, but SIM2SIM in this case?) are enough for the model to freak out. Most of the time, it cannot turn in time, and it turns less than needed.

I decided pure BC will not happen, against my wishes. Instead, I’ll be sacrificing my computer for ANOTHER 3 days (total compute is probably ~7 days by now. Does Flavortown give electricity grants?).

Anyway, when I tried Monaco overnight, the rewards looked abysmal. It’s never been this bad before. It NEVER improved AT ALL and spiraled into a horrible, horrible policy. I think I fixed it. Maybe.

Also, I got lazy trying to record ~30 laps of the track EACH TIME I WANT TO TRAIN BC. So I found a workaround using the Assetto Corsa AI to drive around for me. Essentially, I used the AC AI to drive around to record demonstrations, which I will train MY AI on. Wow.
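A hypothetical sketch of that demonstration-harvesting loop: let the built-in AC AI drive, poll telemetry each tick, and log (observation, action) pairs for behavior cloning. `read_state` and its fields are stand-ins for the real socket, not the actual schema.

```python
def record_demonstrations(read_state, n_steps):
    """Collect (obs, action) pairs while the AC AI drives."""
    dataset = []
    for _ in range(n_steps):
        state = read_state()  # stand-in for reading telemetry over the socket
        obs = {"speed": state["speed"], "pos": state["pos"]}
        action = {"steer": state["steer"], "throttle": state["throttle"]}
        dataset.append((obs, action))
    return dataset

# fake telemetry source for illustration
ticks = iter(range(5))
fake_state = lambda: {"speed": 100.0, "pos": next(ticks),
                      "steer": 0.1, "throttle": 0.9}
demos = record_demonstrations(fake_state, 5)
print(len(demos))  # 5
```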

The goal was to take this semi-garbage AI and train it online to make a good AI.

However, Monaco’s AI is kinda horrible and kept driving into walls, so I’m training on Brazil for now, since it’s an easier track for the model to learn AND the Brazil AI doesn’t drive into walls.

(I’ll upload the gameplay of this heinous, atrocious, awful, terrible, appalling, vile, detestable AI later, but it’s worse than you think)

PLEASE MAKE THIS AI WORK!!!!!

I’VE NEVER TRIED THIS HARD TO MAKE SOMETHING WORK

PLEASEEEEEEEEEEEEEEEEEEEEEEEEEE

ved patel

Turns out what I said in my previous devlog was wrong. Assetto Corsa automatically maps controller inputs exponentially rather than linearly, which was causing some of my issues. Make sure your controller settings match mine to avoid the same problem (I’ll add another vid in the docs to fix this).
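Assuming the game applies a gamma-style curve `out = sign(x) * |x|**gamma` to the raw controller value (the exact curve here is a guess for illustration), the agent can pre-warp its linear steering command with the inverse curve so the in-game steering comes out linear:

```python
def linearize(target: float, gamma: float = 2.0) -> float:
    """Raw controller value to send so the in-game steering equals target."""
    sign = 1.0 if target >= 0 else -1.0
    return sign * abs(target) ** (1.0 / gamma)

def game_curve(raw: float, gamma: float = 2.0) -> float:
    """Stand-in for the exponential mapping the game applies internally."""
    sign = 1.0 if raw >= 0 else -1.0
    return sign * abs(raw) ** gamma

# round-trip: the game's curve applied to the pre-warped value gives the target back
print(round(game_curve(linearize(0.5)), 6))   # 0.5
print(round(game_curve(linearize(-0.3)), 6))  # -0.3
```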

I’m now using SAC-BC with this actor objective: E[Q(s,a) + H(π(·|s))] + λ · E[log π(a|s)]
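A numeric sketch of that objective using batch-mean estimates, where `alpha` is the entropy temperature (folded into the H term above) and `lam` weights the behavior-cloning log-likelihood over demonstration actions; all values below are made up for illustration.

```python
def sac_bc_actor_objective(q_values, entropies, demo_log_probs,
                           alpha=0.2, lam=1.0):
    """Batch mean of Q + alpha*H over policy samples, plus lam * mean BC log-prob."""
    sac_term = sum(q + alpha * h for q, h in zip(q_values, entropies)) / len(q_values)
    bc_term = sum(demo_log_probs) / len(demo_log_probs)
    return sac_term + lam * bc_term

# toy batch: two policy samples and two demonstration log-probs
print(round(sac_bc_actor_objective([1.0, 2.0], [0.5, 0.5], [-1.0, -3.0]), 6))  # -0.4
```

In practice the actor maximizes this (minimizes its negative); the BC term pulls the policy toward the demonstrated actions while Q and entropy drive online improvement.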

Currently training it and the results look promising so far. If this works out, I’ll move on to porting from offline RL to online RL.

ved patel

We’re using CRSfD for RL with human demonstrations (mostly pre-learning from recorded laps). I’m still testing it; if it doesn’t work, I’ll probably try SAC(λ), but that’s not preferred since it’s more complex and I’d have to rewrite a lot.

I also updated the docs and made major changes to the CLI so it actually works this time. Previously, the scripts weren’t showing up on PyPI for some reason; that’s fixed now, so it should work.

(I’m not sure why the agent doesn’t turn in the video. It’s not an env→AC issue; the agent is just outputting extremely small steering values, and I’m not sure what’s going on.)

ved patel

I linked another repo to this project, the actual AC app, which communicates with my training script.

Essentially, it creates a socket between the training script and the game, allowing information to be passed back and forth. The app takes an input indicating whether to reset the environment and outputs data such as speed, velocity, position, tire temperatures, and current vehicle damage.
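A hypothetical sketch of the message framing such a socket might use, assuming JSON payloads with a 4-byte length prefix; the field names below are illustrative, not the app's actual schema.

```python
import json
import struct

def encode_message(payload: dict) -> bytes:
    """Serialize a message: 4-byte big-endian length prefix + JSON body."""
    body = json.dumps(payload).encode("utf-8")
    return struct.pack("!I", len(body)) + body

def decode_message(data: bytes) -> dict:
    """Read the length prefix, then parse exactly that many JSON bytes."""
    (length,) = struct.unpack("!I", data[:4])
    return json.loads(data[4:4 + length].decode("utf-8"))

# game -> trainer: telemetry (field names are assumptions)
telemetry = {"speed_kmh": 212.4, "tyre_temp_c": [88, 90, 86, 87], "damage": 0.0}
assert decode_message(encode_message(telemetry)) == telemetry

# trainer -> game: reset command
reset_cmd = {"reset": True}
assert decode_message(encode_message(reset_cmd)) == reset_cmd
```

The length prefix matters because TCP is a byte stream: without it, a reader can't tell where one JSON message ends and the next begins.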

ved patel

I fixed several issues for shipping. These changes included adding a CLI for training, loading, and whatever the hell else someone might need, plus updating the documentation to match.

I also started training in AC. It’s taking a long time, so I’ll need to improve sample efficiency, likely by using RLHF.

ved patel

I created documentation using https://docusaurus.io/. The AC environment is almost complete; it’s now in the stress-testing phase until I’m confident it can run overnight (it will probably still break). Unfortunately, running multiple AC environments isn’t practical, since AC doesn’t easily support multiple clients. The simplest workaround would be a virtual machine, but that requires too much compute, so we’re sticking with a single environment.

ved patel

YAY! I finished car racing! Now I can start training with Assetto Corsa! The model works way better than expected. It’s a bit jerky right now, but that’s something I’ll fix later.

There was a big alpha issue that was stopping the model from learning anything useful, so I’m really glad I did this “test run” before moving to the 3D game.

I also saw a research paper on learning in 3D games where you first train the model to maximize pixel-wise change in a specific area, then train it on the actual reward. The researchers found this works much faster because the model already understands the 3D environment before reward training. I’ll most likely try this.
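The pretraining reward from that paper can be sketched as the mean absolute difference between consecutive frames inside a region of interest; this is my reading of the idea, with plain nested lists standing in for image arrays.

```python
def pixel_change_reward(prev_frame, frame, roi):
    """Mean absolute pixel change inside roi = (top, left, height, width).

    prev_frame / frame are 2D grayscale frames as lists of lists.
    """
    top, left, h, w = roi
    total = 0.0
    for y in range(top, top + h):
        for x in range(left, left + w):
            total += abs(frame[y][x] - prev_frame[y][x])
    return total / (h * w)

# one pixel changed by 10 in a 2x2 region -> mean change 2.5
a = [[0, 0], [0, 0]]
b = [[10, 0], [0, 0]]
print(pixel_change_reward(a, b, (0, 0, 2, 2)))  # 2.5
```

An agent maximizing this learns to make things happen on screen (move, steer) before any track-specific reward is introduced.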

Right now, I’m working on building the Gymnasium environment for Assetto Corsa and figuring out the reward function, since this one has to be custom. Ideally, I’ll have training running on AC by Sunday; we’ll see!
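One plausible shape for such a custom reward, assuming telemetry provides per-step lap-progress delta, speed, an off-track flag, and a damage delta; the weights here are made up for illustration, not the project's tuned values.

```python
def step_reward(progress_delta, speed_kmh, off_track, damage_delta):
    """Per-step reward sketch from AC telemetry (all weights are assumptions)."""
    reward = 100.0 * progress_delta   # main signal: progress along the lap
    reward += 0.01 * speed_kmh        # small shaping term: prefer carrying speed
    if off_track:
        reward -= 1.0                 # discourage leaving the track
    reward -= 10.0 * damage_delta     # hitting walls is heavily penalized
    return reward

# clean step: 0.2% lap progress at 150 km/h, on track, no new damage
print(round(step_reward(0.002, 150.0, False, 0.0), 6))  # 1.7
```

Using the *delta* of progress and damage (rather than absolute values) keeps the reward dense and prevents the agent from farming a static state.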

ved patel

WOW! Great progress. The agent is LEARNING!

I used a pretrained VAE for the encoder, training it to compress and decompress the game frames. The most compressed layer is halved, and the encoder output is fed as input to the model.

Next, I’ll try a target VAE, because the encoder is currently frozen (static), which prevents the model from updating its parameters during RL training. This will hopefully push performance to 800+ reward, but still, ~250 reward is excellent progress!
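A target encoder can borrow the soft (Polyak/EMA) update used for target Q-networks: the online encoder trains freely while a slow-moving copy feeds the policy, so the representation keeps learning without shifting abruptly. A minimal sketch, with plain lists standing in for parameter tensors:

```python
def polyak_update(target_params, online_params, tau=0.005):
    """In-place soft update: target <- (1 - tau) * target + tau * online."""
    for i, (t, o) in enumerate(zip(target_params, online_params)):
        target_params[i] = (1.0 - tau) * t + tau * o
    return target_params

target = [0.0, 0.0]
online = [1.0, 2.0]
polyak_update(target, online, tau=0.5)  # large tau just to make the change visible
print(target)  # [0.5, 1.0]
```

With a small tau (e.g. 0.005), the target tracks the online encoder over thousands of steps instead of jumping every update.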

ved patel

Ok, so it didn’t work yet. The model isn’t improving as fast as I want it to, but there’s a lot of promise in the new runs. Rewards are now increasing consistently since I fixed those pesky bugs.

I also made some progress on the actual Assetto Corsa app, which is in a separate GitHub repo: https://github.com/ved-patel226/AssetoCorsaRL-APP.

Currently, the app opens a socket and allows the user to reset the car to the track. I still need to figure out how to run multiple instances and, if possible, speed up the game tick rate, ideally by 9×.


Comments

ved patel 2 months ago

Update: we just broke the policy break-even point for the first time (positive rewards!)

ved patel

I completely refactored the codebase to make it easier to port to Assetto Corsa when the time comes, and I also started training on what is hopefully the final CarRacing model. If this goes well, I can start working on Assetto Corsa.

While it is training, I’m starting work on the Assetto Corsa app that will provide telemetry for me and the AI, and allow the car to be reset.

Track the run in the new Weights & Biases report: https://api.wandb.ai/links/ved-patel226-/uh789qod

ved patel

Training started working! However, rewards are going down 😞.

I implemented noisy layers and fixed those nasty environment bugs.

Parallel environments work now!!
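Noisy layers (NoisyNet-style exploration) give each weight a learned mean and a learned noise scale, with fresh Gaussian noise sampled per forward pass. A single-neuron, plain-Python sketch of the idea (the real layers are torch modules):

```python
import random

class NoisyNeuron:
    """One output unit with learnable weight means and noise scales."""

    def __init__(self, n_inputs, sigma0=0.5):
        self.w_mu = [0.1] * n_inputs                        # weight means
        self.w_sigma = [sigma0 / n_inputs ** 0.5] * n_inputs  # noise scales
        self.b_mu, self.b_sigma = 0.0, sigma0

    def forward(self, x, rng):
        # fresh noise every call: exploration comes from the weights themselves
        out = self.b_mu + self.b_sigma * rng.gauss(0, 1)
        for xi, mu, sigma in zip(x, self.w_mu, self.w_sigma):
            out += (mu + sigma * rng.gauss(0, 1)) * xi
        return out

rng = random.Random(0)
neuron = NoisyNeuron(n_inputs=3)
# two forward passes on the same input differ because noise is resampled
a = neuron.forward([1.0, 2.0, 3.0], rng)
b = neuron.forward([1.0, 2.0, 3.0], rng)
print(a != b)  # True
```

Because the noise scales are learned, the network can anneal its own exploration, which replaces hand-tuned action noise.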

ved patel

Instead of trying to write my own RL algorithms from scratch (which wasn’t fun), I’m pivoting to using TorchRL. It still gives me a lot of flexibility with my environments and isn’t as constricting as something like Stable Baselines.

I’m still getting this nasty error, though, and SAC has few examples. So I’ll have to rawdog it and hope for the best.

We’re still not in the Assetto Corsa realm yet. I’ll train a good model on CarRacing first, which is a simpler 2D version of racing, then try to port it to the almighty.

Here’s the progress on CarRacing so far:


Comments

ved patel 3 months ago

EDIT: working on parallel environments now! (this is the hard part)