AI KARATE SENSEI banner

AI KARATE SENSEI

10 devlogs
56h 55m 30s

This project is a karate AI that can track your movements using MediaPipe. By looking at the positioning of your body and the angles of your arms and legs, it can judge how good your stance and positioning are, then correct you and help you improve.
(this is my progress atm)

In the future, I hope to add hardware to it, for example adding sensors and making the camera move around so that it can detect your movements while you move around. Force sensors in your shoes plus movement and acceleration sensors would let it detect your positioning more accurately. Additionally, I will train my own model on data collected from experts for higher accuracy.

yibo

I am working on the inference for my project, using the LSTM model. Currently it decides what action you are doing through the model, along with a percentage of "correctness". I am trying to get it working at the moment, but it doesn't seem to be working very well. For now I am generating feedback using simple if statements that check whether the joint angles are correct, similar to what I was doing before.

However, I am having trouble telling whether it is my inference script that is wrong or the model that isn't working, as it struggles to determine if the action I'm doing is really a punch or just idle. It will often misinterpret me sitting still testing as a punch that is 100% correct. Some feedback I got recommended adding a 30-frame buffer at the start, but that didn't really work. I also made it simpler by pressing the spacebar to trigger detection instead of automatically detecting movement, but that didn't really work either.

For now I want to focus on testing whether it's the model's fault or mine, so I will strip it down to its basics, try a different approach for the inference script, and just check if the model works at all. If not, I will have to re-code it.
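A minimal sketch of a stripped-down test along these lines: record exactly one 30-frame clip when the spacebar is pressed, then run a single prediction and print the class and its softmax confidence. The file names action_model.h5 and label_map.npy are placeholders for whatever the training script saved.

```python
import cv2
import numpy as np
import mediapipe as mp
from tensorflow.keras.models import load_model

model = load_model("action_model.h5")                            # placeholder name
label_map = np.load("label_map.npy", allow_pickle=True).item()   # e.g. {0: "punch", 1: "idle"}

mp_pose = mp.solutions.pose
pose = mp_pose.Pose()

def frame_to_features(results):
    """Flatten 33 pose landmarks (x, y, z, visibility) into 132 values."""
    if results.pose_landmarks is None:
        return np.zeros(132)
    return np.array([[lm.x, lm.y, lm.z, lm.visibility]
                     for lm in results.pose_landmarks.landmark]).flatten()

cap = cv2.VideoCapture(0)
clip, recording = [], False

while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    results = pose.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))

    if recording:
        clip.append(frame_to_features(results))
        if len(clip) == 30:                              # same clip length as training
            x = np.expand_dims(np.array(clip), axis=0)   # shape (1, 30, 132)
            probs = model.predict(x)[0]
            idx = int(np.argmax(probs))
            print(f"{label_map[idx]} ({probs[idx] * 100:.1f}% confident)")
            clip, recording = [], False

    cv2.imshow("inference test", frame)
    key = cv2.waitKey(1) & 0xFF
    if key == ord(' '):                                  # spacebar starts one recording
        clip, recording = [], True
    elif key == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()
```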

Attachment
Attachment
Attachment
0
yibo

I finished writing the script for training my multi-stance movement model. Currently, when you finish recording data with the data trainer, it formats the data into folders for each move (one folder for each time you trained it).
Firstly, the training script opens the DATA_PATH folder and loads all the sequences from each individual action folder; each sequence is 30 frames with 132 features per frame, and each action is assigned a number as its label. Then it uses 80% of the data to train the model (with the Adam optimiser, categorical cross-entropy loss and accuracy as the metric); I am training an LSTM model. The remaining 20% of the data is used for testing. It trains for 50 epochs and shuffles the data each epoch. Then it saves the model and label mapping, ready to use.
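Roughly, a training script with that shape could look like the sketch below. The folder name MP_Data and the saved file names are assumptions; everything else (30x132 sequences, 80/20 split, Adam, categorical cross-entropy, 50 epochs with shuffling) follows the description above.

```python
import os
import numpy as np
from sklearn.model_selection import train_test_split
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense
from tensorflow.keras.utils import to_categorical

DATA_PATH = "MP_Data"                       # placeholder folder name
actions = sorted(os.listdir(DATA_PATH))     # one sub-folder per move
label_map = {action: i for i, action in enumerate(actions)}

sequences, labels = [], []
for action in actions:
    action_dir = os.path.join(DATA_PATH, action)
    for sample in os.listdir(action_dir):   # one folder per recorded attempt
        frames = [np.load(os.path.join(action_dir, sample, f"{i}.npy")) for i in range(30)]
        sequences.append(frames)            # shape (30, 132)
        labels.append(label_map[action])

X = np.array(sequences)
y = to_categorical(labels)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

model = Sequential([
    LSTM(64, return_sequences=True, input_shape=(30, 132)),
    LSTM(64),
    Dense(len(actions), activation="softmax"),
])
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
model.fit(X_train, y_train, epochs=50, shuffle=True, validation_data=(X_test, y_test))

model.save("action_model.h5")                                    # placeholder name
np.save("label_map.npy", {i: a for a, i in label_map.items()})   # index -> action name
```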

Attachment
Attachment
Attachment
0
yibo

Now I've decided to implement multi-movement stances for my project. Originally I tried to use an adapted version of my original MediaPipe code to collect data for multiple frames of the stance, however it had trouble knowing when to stop, and the format the data was collected in wouldn't work properly if I tried to train on it. I researched methods of collecting and training on movement data and decided to use an LSTM (long short-term memory) model. Additionally, for the data-collection script, I decided to make it record for exactly 30 frames; this way I didn't have to worry about asking someone to start or stop the recording, and I didn't have to code an auto-stop feature. For each frame, the script uses MediaPipe to record 33 landmarks, each with 4 values, meaning 132 values per frame saved in a .npy file (in a folder for each individual punch); this way it is a lot easier for the model to load than a CSV.
The landmarks are all normalised, so it works better regardless of your height or distance from the camera. Currently I am working on finishing the script for training the model and getting around 100 samples of punches to train the model on.
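A sketch of the 30-frame recorder described above. The MP_Data folder layout, the ACTION name and the sample numbering are assumptions; MediaPipe already gives x and y normalised to the image, which is what makes the data less sensitive to distance from the camera.

```python
import os
import cv2
import numpy as np
import mediapipe as mp

ACTION = "punch"                 # the move being collected (placeholder)
SAMPLE = 0                       # which attempt this is (placeholder)
out_dir = os.path.join("MP_Data", ACTION, str(SAMPLE))
os.makedirs(out_dir, exist_ok=True)

mp_pose = mp.solutions.pose
pose = mp_pose.Pose()
cap = cv2.VideoCapture(0)

for frame_num in range(30):                       # fixed-length clip: no stop button needed
    _, frame = cap.read()
    results = pose.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    if results.pose_landmarks:
        keypoints = np.array([[lm.x, lm.y, lm.z, lm.visibility]
                              for lm in results.pose_landmarks.landmark]).flatten()
    else:
        keypoints = np.zeros(33 * 4)              # keep the clip length consistent
    np.save(os.path.join(out_dir, f"{frame_num}.npy"), keypoints)   # 132 values per frame
    cv2.imshow("collecting", frame)
    cv2.waitKey(1)

cap.release()
cv2.destroyAllWindows()
```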

Attachment
Attachment
0
yibo

ADDED:
- multi-stance capability
- collected data for the new stance and trained on it
- spoken-aloud feedback
- multi-stance capability in the AI sensei
I added multi-stance capability to the AI model trainer, the data collector and the final 'sensei'.
It works by adding a column to the datasheet that records which stance each row belongs to. The AI model training works similarly to before, using a random forest classifier; it splits the data into 80% training and 20% testing. For each stance it also calculates the median values (the ideal position) and the standard deviation (the amount of acceptable variation).
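As a rough sketch, the per-stance ideals could be built from the datasheet like this. The file name stance_data.csv and the column names "stance" and "label" are assumptions; stance_ideals.pkl is the file the decider loads later.

```python
import pickle
import pandas as pd

df = pd.read_csv("stance_data.csv")          # placeholder file name
feature_cols = [c for c in df.columns if c not in ("stance", "label")]

stance_ideals = {}
for stance, group in df[df["label"] == "perfect"].groupby("stance"):
    stance_ideals[stance] = {
        "ideal": group[feature_cols].median(),     # target value for each feature
        "tolerance": group[feature_cols].std(),    # acceptable variation around it
    }

with open("stance_ideals.pkl", "wb") as f:
    pickle.dump(stance_ideals, f)
```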
For the decider, it starts by loading the trained model, karate_model.pkl, then tries to load stance_encoder.pkl and stance_ideals.pkl. If it finds them it runs in multi-stance mode; if it doesn't, it falls back to single stance. For each camera frame it calculates the features and turns them into the same variables as when the data was collected, then sends them to the AI, which applies stance-specific rules: if stance_encoded is 1 the stance is treated as kiba dachi, 0 is gedan barai (it has the capability for more, but I've only trained it on two different stances at the moment).
If the stance is judged to be incorrect, it generates feedback. It does this by first weighting the importance of the different variables (for example, knee issues get a 1.5 weight as they are deemed more important, while head issues get a 0.7 weight as they are less important), then sorting them by weighted deviation. The largest weighted deviation generates specific feedback, which is spoken aloud to the user.
For the feedback system, I've tried to make it understand what each variable means, so that it can give direct feedback based on the largest deviation without me typing out the common issues for it to select from. This has had mixed results, and I am going to try to implement it more effectively next time.
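A sketch of the weighted-deviation picker is below. The weights shown, the helper name pick_feedback, the toy feature values and the use of pyttsx3 for the spoken feedback are all assumptions, not the actual trained values.

```python
import pyttsx3

WEIGHTS = {"right_knee_angle": 1.5, "left_knee_angle": 1.5, "head_tilt": 0.7}

def pick_feedback(current, ideal, tolerance):
    """current/ideal/tolerance map feature name -> value for the detected stance."""
    scored = []
    for name, value in current.items():
        deviation = abs(value - ideal[name]) / max(tolerance[name], 1e-6)
        scored.append((deviation * WEIGHTS.get(name, 1.0), name, value, ideal[name]))
    scored.sort(reverse=True)                      # biggest weighted deviation first
    _, name, value, target = scored[0]
    direction = "more" if value < target else "less"
    return f"Adjust your {name.replace('_', ' ')}: a bit {direction}."

# toy values just to show the call
current_features = {"right_knee_angle": 150.0, "head_tilt": 5.0}
stance_ideal     = {"right_knee_angle": 120.0, "head_tilt": 0.0}
stance_tolerance = {"right_knee_angle": 10.0,  "head_tilt": 8.0}

engine = pyttsx3.init()
engine.say(pick_feedback(current_features, stance_ideal, stance_tolerance))
engine.runAndWait()                                # spoken-aloud feedback
```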

Attachment
Attachment
Attachment
Attachment
Attachment
0
yibo

After the AI model is trained on what makes a good form and what makes an imperfect form, I wrote a 'decider' script. It first calculates the person's joint angles (the same variables the model was trained on), then "model.predict(current_data)" triggers the custom-trained model, karate_model.pkl, which contains hundreds of "decision trees" that each check whether a stance is perfect or imperfect. If over 85% of the trees agree that the stance is perfect, it is judged to be good.
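A minimal sketch of that check, treating the predicted probability of the "perfect" class as the share of agreeing trees; the label name "perfect" and the helper name judge_stance are assumptions.

```python
import pickle
import numpy as np

with open("karate_model.pkl", "rb") as f:
    model = pickle.load(f)

def judge_stance(current_data, threshold=0.85):
    """current_data: 1D array of the same angle/ratio features used in training."""
    proba = model.predict_proba(np.array(current_data).reshape(1, -1))[0]
    perfect_idx = list(model.classes_).index("perfect")   # label name is an assumption
    return proba[perfect_idx] >= threshold                # True -> stance judged good
```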

However, this isn't enough: for a person to properly learn the stance, they need feedback, just like a real sensei would give.

Once the program thinks that your stance is 'incorrect', it enters the coaching stage. It runs your data through a priority-based system (the order was decided by the AI, based on what it found to be the most important variables in a good technique). It shows the first mistake it finds at the top; once the most important condition is satisfied, it moves on to the next most important, until it deems your stance to be perfect.
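A small sketch of what a priority-ordered check like this can look like; the feature names, ranges and advice strings here are purely illustrative, not the values the model actually learned.

```python
# Checks are listed most-important first; the first one outside its range wins.
PRIORITY_CHECKS = [
    ("right_knee_angle",   110, 130, "Bend your front knee more."),
    ("stance_width_ratio", 1.4, 1.8, "Widen your stance."),
    ("back_leg_angle",     160, 180, "Straighten your back leg."),
]

def first_mistake(features):
    for name, lo, hi, advice in PRIORITY_CHECKS:
        if not (lo <= features[name] <= hi):
            return advice
    return "Perfect stance, hold it!"
```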

Next I am planning to work on the ability to do this for multiple stances, as at the moment the feedback will only have the correct order of importance for one stance. I am also looking to implement spoken feedback from the sensei so you don't have to look at the screen every time.

Attachment
Attachment
Attachment
0
yibo

Using scikit-learn, I trained a random forest model on the dataset I made previously. By giving it data of 'perfect' and 'imperfect' technique, the AI learns what good technique is. From the 127 rows of data I fed it, the AI focuses mostly on the right knee, treating it as the most important value.
Currently, with my very limited and possibly incorrect dataset, the AI has achieved 96.15% accuracy; however, this is only for testing purposes. When my project advances, I will provide the program with more accurate data.
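The training step is roughly this shape: fit a random forest on the labelled rows, check test accuracy, and look at which features it leans on. The CSV name and column names are assumptions.

```python
import pickle
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

df = pd.read_csv("stance_data.csv")                      # placeholder file name
X = df.drop(columns=["label"])                           # angles, ratios, etc.
y = df["label"]                                          # "perfect" / "imperfect"

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
model = RandomForestClassifier(n_estimators=200)         # "hundreds of decision trees"
model.fit(X_train, y_train)

print("accuracy:", model.score(X_test, y_test))
for name, importance in sorted(zip(X.columns, model.feature_importances_),
                               key=lambda t: -t[1]):
    print(f"{name}: {importance:.3f}")                   # e.g. the right-knee feature on top

with open("karate_model.pkl", "wb") as f:                # loaded later by the decider
    pickle.dump(model, f)
```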

I am also working on a function that will allow me to scrape images of good technique from the web, allowing me to train the model on data without having to collect it manually.

After it is trained, the model is saved as a .pkl file in the project folder, ready to be accessed by the executor program that tells you whether your technique is good or not.

Attachment
0
yibo

Currently, I am working on getting data to train my AI model on. Staying with the basic stance of Gedan Barai, I asked my brother (who is good) and myself (who is not as good) to perform perfect and imperfect versions. Using MediaPipe, the program finds landmark points on the body in 3D space (ish).

The script converts raw coordinates into relative features. Using trigonometry to calculate the joint angles, the stance ratio (from the distance between your feet), the centre of gravity and more variables, I am able to form a dataset.

Then, with manual labelling (by pressing 'P' for perfect or 'I' for imperfect), it writes a row to the dataset.
Now I will be working on using the dataset to train the AI to detect the quality of a stance. :)
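A sketch of the angle maths and the keypress labelling step; the exact feature list, CSV layout and helper names are assumptions.

```python
import csv
import numpy as np

def joint_angle(a, b, c):
    """Angle at point b (in degrees) formed by points a-b-c, each an (x, y, z) array."""
    ba, bc = np.array(a) - np.array(b), np.array(c) - np.array(b)
    cosine = np.dot(ba, bc) / (np.linalg.norm(ba) * np.linalg.norm(bc))
    return np.degrees(np.arccos(np.clip(cosine, -1.0, 1.0)))

def write_row(features, label, path="stance_data.csv"):
    """Append one labelled sample; called when 'P' (perfect) or 'I' (imperfect) is pressed."""
    with open(path, "a", newline="") as f:
        csv.writer(f).writerow(list(features) + [label])

# e.g. a knee angle from the MediaPipe left hip (23), left knee (25) and left ankle (27):
# knee = joint_angle(hip_xyz, knee_xyz, ankle_xyz)
```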

Attachment
Attachment
2

Comments

kareemmm618
kareemmm618 about 1 month ago

cool

SuperNinjaCat5
SuperNinjaCat5 about 1 month ago

nice

yibo

Using MediaPipe, I made a program that finds the positions of body joints and joins them up to create a wireframe of the limbs. From this, I calculated the angles of each arm and leg; this way I was able to detect the position of the body and correct the stance. I decided to first code a still position, the Gedan Barai (downward block). I chose this because it is a relatively simple stance and its leg and arm positions can be detected mostly through arm and leg angles. I improved on my original detection by using 3D vectors to try to more accurately find the positioning of the limbs (results varied). Finally, I added a simple system in which the colour of the displayed joints turns from red to green depending on how correct the positioning is, plus a timing system that lets you see how long you hold your correct stance.
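A small sketch of the red-to-green colouring and the hold timer described above; the tolerance and the helper names are illustrative assumptions.

```python
import time

def joint_colour(angle, target, tolerance=15):
    """BGR colour that fades from green (within tolerance) to red (far off)."""
    error = min(abs(angle - target) / tolerance, 1.0)
    return (0, int(255 * (1 - error)), int(255 * error))   # (B, G, R) for OpenCV drawing

hold_start = None

def update_hold_timer(stance_correct):
    """Return how long (in seconds) the correct stance has been held so far."""
    global hold_start
    if stance_correct:
        if hold_start is None:
            hold_start = time.time()
        return time.time() - hold_start
    hold_start = None
    return 0.0
```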

Tomorrow I hope to add more stances and work on the accuracy of the data, as well as adding new variables such as foot positioning and hip position. :)
I spent a lot of time researching different ways to make this project work; I will try using YOLO for positioning too.

Attachment
1

Comments

yefoi
yefoi about 1 month ago

holy cow that is awesome sauce