Activity

darren

Shipped this project!

Hours: 16.14
Cookies: 🍪 428
Multiplier: 26.53 cookies/hr

I built an ML model that detects your hand and recognizes which gesture you are making, including “Pinch”, “Cursor”, “Scroll Up”, and “Scroll Down”. The goal of this project is to help individuals who are disabled or injured interact with their computers without being restricted to a keyboard and mouse. It was a huge challenge to make the backend and frontend fast enough that people could actually interact with a demo of my model at a reasonable framerate (~10fps). I did this by moving the MediaPipe logic to run locally in the JS, so the Python backend only sent light packets with position and gesture data rather than full, heavy images. In the end I migrated all the Python code to JS so that it could all run locally without having to communicate between frontend and backend. I am proud of this project because it gave me hands-on experience with ML models and big libraries like OpenCV and MediaPipe, and with building a fast backend -> frontend pipeline with Flask and Socket.IO. It was overall very fun to work on!
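The "light packets" idea above can be sketched as a tiny helper that flattens the 21 hand landmarks plus a gesture label into a small dict, instead of shipping a whole image every frame. The function name and payload shape here are hypothetical, not taken from the project:

```python
# Sketch of the "light packet" optimization: send 63 floats and a label
# per frame instead of a full encoded image. build_packet and the payload
# layout are illustrative assumptions.

def build_packet(landmarks, gesture):
    """Flatten 21 (x, y, z) hand landmarks plus a gesture label into a small dict."""
    return {
        "gesture": gesture,
        "points": [round(c, 4) for point in landmarks for c in point],
    }

# 21 dummy landmarks -> 63 floats: a few hundred bytes vs. a whole JPEG frame
packet = build_packet([(0.1, 0.2, 0.3)] * 21, "Pinch")
print(len(packet["points"]))  # 63
```

A dict like this serializes to a very small JSON payload, which is why emitting it over Socket.IO is so much cheaper than streaming frames.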

darren

I had to migrate the model to JS code using the m2cgen Python library because I wanted to move my website from Render to Vercel, as detailed in the ship requirements. This in turn made the model much faster because it no longer requires communication between Python and JS (backend -> frontend). I really enjoyed working on this project, and I learned a lot about ML, about how the backend and frontend work with each other, and about the challenges that come with that.

darren

Wow, it has been a while since the last log. Lots of work and debugging. The biggest thing I worked on was building a demo website and designing a way for others to easily experience my tool without having to download a desktop application. But it's a lot, so I will break it down into a list:

  1. I had to create a full HTML, CSS, and JS structure to get a proper frontend for the website demo, and I had to improvise from the original plan because pyautogui doesn’t work at the browser level due to security restrictions. So I designed a web app that guides any new user through the tool and teaches all of the gestures.

  2. I ran into heavy optimization problems while trying to host on the cloud: sending images from Python to JS 30 times a second was incredibly heavy and demanding, resulting in an unplayable experience. To solve this I migrated the MediaPipe hand display from Python to JS so that the Python backend did not have to send the image over (this required crazy debugging).

  3. I realized that with this latency my gestures were not being detected as well as I had hoped, so I turned to a deterministic approach. That's just a fancy way of saying: if a specific piece of data that is very different for one gesture than for all the rest crosses a given threshold, skip the prediction and hardcode the result for that frame. This improved the accuracy of the “Pinch” gesture from ~50% at weird angles to nearly 100% most of the time.
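The deterministic shortcut in step 3 could be sketched like this: when the thumb-to-index distance is clearly below a pinch threshold, bypass the classifier entirely. The function names and the threshold value are illustrative assumptions, not the project's actual code:

```python
import math

# Sketch of the deterministic override: a highly distinctive measurement
# (thumb-to-index distance) short-circuits the model for the "Pinch" gesture.
PINCH_THRESHOLD = 0.05  # normalized landmark units; tuned by hand in practice

def classify(thumb_tip, index_tip, model_predict):
    dist = math.dist(thumb_tip, index_tip)
    if dist < PINCH_THRESHOLD:
        return "Pinch"          # hardcode the result for this frame
    return model_predict()      # otherwise fall back to the trained model

print(classify((0.50, 0.50), (0.51, 0.50), lambda: "Cursor"))  # Pinch
print(classify((0.20, 0.20), (0.60, 0.60), lambda: "Cursor"))  # Cursor
```

Because the shortcut only fires when the evidence is unambiguous, it improves the worst-case accuracy without hurting the other gestures.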

This project has been a blast and most likely this will be the last log unless I find any pressing issues or see any areas for major improvement

darren

I finished all the basic gestures for my app for now and moved on to optimizing the processes, when I came across a bug that took me a very long time to fix. The problem was that my program was detecting the correct gesture but would not go through with the action. After about a whole hour of debugging I finally realized that trailing whitespace was preventing the conditions that activate the actions from being triggered 😑. But good news on the optimization front: I lowered the resolution of the model so it doesn't have to render as much, I moved the camera reading to another thread to unblock the main thread, and I no longer check the gesture every frame but every other frame, which greatly improves the speed of the program. Overall great debugging and optimization progress, but the model still struggles with some edge cases. By the next log I plan to focus on creating some kind of application so others can easily use the code.
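The two speedups above (camera reads on a worker thread, prediction on every other frame) can be sketched together. The camera here is stubbed with a list of fake frames so the snippet is self-contained; real code would pull frames from `cv2.VideoCapture` instead:

```python
import queue
import threading

# Sketch: a background thread feeds frames into a queue so the main loop
# never blocks on camera I/O, and classification only runs on every other
# frame. The fake frame source and counters are illustrative.
frames = queue.Queue()

def capture(source):
    for frame in source:   # real code: loop on cap.read() from cv2.VideoCapture
        frames.put(frame)
    frames.put(None)       # sentinel: stream ended

threading.Thread(target=capture, args=(range(10),), daemon=True).start()

predictions = 0
i = 0
while (frame := frames.get()) is not None:
    if i % 2 == 0:          # only classify every other frame
        predictions += 1    # real code: model.predict(features(frame))
    i += 1

print(predictions)  # 5 of the 10 frames classified
```

Halving the prediction rate roughly halves the per-frame CPU cost while the cursor still feels responsive, since hand position changes little between consecutive frames.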

darren

Great news! I improved the accuracy of the training model to 98.06%, up from 96%, and it works much better now because I added an extra parameter that calculates the distance between the thumb and the pointer finger, which is very distinct for each of my gestures. I also fixed the mirroring issue, so the cursor now moves in the correct direction and the tool is actually functional: I can navigate and click around fine. I still need to implement the rest of my gestures besides pinch and cursor, but all the infrastructure is there to do so. I hope to push the project even further in the next log.
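The extra parameter described above amounts to appending one derived feature to the flattened landmark vector. The landmark indices follow MediaPipe's hand model (4 = thumb tip, 8 = index fingertip); the helper itself is a hypothetical sketch:

```python
import math

# Sketch of the distance feature: turn 21 (x, y, z) landmarks into the usual
# 63-value row, then append the thumb-tip to index-tip distance as a 64th
# feature that strongly separates pinch-like gestures.
THUMB_TIP, INDEX_TIP = 4, 8  # MediaPipe hand landmark indices

def add_distance_feature(landmarks):
    """landmarks: list of 21 (x, y, z) tuples -> 64-value feature row."""
    row = [c for point in landmarks for c in point]                   # 63 base features
    row.append(math.dist(landmarks[THUMB_TIP], landmarks[INDEX_TIP])) # 64th feature
    return row

row = add_distance_feature([(i * 0.01, 0.0, 0.0) for i in range(21)])
print(len(row))  # 64
```

Derived features like this often help tree models because a single split on the distance can separate gestures that would otherwise need many splits on raw coordinates.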


Comments

FIR3 13 days ago

I LOVE YO PROGRESS

darren

I worked on collecting data for the model by recording consecutive frames of a given gesture and labeling them correctly (e.g. (“Thumbs Up”, [data])). Then I passed almost a thousand rows of this data (each with 63 columns) into a RandomForest model, and after some trial and error I came out with ~96% accuracy. However, I found that my model was mixing up the “Pinch” and “Point” gestures, so I need to fix that by the next log. Finally for this log, I can actually control my mouse now through the pyautogui library, but I still need to fix the fact that the cursor moves in the opposite direction than is intuitive.
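The training step above can be sketched with scikit-learn. Synthetic Gaussian rows stand in here for the ~1000 recorded rows of 63 landmark coordinates, so the class names are the only thing borrowed from the log:

```python
import random

from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Sketch of the RandomForest training step: labeled 63-column rows in,
# gesture classifier out. Data here is synthetic, not real landmark data.
random.seed(0)
X, y = [], []
for label, center in [("Pinch", 0.2), ("Point", 0.8)]:
    for _ in range(200):
        X.append([random.gauss(center, 0.1) for _ in range(63)])
        y.append(label)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = RandomForestClassifier(n_estimators=50, random_state=0).fit(X_train, y_train)
print(round(model.score(X_test, y_test), 2))
```

With real data, confusion between similar gestures like “Pinch” and “Point” usually shows up in a confusion matrix rather than the overall accuracy, which is why the headline ~96% can hide a weak class.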


Comments

Da_BFBM_show 14 days ago

This is so cool! Can’t wait to see this in full action!

FIR3 13 days ago

SO COOL OMG!

darren

Using MediaPipe I was able to relatively quickly get the joints of my hand to show up in realtime without having to train my own model from scratch, which was very rewarding. In the meantime I laid the infrastructure for my project and got the program ready for the next step: collecting position data for each joint given a label (e.g. “Peace Sign”) and then training my own model on that data (hopefully by the next log).


Comments

FIR3 13 days ago

Yayyy :yurr: