A tool that helps me do other stuff while I'm in online lessons
(takes a screenshot, gets the answer from hcai, then uses a local model to clone my voice and say the answer in my voice in a Zoom meeting).
- F5 voice cloning debugging, voice model setup.
Made an executable using PyInstaller. The project only works on macOS, as it uses macOS-only utilities like screencapture and AppleScript! I hope the README is descriptive enough and makes it easy to run. Fixed bugs with the release, and it now works! A venv with f5_tts_mlx installed is needed; the README explains all of this in more detail.
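The requirements above (macOS, the screencapture utility, and f5_tts_mlx in the venv) could be checked at startup with something like the sketch below. This is not taken from the project; `find_problems` and `check_environment` are hypothetical helper names, and the messages are only illustrative.

```python
import importlib.util
import platform
import shutil


def find_problems(system: str, has_screencapture: bool, has_f5: bool) -> list[str]:
    """Return human-readable reasons why the tool cannot run (empty if fine)."""
    problems = []
    if system != "Darwin":  # platform.system() reports "Darwin" on macOS
        problems.append("macOS is required (screencapture and AppleScript are used)")
    if not has_screencapture:
        problems.append("the `screencapture` command was not found on PATH")
    if not has_f5:
        problems.append("f5_tts_mlx is not installed in the active venv")
    return problems


def check_environment() -> list[str]:
    """Probe the current machine and report any missing requirement."""
    return find_problems(
        system=platform.system(),
        has_screencapture=shutil.which("screencapture") is not None,
        has_f5=importlib.util.find_spec("f5_tts_mlx") is not None,
    )


if __name__ == "__main__":
    for problem in check_environment():
        print("cannot run:", problem)
```

Keeping the probing separate from the decision logic means the checks can be exercised on any machine, not just a Mac.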
Made a simple TUI using rich. I did have to change the two files a tiny bit to make imports from the CLI work properly. I don't plan on making this project much more advanced, and will probably ship after packaging this. Also, rich is fire.
So, basically, this is a tool that would make life very easy: I take a screenshot of a question, the question is passed on to hcapi, which gets the answer, and then the tool clones my voice and says the answer in my voice.
So far, I have the basic loop ready: I take a screenshot (native macOS screenshot), the image gets passed on to the hc api Gemini model, which returns the answer, and then the voice model says the answer. The model is a bit finicky, but it works.
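A rough sketch of that loop, assuming the screenshot comes from the built-in macOS `screencapture` command. The API call and the voice step are left as plain callables (`ask` and `speak`, both hypothetical names), since the actual request format and the voice model invocation are not shown here.

```python
import subprocess
from pathlib import Path
from typing import Callable


def build_screencapture_cmd(out_path: Path, interactive: bool = True) -> list[str]:
    """Build the argv for macOS's built-in `screencapture` tool."""
    cmd = ["screencapture", "-x"]  # -x: do not play the shutter sound
    if interactive:
        cmd.append("-i")  # -i: let the user drag-select a region
    cmd.append(str(out_path))
    return cmd


def capture_screenshot(out_path: Path) -> Path:
    """Run screencapture (macOS only) and return the saved image path."""
    subprocess.run(build_screencapture_cmd(out_path), check=True)
    if not out_path.exists():
        raise RuntimeError("no screenshot was saved (selection cancelled?)")
    return out_path


def answer_once(
    capture: Callable[[Path], Path],
    ask: Callable[[Path], str],
    speak: Callable[[str], None],
    workdir: Path,
) -> str:
    """One pass of the loop: screenshot -> answer -> spoken answer."""
    image = capture(workdir / "question.png")
    answer = ask(image)  # placeholder for the image-to-answer API call
    speak(answer)        # placeholder for the cloned-voice playback
    return answer
```

On a Mac, `answer_once(capture_screenshot, ask, speak, Path.cwd())` would run one round once real `ask` and `speak` functions are plugged in. Passing the steps in as functions also makes it easy to swap the finicky part out while debugging.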