Sonex

2 devlogs
3h 27m 28s

Sonex is an ML audio analysis library that essentially combines Essentia (https://github.com/MTG/essentia), Whisper (OpenAI), Demucs (Facebook Research), and NLLB (Meta AI). Demucs isn’t actively maintained and was created when Meta AI was still called Facebook Research. Sonex gives both high-level data (e.g. note names, chord progressions, and syllable-level lyric sync, even across languages through NLLB) and low-level data (pitch frequencies).
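
To illustrate how low-level and high-level data relate, here is a minimal sketch of turning a pitch frequency into a note name. This is not Sonex's API: the function name `freq_to_note` and the choice of 12-tone equal temperament with A4 = 440 Hz are assumptions made for the example.

```python
import math

# Pitch classes in semitone order, starting from C (sharps only, for brevity).
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def freq_to_note(freq_hz: float, a4_hz: float = 440.0) -> str:
    """Map a frequency in Hz to the nearest equal-tempered note name.

    Hypothetical helper for illustration; assumes A4 = 440 Hz by default.
    """
    if freq_hz <= 0:
        raise ValueError("frequency must be positive")
    # MIDI note 69 is A4, and each octave spans 12 semitones.
    midi = round(69 + 12 * math.log2(freq_hz / a4_hz))
    octave = midi // 12 - 1  # MIDI note 60 is C4 by convention
    return f"{NOTE_NAMES[midi % 12]}{octave}"


if __name__ == "__main__":
    for f in (261.63, 440.0, 880.0):
        print(f, "->", freq_to_note(f))
```

A real pipeline would also need to smooth the pitch track and handle unvoiced frames before naming notes; this sketch only covers the frequency-to-name step.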

levicafe08

Added:

  • A research script to test the accuracy of whisper, WhisperX, and faster-whisper (I will publish a graph with 2 SE error bars). I’m hoping to add more backends; CTranslate2 shouldn’t introduce any differences between the weights of the Whisper variants, but apparently it has. The script also varies the Whisper model size (i.e. large-v1, large-v2, large-v3, medium, small), the beam size (how many concurrent “beams”, or candidate paths, it can explore for what a word means; e.g. “hallo” (hello with an a) could be “hello” or “halo”, and it tries to find the best word based on context), and some other, more minor Whisper settings, and then tests each of those configs against Argos Translate and NLLB for translation quality. (This part still doesn’t work.)
  • A GUI for the generation portion of the script. I then tried to debug loading, but it’s impossible to pipe progress out of whisper; I can use a callback, but it doesn’t deliver progress.
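
The experiment described in the first item can be sketched roughly as below. This is only an outline under stated assumptions, not the actual research script: the names `build_grid` and `mean_with_2se` are hypothetical, the beam sizes are placeholder values, and the scores passed in would come from whatever accuracy metric the script ends up using.

```python
import itertools
import math
import statistics

# Search space mirroring the options named in the devlog.
BACKENDS = ["whisper", "whisperx", "faster-whisper"]
MODEL_SIZES = ["large-v1", "large-v2", "large-v3", "medium", "small"]
BEAM_SIZES = [1, 5]  # assumed values, chosen only for illustration


def build_grid():
    """Return one config dict per (backend, model size, beam size) combination."""
    return [
        {"backend": b, "model_size": m, "beam_size": k}
        for b, m, k in itertools.product(BACKENDS, MODEL_SIZES, BEAM_SIZES)
    ]


def mean_with_2se(scores):
    """Return (mean, 2 * standard error) for a list of per-sample scores."""
    n = len(scores)
    if n < 2:
        raise ValueError("need at least two scores to estimate a standard error")
    mean = statistics.fmean(scores)
    # Standard error of the mean: sample standard deviation / sqrt(n).
    se = statistics.stdev(scores) / math.sqrt(n)
    return mean, 2 * se


if __name__ == "__main__":
    grid = build_grid()
    print(len(grid), "configs to evaluate")
    # Placeholder scores; a real run would score each config on many clips.
    mean, err = mean_with_2se([0.8, 0.9, 1.0])
    print(f"accuracy = {mean:.3f} +/- {err:.3f} (2 SE)")
```

Each config's `(mean, 2 SE)` pair maps directly onto a bar with an error bar in the planned graph.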