HOMES-Engine banner

HOMES-Engine

20 devlogs
40h 25m 36s

HOMES-Engine is an automated, mobile-first video production ecosystem designed to generate high-quality, branded content directly from an Android device via Termux. It challenges traditional desktop workflows by orchestrating AI scripting, voice synthesis, and complex FFmpeg rendering entirely on restricted ARM64 hardware.

Key Features:

  • Absolute Cinema Pipeline: Advanced FFmpeg engine featuring dynamic color grading, Ken Burns effects, and studio-grade audio mastering.
  • Creator Branding Kits: Modular JSON profiles that dynamically inject your brand colors, logos, and style guidelines straight into the AI’s prompts.
  • AI Core & VideoLM: Powered by Gemini 3.1 for neural TTS and agentic reasoning, newly bridged with NotebookLM (VideoLM) for generating long-form Life OS content.
  • Autonomous Worker (Option 99): Runs in a continuous background loop, pulling tasks and rendering content without any human intervention.
This project uses AI

I used Gemini CLI and Perplexity as my Senior Mentors to survive the chaos of mobile edge computing. I write the core logic, and when things break, I ask the AI why.

Specifically:
When concatenating video clips crashed the pipeline on ARM64 due to Sample Aspect Ratio mismatches, I spent hours debugging. I used Gemini CLI to parse the complex error stacks and find the exact filter_complex syntax (setsar=1 and format=yuv420p) to stabilize the factory.
I used AI to reason through the FFmpeg chains for EBU R128 audio mastering. It also helped me design a math-based heuristic algorithm to generate synchronized .vtt and .ass subtitles, bypassing the fact that pure .wav TTS APIs don’t provide timestamps.
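
For a sense of how that math-based heuristic can work, here is a minimal sketch: since the TTS returns raw WAV with no timestamps, cue times are estimated by spreading the clip's total audio duration across words in proportion to their length. The function names and the words-per-cue choice are illustrative, not the engine's actual code; the real duration would come from something like ffprobe.

```python
def fmt(t: float) -> str:
    """Format seconds as a WebVTT timestamp (HH:MM:SS.mmm)."""
    h, rem = divmod(t, 3600)
    m, s = divmod(rem, 60)
    return f"{int(h):02d}:{int(m):02d}:{s:06.3f}"

def heuristic_vtt(text: str, audio_duration: float, words_per_cue: int = 6) -> str:
    """Assign cue times proportionally to word length, since the WAV has no timestamps."""
    words = text.split()
    per_unit = audio_duration / sum(len(w) + 1 for w in words)  # +1 approximates the pause after a word
    cues, t = ["WEBVTT", ""], 0.0
    for i in range(0, len(words), words_per_cue):
        chunk = words[i:i + words_per_cue]
        dur = sum(len(w) + 1 for w in chunk) * per_unit
        cues += [f"{fmt(t)} --> {fmt(t + dur)}", " ".join(chunk), ""]
        t += dur
    return "\n".join(cues)

# Example: a 4.2 s narration split into two cues
print(heuristic_vtt("FFmpeg on ARM64 has its own quirks nobody documents", 4.2))
```
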
AI helped me grab the right syntax to integrate Android notifications and haptic feedback via the Termux API, so the phone physically vibrates when a video is done.
I also used AI to translate some things.
That’s how I’d describe it, at least.

Demo Repository


ChefThi

Almost done finalizing the URL integration... just the bridge left to polish. I exposed the full NLM video options in the CLI terminal and shipped the terminal workflow.
Crucially, I locked down the security: the engine now only consumes Hub dashboard jobs that have a signed status (HMAC validation). Debugging this API contract via Termux on the commute was a headache, but the integration is now secure and seamless. We are locked, loaded, and ready to render. Keep building.
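
As an illustration of that signed-status contract, a minimal sketch of the validation side; the field names (payload, signature) and the shared-secret handling are assumptions, not the actual Hub schema:

```python
import hashlib
import hmac
import json

SHARED_SECRET = b"provisioned-out-of-band"  # assumption: shared with the Hub ahead of time

def is_job_trusted(job: dict) -> bool:
    """Accept a dashboard job only if its HMAC-SHA256 signature matches the payload."""
    body = json.dumps(job["payload"], sort_keys=True).encode()
    expected = hmac.new(SHARED_SECRET, body, hashlib.sha256).hexdigest()
    # compare_digest keeps the comparison constant-time
    return hmac.compare_digest(expected, job.get("signature", ""))
```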

ChefThi
  • feat(engine): add VideoLM render bridge — videolm_client.py + video_maker delegation + DEMO.md (01e10a7)
  • feat(engine): add HOMES Hub integration — hub_client, queue_worker, modules (study/daily/finance) (f469180)

The Hub Connection
The engine is no longer isolated. I spent this hour breaking the engine out of its silo. I engineered the HOMES Hub integration (hub_client, queue_worker) so the mobile worker can finally sync with the central brain’s queue.

To take it further, I built the VideoLM render bridge (videolm_client.py). Delegating heavy video generation while orchestrating micro-services from a mobile terminal and dodging API rate limits is absolute chaos.
But building this Hardened connection is what gives the project its Essence.

Attachment
Attachment
ChefThi
  • feat(ai): professional script engine v3.0 — support for EN-US long-form content (bd05245)
  • feat(branding): full channel identity system v3.0 (4a5b536)

These were 10 hours of brute-force edge engineering. This was the turning point where HOMES-Engine evolved from a script into a full production house. I implemented the professional script engine v3.0, adding robust support for EN-US long-form content (Life OS modules like trend_intel and skill_tree).

But the real magic was the full channel identity system v3.0. The AI now fully ingests the brand profiles before it writes a single word. I fought severe memory limits on the Android side, but using Gemini CLI as Mentor, we cleared the bottlenecks. The factory is officially next-level.

But man, the 429 errors (which universally mean API quota limits have been reached) keep breaking the workflow… working away and not being able to validate the pipeline because you’ve used up all your quotas is sad and frustrating!
I found this image while searching, and it’s quite interesting and explains things.

Attachment
Attachment
ChefThi

Almost all prepared. Man, I’d say it was pure structural preparation. I was organizing the directories and cleaning up the environment for the massive architectural overhaul that was coming. Setting up the boilerplate while fighting unstable mobile data isn’t glamorous, but laying down a Hardened foundation is what allows the engine to scale.

Attachment
ChefThi

Pipeline Whisperer
Just a short session, but it was all about validation. The “Video from pipeline” log was me testing the output directly on my A05s while bouncing on the bus. We (my class and a few others) are in the middle of exams, so I needed to ensure that the FFmpeg concatenations and the recent setsar fixes were holding up without dropping frames. When you’re running complex workflows on an ARM64 processor, you have to verify every output. Small, silent tweaks to keep the factory Resilient.
Because for me, it has to be.

ChefThi

This session was about keeping the pipeline healthy; critical, I’d say. The factory was showing some cracks in the FFmpeg assembly logic. When you’re running complex filter_complex chains on a mobile ARM64 processor, things can get messy if the code isn’t 100% tight.

I spent approximately 30 minutes debugging the engine core with the CLI and fixed a broken loop in the render sequence that was causing frame drops. In this business, if the pipeline stops, the factory dies. Everything is back to 100% stability now. Ready for my tests.

Another thing I found cool and am discovering more about is the formatting in .md files (which you can do in Flavortown posts). I saw that if you put two pairs of underscores (__) between words, they become underlined. If you use only one pair, they become italicized.

Attachment
ChefThi
  • fix(core): resolve pathing errors and UI color value mismatch (caeb79a)

Spent the last few hours moving the engine from a generic generator to something more personal. The main focus was the new Creator Branding Kit.

The Branding System

I implemented a modular branding folder structure. Now you can drop your own logos, define specific brand colors in a JSON file, and set a custom “style prompt” for the AI.

  • The engine reads these configs and injects the style directly into Gemini before the script is even written (a minimal sketch follows this list).
  • Updated the main CLI with a profile selector. Now I can switch between different creator identities right at launch.
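
A minimal sketch of what that injection can look like, assuming a branding/<profile>/brand_colors.json layout with "colors" and "style_prompt" keys; the real Branding Kit's schema may differ:

```python
import json
from pathlib import Path

def load_brand(profile: str, root: Path = Path("branding")) -> dict:
    """Read the profile's JSON config (colors, style prompt, logo paths)."""
    return json.loads((root / profile / "brand_colors.json").read_text())

def branded_prompt(topic: str, brand: dict) -> str:
    """Prepend the brand's voice so the AI adopts it before writing a word."""
    return (
        f"Style guidelines: {brand['style_prompt']}\n"
        f"Brand palette: {', '.join(brand['colors'].values())}\n"
        f"Write a video script about: {topic}"
    )
```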

Debugging & Cleanup

It wasn’t all smooth sailing. I ran into a few annoying bugs:

  • Had a ValueError in the main menu because I messed up the ANSI color unpacking (forgot a variable for RED, so the whole UI crashed).
  • Encountered a ModuleNotFoundError when running the AI writer standalone. Fixed it by forcing the project root into sys.path.
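
For reference, that sys.path fix boils down to something like this (the path depth is an assumption about the repo layout):

```python
# Force the project root onto sys.path so `core.*` imports resolve
# when a module is run standalone instead of via the main entry point.
import sys
from pathlib import Path

sys.path.insert(0, str(Path(__file__).resolve().parent.parent))
```
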
Attachment
Attachment
ChefThi
  • chore: remove devlog directory and stop tracking logs (739a6a8)
  • feat(cli): add brand selector and dynamic banner to main menu (0ebef0a)

New things for the brand in the app. I prepared the directories and other things to set up the whole branding system for the videos processed with the system. Fixed a part of the pre-assets used in the assembly.

Main changes:

  • I added the brand selector to the menu so the user can pick their preferred identity for the video
  • Enhanced the menu
  • Tracked down where the error in the video assembly was coming from

I’m in a hurry so I didn’t do such a complete overview this time. I didn’t detail the formatting or important things I did, changed, etc.

Attachment
Attachment
ChefThi
  • docs: add Branding Kit update devlog (7c5aa82)

V1.8 Creator Brand Kit

Headline: Moving from generic clips to a Personal Video Studio

I was tired of the engine spitting out “default” looking videos. A real factory needs a brand. I spent an hour hardening the core to support a Creator Branding Kit.

The Tech behind this

  • Branding Injection: Created a modular folder system where I drop logos and a brand_colors.json. The AI writer now ingests these configs before it even starts thinking about the script. It’s not just generating text anymore; it’s adopting a “voice.”
  • Identity Selector: Updated the CLI so I can swap between different creator profiles on launch. One factory, multiple brands.
  • The “Hustle” Fixes: Had to hunt down a ValueError in the UI because I messed up the ANSI color codes in the menu. Also fixed a ModuleNotFoundError by forcing the project root into sys.path. Small bugs, but they break the flow when you’re coding on the move.

The engine is no longer a script; it’s a tailored production. Absolute Cinema, as I’d put it.

pweease see the fourth attachment to the post

Attachment
Attachment
Attachment
ChefThi

Today I integrated the engine deeper into the Android OS via Termux API. The focus was on user experience and system feedback.

Key Updates:

  • Haptic Feedback: The phone now vibrates upon successful render completion.
  • System Notifications: Implemented Android notifications to alert when a video is ready in the Downloads folder.
  • Audio Feedback: Added a voice confirmation (TTS) when the export process finishes.
  • Storage Fix: Hardened the file-saving logic to use reliable Termux storage paths.
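
Roughly, that feedback layer maps onto the stock Termux:API command-line tools like this; the messages and wiring are illustrative, not the engine's exact code:

```python
import subprocess

def notify_render_done(video_name: str) -> None:
    """Fire the haptic, notification, and voice feedback after an export."""
    subprocess.run(["termux-vibrate", "-d", "500"])  # 500 ms haptic pulse
    subprocess.run([
        "termux-notification",
        "--title", "HOMES-Engine",
        "--content", f"{video_name} is ready in Downloads",
    ])
    subprocess.run(["termux-tts-speak", "Render complete"])
```
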
Attachment
ChefThi

Today I focused on testing the “Voice Mode” pipeline. The goal was to ensure that a spoken idea could be transformed into a cinematic video without typing a single word.

🎙️ Voice Input Testing

I integrated the Termux API’s speech-to-text functionality with the v1.7 rendering engine. It captures audio from the mobile microphone, converts it to text via Google services, and immediately triggers the script-to-video workflow.
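
A minimal sketch of that capture step, using the stock termux-speech-to-text tool; the pipeline hand-off is a hypothetical stand-in for the real entry point:

```python
import subprocess

def capture_idea() -> str:
    """Record from the mic and return the recognized text."""
    result = subprocess.run(
        ["termux-speech-to-text"], capture_output=True, text=True
    )
    return result.stdout.strip()

idea = capture_idea()
if idea:
    print(f"Captured idea: {idea}")
    # hand off to the script-to-video workflow here
```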

🛠️ Stability and Bug Fixes

During testing, I identified and fixed two critical issues:

  • Audio Mastering Restore: Fixed a bug where the EBU R128 loudness normalization filter was missing from the FFmpeg engine after a recent cleanup.
  • Reliable Export: Refactored the file saving logic. Instead of trying to write directly to the Android root, the engine now uses Termux symbolic links (~/storage/downloads). This fixed the issue of videos not appearing in the gallery.
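
Sketched, the export fix is just writing through the Termux storage symlink rather than an absolute Android path (filenames illustrative):

```python
import shutil
from pathlib import Path

# ~/storage/downloads is the symlink created by termux-setup-storage.
DOWNLOADS = Path.home() / "storage" / "downloads"

def export(video: Path) -> Path:
    dest = DOWNLOADS / video.name
    shutil.copy2(video, dest)  # copy so the render cache keeps its original
    return dest
```
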
ChefThi

To make it feel like real Absolute Cinema, I moved from basic VTT to ASS format for that nice word-level highlighting (karaoke style). I also added dynamic color grading with contrast, saturation and vignette, plus proper audio mastering using EBU R128 at -14 LUFS so every video comes out with consistent professional volume.
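
For flavor, word-level ASS karaoke boils down to {\k} duration tags in centiseconds. This sketch assumes per-word durations from the subtitle heuristic and a default ASS style, not the engine's real template:

```python
def karaoke_line(words_with_secs: list[tuple[str, float]]) -> str:
    """Emit one Dialogue line with {\\k} tags (durations in centiseconds)."""
    tags = "".join(f"{{\\k{round(d * 100)}}}{w} " for w, d in words_with_secs)
    return f"Dialogue: 0,0:00:00.00,0:00:05.00,Default,,0,0,0,,{tags.strip()}"

# -> Dialogue: 0,0:00:00.00,0:00:05.00,Default,,0,0,0,,{\k45}Absolute {\k60}Cinema
print(karaoke_line([("Absolute", 0.45), ("Cinema", 0.60)]))
```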

The Ken Burns zooms are now smoother and optimized for ARM64/Termux. And yeah, I spent time cleaning up all that legacy lab code — the repo is leaner with 34 commits and a much cleaner modular structure.
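
The smoother Ken Burns effect revolves around FFmpeg's zoompan filter; here is an example filter string with illustrative values, not the engine's tuned ARM64 settings:

```python
# Slow push-in centered on the frame: ~1.0x -> 1.3x over 150 frames.
ken_burns = (
    "zoompan=z='min(zoom+0.0015,1.3)'"
    ":x='iw/2-(iw/zoom/2)':y='ih/2-(ih/zoom/2)'"
    ":d=150:s=1280x720:fps=30"
)
# Used as: ffmpeg -loop 1 -i photo.jpg -vf "<ken_burns>" -t 5 clip.mp4
```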

_Iterative wins on mobile are tough, yet the pipeline finally feels ready._

Attachment
Attachment
ChefThi

The highlight of these last tweaks is definitely Option 99 (Autonomous Mode). The engine now runs in a continuous loop, grabbing scripts from the queue and rendering everything without me touching a single button. Pure magic after all those manual tests.
It took me longer than I wanted because I had to make the queue stable on mobile and handle errors gracefully.

This is the kind of feature that makes the whole project feel next-level. Next hour I’ll talk about the visual polish! 🎨

ChefThi

After some long nights fighting VTT sync on Termux, the engine finally jumped to v1.7. It’s no longer just a script; it’s starting to feel like a real autonomous worker.

I spent way too much time just staring at timing errors and testing the same short clips over and over. I took a long time with this because of Termux limitations on ARM64 — every time I adjusted one thing, another broke. But it was worth it. The pipeline is more solid now and the project has gained a much more professional look.

Small wins, but they add up.

Attachment
Attachment
ChefThi

🤖 The End of Manual Labor (Autonomous Mode)
I’ve officially implemented Option 99 (Autonomous Mode). The engine now runs in a continuous loop,
watching for script files or backend signals. If a new idea hits the queue, the engine captures it,
renders it, and delivers the final video without me touching a single button. This is the foundation for
scaling production on mobile.
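
Conceptually, Option 99 reduces to a loop like this sketch; the queue directory, poll interval, and render_video stub are stand-ins for the real engine internals:

```python
import time
from pathlib import Path

QUEUE = Path("scripts/queue")  # assumed location for queued script files

def render_video(script: Path) -> None:
    """Stand-in for the real FFmpeg pipeline hand-off."""
    print(f"rendering {script.name}...")

def autonomous_worker(poll_seconds: int = 30) -> None:
    while True:
        for script in sorted(QUEUE.glob("*.txt")):
            try:
                render_video(script)
                script.unlink()  # consume the queue item once delivered
            except Exception as err:
                print(f"[worker] render failed, keeping job: {err}")
        time.sleep(poll_seconds)  # idle between queue sweeps
```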

🎨 The “Absolute Cinema” Look (v1.6)
I wasn’t happy with “raw” renders. To give the videos a signature look, I added a post-processing layer
directly in the FFmpeg chain:

  • Color Grading: Dynamic contrast and saturation boosts.
  • Vignette Effect: That classic “cinema” dark-border focus that draws the eye to the center.
    The output now feels like a finished product, not just a test render.
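
As an example of that post-processing layer, a chain in this spirit (parameter values illustrative, not the engine's exact grade):

```python
# Contrast/saturation boost plus a classic vignette, applied in one -vf pass.
cinema_grade = "eq=contrast=1.12:saturation=1.25,vignette=PI/5"
# ffmpeg -i raw.mp4 -vf "eq=contrast=1.12:saturation=1.25,vignette=PI/5" graded.mp4
```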

🔊 Studio-Grade Audio (EBU R128)
Consistency is key. I implemented Audio Mastering (Loudnorm) to hit the industry standard of -14 LUFS
(the same used by YouTube and Spotify). No more videos that are too quiet or clipping—everything sounds
professional and balanced.
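
The mastering step maps onto FFmpeg's loudnorm filter with the integrated target at -14 LUFS; the true-peak and loudness-range values below are common defaults I'm assuming, not confirmed settings:

```python
# Single-pass EBU R128 normalisation to the -14 LUFS streaming target.
mastering = "loudnorm=I=-14:TP=-1.5:LRA=11"
# ffmpeg -i mixed.mp4 -af "loudnorm=I=-14:TP=-1.5:LRA=11" mastered.mp4
```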

🧹 Clean House, Clean Mind
I did a deep cleanup of the GitHub repository. I used git rm --cached to strip away internal roadmaps and simulation logs and tests of the pipeline. The public repo now holds only the pure Engine Core, keeping my portfolio sharp and focused on the code that actually matters.
P.S. I left the screen turned off and paused the recording.

ChefThi

What I did in two months (on this project)


I haven’t posted a devlog for HOMES-Engine in about two months. Not because I wasn’t working on it — actually the opposite. I was heads-down testing, breaking things, fixing things, and honestly sometimes just staring at FFmpeg error messages trying to figure out what went wrong. This is that story.


Where it started — Jan 4, day zero

The first commit was a proof of concept: a basic Python script that called FFmpeg and generated a video. That’s it. It barely worked. The font was wrong, the imports were broken, the output format was inconsistent. But it rendered something, which felt like enough to keep going.

On the same day I went from v0.1 to v1.3, v1.4, and v1.6 in rapid succession. Each version was fixing something the previous one broke: Edge-TTS for neural narration, multi-line text rendering (I kept getting those quadradinhos — encoding artifacts from special characters that took forever to track down), synchronized VTT subtitles, dynamic B-Roll stitching, music ducking. I was running all of this on Termux, on Android, ARM64. FFmpeg on ARM has its own quirks that aren’t documented anywhere useful.


The SAR bug that took too long

One thing that slowed me down more than anything else was a SAR mismatch error in ffmpeg_engine.py. When concatenating video clips, FFmpeg was crashing because different clips had different Sample Aspect Ratios. The fix was two FFmpeg filters: setsar=1 and format=yuv420p. Simple fix — once you know what it is. Finding it took hours of testing different inputs, reading logs, and using Gemini CLI to help me parse what the error stack actually meant.
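
Reconstructed from that description, the stabilised concat looks roughly like this for two clips (input names illustrative):

```python
import subprocess

# Normalise each clip's sample aspect ratio and pixel format before concat,
# which is what stopped the ARM64 crashes described above.
filter_complex = (
    "[0:v]setsar=1,format=yuv420p[v0];"
    "[1:v]setsar=1,format=yuv420p[v1];"
    "[v0][v1]concat=n=2:v=1:a=0[out]"
)
subprocess.run([
    "ffmpeg", "-i", "clip1.mp4", "-i", "clip2.mp4",
    "-filter_complex", filter_complex,
    "-map", "[out]", "stitched.mp4",
])
```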

I used Gemini CLI a lot during this phase. Not to write the code for me, but to help me reason through FFmpeg filter chains. FFmpeg’s filter syntax is its own language, and when you’re building complex pipelines it’s easy to lose the thread.

ChefThi

Title: HOMES-Engine 3.1 — Gemini TTS, Hybrid VTT & Integration Hardening

Commits:

  • c1fb79a — feat(core): implement Gemini 2.5 Flash TTS engine with multi-speaker support
  • 93cb143 — feat(video): integrate Gemini TTS with heuristic VTT generator
  • 1e053f4 — feat(integration): align queue poller with AI-VIDEO-FACTORY API specs
  • d5764e3 — chore(security): update gitignore for local simulation and fix poller paths

Summary:
Intensive upgrade session taking the Engine to v3.1: native Neural Voice (Gemini) implementation, a timestamp-free subtitle system, and full security/API alignment with the orchestration backend.

What was done:

  • Native Gemini TTS: Replaced the old voice engine with Gemini’s v1beta API, enabling ultra-realistic voices (“Kore”, “Fenrir”); a hedged sketch follows this list.
  • Hybrid Subtitles (Math-based): Developed a heuristic algorithm to generate synchronized .vtt files, enabling on-screen subtitles even when using pure-audio (WAV) APIs.
  • Integration Poller: Implemented the worker that connects to AI-VIDEO-FACTORY, adjusting endpoints and payload to the official spec.
  • Security: Hardened the .gitignore for local simulations and cleaned up artifacts.
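
A hedged sketch of such a v1beta TTS call; the model name, payload shape, and response path follow Google's public generativelanguage REST docs as I understand them and may not match the engine's actual client:

```python
import base64
import requests

URL = ("https://generativelanguage.googleapis.com/v1beta/models/"
       "gemini-2.5-flash-preview-tts:generateContent")

payload = {
    "contents": [{"parts": [{"text": "Welcome to the HOMES-Engine demo."}]}],
    "generationConfig": {
        "responseModalities": ["AUDIO"],
        "speechConfig": {
            "voiceConfig": {"prebuiltVoiceConfig": {"voiceName": "Kore"}}
        },
    },
}
resp = requests.post(URL, params={"key": "YOUR_API_KEY"}, json=payload)
part = resp.json()["candidates"][0]["content"]["parts"][0]
with open("narration.pcm", "wb") as f:
    f.write(base64.b64decode(part["inlineData"]["data"]))  # raw 24 kHz mono PCM
```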

Why it was done:
To raise the cinematic quality of the videos (better voice) without losing accessibility (subtitles), while preparing the infrastructure to run autonomously and securely in production.

Results / Status:

  • Generated videos now have studio quality (demo in the attachment).
  • Worker ready for E2E tests with the NestJS backend.
  • Local environment clean and secure.
Attachment
Attachment
Attachment
Attachment
Attachment
ChefThi

Title: HOMES-Engine — Studio Iteration & Stabilizations (post-v2.1 session)
Date: 2026-01-06
Commits:

  • 2587dfe — feat(visuals): implement color conversion engine and update learning lab
  • a8feb18 — feat(tts): set Google Gemini 2.5 TTS as primary engine
  • 2d483ab — docs: add system architecture overview and update readme v3.0
  • f868a70 — fix(ffmpeg): standardise SAR and pixel format for concat stability
  • ae25fe9 — feat(v3.0): add Smart Assets (Image Gen) and experimental TTS via Pollinations.ai
  • bfecd9f — refactor(arch): extract ffmpeg engine and improve audit tools

Summary: Sprint focused on multimedia pipeline stability, promoting Gemini TTS to primary engine, and programmatic visual improvements for THEMES.

What was done:

  • Visuals: created core/color_utils.py and refactored themes to use RGB constants, enabling dynamically generated palettes (a small sketch follows this list).
  • TTS: integrated Gemini 2.5 Flash TTS as the priority engine; tts_engine updated with a clean fallback.
  • FFmpeg: standardized SAR and pixel format (setsar=1, format=yuv420p) to avoid concat errors on ARM64.
  • Architecture: extracted the FFmpeg logic into core/ffmpeg_engine.py; better auditability and secret checking.
  • Assets/AI: added an experimental ImageGenerator (Pollinations/FLUX) and configuration verification scripts.
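
In the spirit of core/color_utils.py (whose real API isn't shown here), a tiny conversion helper of the kind that enables programmatic palettes:

```python
def hex_to_rgb(hex_color: str) -> tuple[int, int, int]:
    """'#1E90FF' -> (30, 144, 255)."""
    h = hex_color.lstrip("#")
    return tuple(int(h[i:i + 2], 16) for i in (0, 2, 4))

def to_ffmpeg(hex_color: str) -> str:
    """FFmpeg filters like drawtext accept colors as 0xRRGGBB."""
    return "0x" + hex_color.lstrip("#").upper()

print(hex_to_rgb("#1E90FF"), to_ffmpeg("#1E90FF"))
```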

Results / status:

  • The full pipeline works on ARM (stable concat; ducking and VTT quickly tested).
  • Primary TTS configured; quality/latency tests pending.
  • Documentation and architecture guide updated (Readme v3.0).

Next steps:

  • Parameterize the Gemini prompts (control of tone, hook, and length).
  • Automate THEMES palette generation via color_utils.
  • Create simulated end-to-end (CI) tests for concat/ducking without heavy assets.

Suggested attachments:

  • Terminal.log with proof of the render (setsar fix).
  • Short 10 s video showing theme + subtitles + Gemini audio.
Attachment
Attachment
Attachment
Attachment
Attachment
Attachment
ChefThi

DevLog: HOMES-Engine v2.1 – AI Studio & Modular Architecture
Date: 2026-01-05 | Hours spent: ~6h

🚀 Main Commits

  • 7d477d7 — feat(v2.1): Architecture Overhaul & Gemini AI Integration 🧠
  • 7fecb45 — feat: Absolute Cinema v1.6 - Dynamic B-Roll & Sinc Subs
  • 4402ddd — fix(core): Correct imports and asset management

📝 Evolution Summary

I restructured the engine into a Modular Studio model. The focus shifted from isolated scripts to an integrated pipeline where Gemini acts as the “Brain” of creation, delivering script automation and cinematic aesthetics (Absolute Cinema) running 100% in a mobile environment.

🛠️ What was implemented:

  1. Core Architecture: Migration to a modular structure (core/), isolating ai_writer, render, and I/O. This enables scalability and clean calls to the Gemini API.
  2. AI Writer (Gemini): Integration of the writing core. The engine now generates structured scripts from simple topics, saving the output to scripts/ for immediate processing.
  3. Visual Engine: Implementation of the Ken Burns effect (ZoomPan) and Lanczos upscaling. Added support for configurable THEMES (JSON), allowing the video’s look to change without touching the code.
  4. B-Roll & Subs: Dynamic, randomized selection of supporting clips. Generation of synchronized VTT subtitles with special-character escaping.
  5. Pro Audio: Mixing pipeline with Audio Ducking (automatic music-volume reduction during narration) and a 2 s musical intro for branding (see the sketch after this list).
  6. Repo Optimization: Cleanup of heavy files in Git, a reinforced .gitignore, and clear separation of assets/, renders/, and cache/.
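
The ducking from item 5 is the classic sidechain-compression recipe in FFmpeg; a sketch with assumed threshold/ratio values (input 0 = voice, input 1 = music):

```python
# Split the voice so one copy drives the sidechain and the other is mixed back in;
# the music is compressed whenever the narration is loud.
ducking = (
    "[0:a]asplit=2[voice][sc];"
    "[1:a][sc]sidechaincompress=threshold=0.05:ratio=8:attack=20:release=300[bg];"
    "[bg][voice]amix=inputs=2:duration=first[aout]"
)
# ffmpeg -i narration.wav -i music.mp3 -filter_complex "<ducking>" -map "[aout]" mix.m4a
```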

📊 Status & Results

v2.1 (AI Studio) already operates as a Proof of Concept (PoC): the Idea → Gemini → Script → TTS → Render (720p) flow is functional and automated. The repository is lean, modular, and stable.

Attachment
Attachment
Attachment
Attachment
Attachment
ChefThi

🚀 Devlog: HOMES-Engine Genesis & Mobile Pipeline (v0.1)

The HOMES-Engine motor has started running! The initial focus was to establish a functional “Idea to Video” pipeline running entirely in a mobile environment (Termux), optimizing resources so that rendering doesn’t “fry” the phone’s processor.

🏗️ Technical Changes:

  • Pipeline Genesis (Termux + FFmpeg):

    • Implemented video_maker.py, a rendering core optimized for Android. It uses libx264’s ultrafast preset and crf 28 to balance speed and quality on mobile devices (see the sketch after this list).
    • Created main.py focused on automation via the Termux API. The system now captures ideas via Voice (Speech-to-Text) or Clipboard, injects branding guidelines (“Absolute Cinema”), and generates ready-to-use prompts for Gemini.
    • 9550b44 - 🚀 INIT: Genesis of HOMES-Engine
  • Core Refinement & Visual Identity:

    • Import Fix: Corrected critical typos in main.py that prevented the script from running in Termux’s Python environment.
    • Brand Assets: Added the Montserrat-ExtraBold font to the assets/ folder. It is now injected via FFmpeg’s drawtext filter so the subtitles carry cinematic visual impact.
    • 4402ddd - fix(core): correct import in main.py and add assets
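
The encode settings from the first bullet, spelled out as a command (file names illustrative):

```python
import subprocess

subprocess.run([
    "ffmpeg", "-y", "-i", "timeline.mp4",
    "-c:v", "libx264",
    "-preset", "ultrafast",  # favour speed so the phone CPU stays cool
    "-crf", "28",            # acceptable quality at a small file size
    "-pix_fmt", "yuv420p",
    "render.mp4",
])
```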

💡 Why does this matter?

Unlike heavy editors, HOMES-Engine is focused on headless production. The modularity of main.py lets the generated script be saved locally and sent automatically to the clipboard, speeding up the faceless-video creation workflow without leaving the terminal.

Status: PoC validated. Next step: automating B-Roll assembly. 🚢🔥

Attachment
Attachment
Attachment
Attachment
Attachment
Attachment