VideoLM banner

VideoLM

53 devlogs
150h 37m 33s

VideoLM: My Absolute Cinema Factory

I conceptualized this mid-last year (not this repo/project) and later migrated the core to TypeScript to truly harden the architecture.

This full-stack video pipeline digests custom sources (PDF, DOCX, PPTX, TXT, MP4) to craft accurate visual narratives and commercial YouTube videos automatically.

The Factory:

  • Brain: Python/MCP bridge with NotebookLM extracts facts and generates audio overviews from uploads.
  • Vision: Gemini 3 maps research into a 10-scene storyboard with aesthetic steering.
  • Muscle: Hardened NestJS & FFmpeg backend handles b-roll sync, audio ducking, and branding.

Built to survive in VMs and Free Tier limits.

Read the repo’s README.md and docs/ to see the real architectural sweat under the hood.
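For orientation, here is a minimal sketch of how those three stages could hand off to each other. The stage objects and signatures are illustrative stand-ins, not the repo's actual identifiers:

```ts
// Hypothetical hand-off between the three stages; all names are assumptions.
interface ResearchFacts { facts: string[]; audioOverviewPath: string }
interface Scene { index: number; visualPrompt: string; startSec: number; endSec: number }

// Stubs standing in for the Python/MCP bridge, Gemini, and the FFmpeg engine.
const brain = { extract: async (_uploads: string[]): Promise<ResearchFacts> =>
  ({ facts: [], audioOverviewPath: 'overview.mp3' }) };
const vision = { storyboard: async (_r: ResearchFacts, _scenes: number): Promise<Scene[]> => [] };
const muscle = { assemble: async (_board: Scene[], _audio: string): Promise<string> => 'out.mp4' };

async function runFactory(uploads: string[]): Promise<string> {
  const research = await brain.extract(uploads);             // Brain: facts + audio overview
  const board = await vision.storyboard(research, 10);       // Vision: 10-scene storyboard
  return muscle.assemble(board, research.audioOverviewPath); // Muscle: sync, ducking, branding
}
```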

This project uses AI

AI tools were supportive senior mentors, so to speak: they helped with architecture, coding, sprinting to the final ship, and (I almost forgot) translating into English some parts I wrote.

Gemini CLI: Used in my dev environment. I didn’t know FFmpeg math or Docker when I started. It explained real-time errors, suggested container fixes, and guided me through deployment limits.
Perplexity: Used to hunt down updated docs and research DevOps best practices. It helped me learn Docker patterns, FFmpeg tricks, and tech-stack comparisons to harden the core.

Demo Repository


ChefThi

found another Lapse


Dude, I did some small tests and tried out the integration with the Engine. I think there should be a commit changelog here, but I forgot to push it. There aren’t many modifications or important changes, but I found some things to tinker with… That’s it. The system is really cool; I hope you like it and enjoy it 😃🫡😎

ChefThi
  • Prepare production deployment and docs (71c9685)

Wow… it worked! Man, I’ve been tinkering with various parts of the project these past few days (actually since last month): the integrations, the NLM pieces, the server/deploy. Recently, as I said before, I reorganized things and asked the CLI to clean up and optimize the code; I reorganized the video-processing modules and classes and also studied chunk processing.
I synchronized the recent Lapses; man, I had a few lost ones, and maybe I still do 😅…

I wasn’t feeling very well or inspired to write and post (I kind of didn’t want to either), so I forgot to synchronize the Lapses and the development log. But in the end everything worked out.

ChefThi
  • feat: improve audio normalization, dynamic b-roll sync, and automated branding watermark (788bd5d)
ChefThi

finally integrated with the Engine


Engine Integration for Branding

Connecting the final pieces. Focusing on how to inject an automated branding watermark into the FFmpeg render pipeline. The goal is to make every generated video instantly recognizable as a product of VideoLM.


The polishing phase has begun.

ChefThi
Audio Normalization Research (WIP)

Pushing the audio quality to the limit. Researching and testing FFmpeg audio filters. I want to implement a robust audio normalization system so the background music doesn’t drown out the TTS narration.


This is heavy WIP. I’m drafting the smart ducking logic to ensure commercial-grade audio balance. The math here is complex but necessary to achieve the Absolute Cinema feel.


The struggle to balance frequencies.
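For the curious, this is roughly the filter-graph shape that smart ducking can take with fluent-ffmpeg: the narration keys a sidechain compressor on the music. The thresholds and file names below are illustrative starting points, not the project's tuned values:

```ts
import ffmpeg from 'fluent-ffmpeg';

// Duck the background music under the TTS narration via sidechaincompress.
ffmpeg()
  .input('bgm.mp3')        // [0] background music
  .input('narration.wav')  // [1] TTS narration (sidechain key)
  .complexFilter([
    // Normalize narration loudness (EBU R128), then split it: one copy keys
    // the compressor, the other goes into the final mix.
    '[1:a]loudnorm=I=-16:TP=-1.5:LRA=11,asplit=2[voxKey][voxMix]',
    // Compress the music whenever the narration copy is loud.
    '[0:a][voxKey]sidechaincompress=threshold=0.05:ratio=8:attack=20:release=400[ducked]',
    '[ducked][voxMix]amix=inputs=2:duration=longest[out]',
  ])
  .outputOptions(['-map [out]'])
  .output('mixed.m4a')
  .run();
```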

ChefThi
Live Testing the Demo Endpoint

Watching the terminal output like a hawk. The log just said “print” because I was literally tailing the server logs monitoring the HOMES-Engine bridge. I needed to ensure the new JWT-less endpoint was handling the payloads correctly without throwing 404s.


Validating the ship.

ChefThi
  • feat(video): add demo assemble endpoint — no JWT required, HOMES-Engine bridge (03beb3d)
  • feat: finalize frontend bypass and inline styles for robust demo (4c99508)
  • feat: complete the 200% Absolute Cinema bridge - automated NLM to Gemini pipeline (cf17a79)

The 200% Absolute Cinema Bridge

The most critical marathon of the week. This is where the foundation paid off. I pushed the 200% Absolute Cinema bridge, automating the pipeline from NLM directly to the Gemini scriptwriter.


I also delivered the demo assemble endpoint without JWT and finalized the frontend bypass. The factory is now fully integrated and reviewer-ready. We survived the dependencies and the API Free Tier quotas.


The engine is roaring.

ChefThi

Prepping for the reviewers. Starting the frontend bypass logic. The goal is to make sure the DEMO doesn’t break if a user doesn’t have a JWT token. I’m building a fallback mechanism so the reviewers can just click and see the magic.


Making it accessible without compromising security.

ChefThi

bro

RAM Protection & Zombie Cleanup

The struggle with the 4GB VM is real. I was frantically implementing the pkill -f chromium hook. If we don’t kill the headless browser after the NotebookLM extraction, the server will hit an Out of Memory (OOM) error.


Survival mode for the infrastructure
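A minimal sketch of that cleanup hook, assuming the reap runs after every extraction job (only the pkill -f chromium command comes from this devlog; the wrapper shape is mine):

```ts
import { exec } from 'node:child_process';

// Reap leftover headless Chromium processes so a 4 GB VM doesn't OOM.
function reapHeadlessBrowsers(): Promise<void> {
  return new Promise((resolve) => {
    // pkill exits non-zero when nothing matches; that's fine here.
    exec('pkill -f chromium', () => resolve());
  });
}

async function withBrowserCleanup<T>(job: () => Promise<T>): Promise<T> {
  try {
    return await job(); // e.g. the NotebookLM extraction step
  } finally {
    await reapHeadlessBrowsers();
  }
}
```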

Honestly, it’s very complicated for those who work on backend projects, and having a more dynamic frontend with many integrations makes it much worse, because finding a machine to deploy on is difficult. There’s HC’s Nest, which already helps a lot, but depending on how many projects and things you need to host, it might not be enough… Thank God I recently created my AWS account and managed to spin up VPSs to deploy my projects. I still have some free-trial dollars; I think it’s enough for 20 days or more. I didn’t mention it before, but their dashboard and resource sections are very intuitive and easy to use. I liked it a lot!

ChefThi

only this…


B-Roll Synchronization Logic

Just trying to get the timing right. Starting to conceptualize the B-roll dynamic sync. How do we make the infographic appear exactly when the audio mentions a statistic? I’m outlining the FFmpeg overlay filters that will make this happen in the future.


It’s math and timing.
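A sketch of where that outline could land: FFmpeg's overlay filter with an enable='between(t,...)' window per b-roll cue. The cue timestamps would come from the storyboard; everything here is illustrative:

```ts
// Build overlay filters that show each extra b-roll input only inside its window.
interface BrollCue { file: string; startSec: number; endSec: number }

function overlayFilters(cues: BrollCue[]): string[] {
  return cues.map((cue, i) => {
    const src = i === 0 ? '[0:v]' : `[v${i}]`;                    // chain input
    const out = i === cues.length - 1 ? '[vout]' : `[v${i + 1}]`; // chain output
    return `${src}[${i + 1}:v]overlay=W-w-40:40` +
      `:enable='between(t,${cue.startSec},${cue.endSec})'${out}`;
  });
}

// overlayFilters([{ file: 'infographic.png', startSec: 12, endSec: 18 }])
// -> ["[0:v][1:v]overlay=W-w-40:40:enable='between(t,12,18)'[vout]"]
```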

ChefThi
Project Metadata Refactoring

Keeping the database sane. I need to make sure the SQLite DB doesn’t lose track of the artifacts when the background worker picks up the heavy renders. Hardening the persistence layer so the frontend can poll the status properly.


Behind-the-scenes maintenance.

ChefThi
Hybrid Pipeline Architecture Draft

The ghost logs are just me fighting the infrastructure; the reality is I was drafting the startHybridAbsolutePipeline logic. I started mapping how to pass the NotebookLM facts into the Gemini script generator.


This is heavy. The bridge isn’t crossed yet, but the blueprints are drawn.
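A blueprint-level sketch of that bridge: the function name comes from this devlog, while the body (SDK calls, model id, env var) is my assumption about how NLM facts could feed the Gemini scriptwriter:

```ts
import { GoogleGenerativeAI } from '@google/generative-ai';

async function startHybridAbsolutePipeline(notebookFacts: string[]): Promise<string[]> {
  const genAI = new GoogleGenerativeAI(process.env.GEMINI_API_KEY ?? '');
  const model = genAI.getGenerativeModel({ model: 'gemini-2.5-flash' }); // placeholder model id

  // Ground the scriptwriter in the NotebookLM facts only.
  const prompt =
    'Write a 10-scene video script. Use ONLY these researched facts:\n' +
    notebookFacts.map((f, i) => `${i + 1}. ${f}`).join('\n');

  const result = await model.generateContent(prompt);
  return result.response.text().split(/\n{2,}/); // roughly one block per scene
}
```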

ChefThi

prompting the CLI and testing the features


Feature Testing via CLI

Another quick session surviving the terminal. Continuing to test the CLI tools for the NotebookLM integration. We are not fully automated yet, but I’m validating that the commands can extract the exact facts we need without timing out the server. WHAT’S IN THE IMAGE IS REAL, AND YOU CAN PRACTICE THESE THINGS TO SEE GOOD RESULTS. At least it worked for me, and it helped me a lot.


Building the Absolute Cinema pipeline block by block, my dears.

ChefThi

improving and testing all the routes and uses of the NLM integration

NLM Integration Route Testing

Testing every single route so the backend doesn’t melt in production: testing the NotebookLM API integrations. This is purely foundational work right now. I’m hitting the endpoints manually to ensure the payloads match the expected JSON structure before the orchestrator fully takes over.


Ensuring the data flows before the cameras roll.

ChefThi

made improvements and prompted the CLI to fix the errors it found

CLI Debugging & Error Tracking (WIP)

The API rate limits are hitting hard today. Used the Gemini CLI as a senior mentor (wow, I’ve learned a lot so far about FFmpeg and other things with it; basically, that’s what goes on behind the scenes of any video editor) to debug the FFmpeg errors. Dude, I got more than I expected. Working with filters and all those adjustments, graphs, and edits is a lot of math… Good thing I’m studying Computer Engineering. The visual pipeline is throwing syntax errors, so I’m prompting the CLI to find the exact bottlenecks before we scale.


Small steps to keep the factory from crashing.

ChefThi
Ship Planning & Architectural Review

No massive commits here. I spent this time auditing the deployment architecture and planning our final moves. I was mapping out exactly what needs to be refactored in the pipeline for the final ship target. Sometimes you have to step away from the terminal, read through the documentation, and plan the infrastructure so the factory doesn’t crash in production with those 429 rate-limit errors.


Mental hardening is just as important as code hardening.

ChefThi
  • feat: implement AstroLab-style UI and upgrade to Gemini 3 family (d2665e5)
    to
  • fix: complete Veo 2.0 standalone lab and harden frontend service bridge (9af3522)

Pushing through the UI logic and the new video engines, setting up the Veo 2.0 standalone lab.

I started the implementation of the AstroLab-style UI. Again, the integration is not completely finished yet, but I am building the frontend service bridge so it can handle the Gemini 3 family upgrades properly in the near future. I also worked on isolating the Veo 2.0 lab logic. By hardening this bridge now, we ensure the frontend won’t break when we fully plug in the heavy video generation later.


Building the UI while making sure the backend doesn’t melt.

ChefThi
  • style: implement cinematic transitions and robust image fallback system (d390f02)
  • deploy: implement production dockerization and repository sanitization (f615a64)

The Foundation of Cinematic Fallbacks

Another battle with the infrastructure to keep the factory running. I logged 1h 26m here starting to draft the logic for the cinematic transitions and the robust image fallback system (Commit d390f02).

I want to be transparent with the reviewers: this is still a Work in Progress (WIP). The system isn’t fully rendering the transitions 100% yet, but I laid down the core logic. If the primary image provider fails due to API limits, the pipeline needs to know how to switch without crashing. I also spent time on production dockerization (Commit f615a64), starting to sanitize the repository and prepare the container structure so it survives when we ship it to the VM.

The struggle with RAM is real, but the architecture is getting hardened.

ChefThi
  • feat: implement intelligent artifact selection and media prioritization (d0536d8)
  • feat: implement research dashboard and community-focused documentation (6c16ccb)

Based on the tests I’ve done, and considering how development is going here, I’d say that if I don’t encounter any unforeseen problems, I should be able to finish the project in 5 days. Here, I transformed VideoLM from a terminal-only script into a real, accessible tool. I spent these hours hardening the system for the reviewers and the community.

Things I improved:

  • Community Research Dashboard: Built a dedicated UI component where anyone can paste URLs and watch the “log” of the hybrid orchestrator (Python + NestJS) in real-time. It’s about total transparency—showing exactly how the AI is studying the sources before it starts the cameras.
  • Intelligent Media Selection: Refactored the artifact selector logic to prioritize Cinematic MP4s over standard audio. The engine now detects artifact types, handles dynamic file extensions, and ensures the user always gets the Premium version of their research results.

Hardening the Core:

I also fixed a critical circular-dependency issue between the AI and Video modules that was causing the app to crash on boot. Using forwardRef() and stabilizing (again) the DB connection lifecycle, I’ve made the system ship-ready for a 24/7 VM deployment. I got one server!

ChefThi

I spent this session digging into the NLM engine to squeeze out every bit of visual quality. I decided to move away from simple audio-only podcasts and finally unlocked the Cinematic Video Overviews natively from the NLM Studio (like on the site).

I implemented full support for Style Steering. VideoLM now doesn’t just “request” a video; it dictates the aesthetic narrative—be it Watercolor, Anime, or Classic—directly to the Google engine. I also refactored the retrieval worker to handle massive 50MB+ MP4 files asynchronously. This ensures that high-fidelity assets are downloaded and linked to the project without timeouts or corrupted buffers. The factory is now producing real cinema, not just slideshows (as it was a while ago).

There’s something satisfying about seeing a research link turn into a beautifully animated watercolor scene. It feels like the AI is finally “feeling” the data it studied.

ChefThi
  • fix: stabilize database connection lifecycle and gemini image config (42e846d)

Connected the brain (Research Mode) to the muscle (the FFmpeg engine) while fixing the database connection lifecycle. The goal was to eliminate the manual gap between knowing facts and showing facts.

The Technical Win (I’d say):

Architected an orchestrator that analyzes dense research output and projects a 10-scene storyboard automatically. I leveraged Gemini 3 Flash to generate contextual visual prompts based on the research sources, then injected them into our custom assembly pipeline. By using a hybrid logic that syncs external high-fidelity audio with AI-generated visuals, I’ve created a seamless flow. Every Ken Burns movement and every transition is now timed to match the factual narrative discovered in the research phase.
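One concrete piece of that timing idea, sketched under my own assumptions: probe the real narration duration with ffprobe, then split it across the storyboard's scenes so every transition lands on the factual narrative:

```ts
import ffmpeg from 'fluent-ffmpeg';

function audioDurationSec(file: string): Promise<number> {
  return new Promise((resolve, reject) =>
    ffmpeg.ffprobe(file, (err, data) =>
      err ? reject(err) : resolve(data.format.duration ?? 0)));
}

// Give each of the N scenes an equal slice of the narration (could be weighted).
async function timeStoryboard(visualPrompts: string[], narrationFile: string) {
  const total = await audioDurationSec(narrationFile);
  const per = total / visualPrompts.length;
  return visualPrompts.map((prompt, i) => ({
    prompt,
    startSec: i * per,
    endSec: (i + 1) * per,
  }));
}
```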

ChefThi

I finally put the visual pipeline to the test. I took the Audio Overview generated in the research phase and overlaid the images using the logic I’ve been hardening over the last few sessions. Seeing the research actually turn into a timed video with visual assets is a huge win for the factory.

The resulting video is way too heavy to upload directly here, so I’ve made a folder available with the test file so you can see the output. It’s the first real look at how the system handles the transition from raw research to a structured visual storyboard.

The video ended up a bit buggy and the transitions weren’t done well; the free quota of the APIs I use had run out, so I was left with some pretty bad fallbacks.


I used the Gemini CLI to help me debug the FFmpeg pipeline and speed up the generation of this test run. It saved me a lot of time on the mechanical parts of the code, especially while fine-tuning the sync between the audio and the overlays.

final research

Used Gemini CLI to debug the visual pipeline and accelerate the test video generation.

ChefThi
  • refactor: resolve circular dependencies and implement research-to-video visual pipeline (9f3f517)

Problem with Circular Dependency

As VideoLM grew, my NestJS services started pointing at each other in a loop. The app wouldn’t even boot properly. I had to stop everything and refactor the core architecture to resolve these circular dependencies. It’s the kind of “invisible” engineering work that takes hours and doesn’t show up in the UI, but it’s what makes the system hardened and professional. No more “spaghetti code” holding back the factory.

Research-to-Video is Live:
The big win was implementing the Visual Pipeline. I finally connected the brain (Research Mode/NotebookLM) to the muscle/engine (FFmpeg).

The system now takes the factual deep-dive and automatically maps it into a 10-scene storyboard. It’s not just generating text anymore; it’s orchestrating how those facts are translated into visual scenes with Ken Burns effects and transitions. The bridge between raw data and a structured video is officially built.
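For a flavor of what a Ken Burns scene looks like at the FFmpeg level, here is a sketch that renders one storyboard image as its own pan-and-zoom clip; the zoompan parameters are illustrative, not the project's tuned values:

```ts
import ffmpeg from 'fluent-ffmpeg';

// Render a single image into an isolated Ken Burns MP4 clip at 30 fps.
function kenBurnsClip(image: string, out: string, seconds: number): Promise<void> {
  const frames = seconds * 30;
  return new Promise((resolve, reject) => {
    ffmpeg()
      .input(image)
      .inputOptions(['-loop 1'])
      .videoFilters(
        `scale=1920:-2,zoompan=z='min(zoom+0.0015,1.2)':d=${frames}` +
        `:x='iw/2-(iw/zoom/2)':y='ih/2-(ih/zoom/2)':s=1280x720:fps=30`)
      .outputOptions([`-t ${seconds}`, '-pix_fmt yuv420p'])
      .output(out)
      .on('end', () => resolve())
      .on('error', reject)
      .run();
  });
}
```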


ChefThi
  • feat: implement research-to-storyboard orchestration bridge (035f6ae)

Finished the logic to turn those dense NotebookLM results into actual YouTube-ready videos. The “Factory” is no longer just researching; it’s actually building the final product.

The technical win, I’d say:
I built a dynamic assembly logic that handles high-fidelity external audio. The pipeline now auto-generates 10 visual scenes based on the facts it finds, then uses FFmpeg to render everything with Ken Burns effects and smooth transitions.

Current:
I’ll be honest: I haven’t run a full end-to-end test yet. But I’ve been digging deep into the research and engine files, editing the core logic and using the CLI to debug the assembly process piece by piece. The structure is there, and the CLI debugging shows the pipeline is ready to move.


ChefThi
  • feat: implement end-to-end NotebookLM research loop and persistence fix (9ac4354)

Today I finally finished the full lifecycle for the factual research. The system isn’t just shouting orders at NotebookLM anymore—it actually stays on the line, monitors the progress, and pulls the generated audio/video files back to the local server automatically.
I also spent some time fixing some annoying DB persistence bugs. The AI data was de-syncing from the user projects, but now everything is properly linked. It finally has “memory” and handles the artifact downloads on its own.
The factory is starting to feel solid.

ChefThi
  • feat: implement active NotebookLM orchestration and source ingestion (82fc473)

Today I finally got the Research Mode orchestration to behave. It’s a huge jump for VideoLM—it’s not just a video pipeline anymore; it’s actually acting like a research agent.

The technical mess I had to fix:
The Node-Python Bridge: This was a nightmare. Getting NestJS to talk to the Python engine for NotebookLM while Google kept blocking auth in the cloud was driving me crazy. I ended up using a metadata.json to sync the session. It’s a bit of a workaround, but it stabilized the whole thing. Now the backend creates the notebook, injects the URLs/PDFs, and triggers the “Deep Dive” automatically.
SQLite Persistence: I was losing data because of how I was handling the database during crashes. I hardened the logic so each project now locks onto its official notebookId. No more losing the research data if the process restarts.
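A minimal sketch of that workaround, assuming the Python side writes its session info to disk (the metadata.json file name comes from this devlog; the script name and fields are placeholders):

```ts
import { spawn } from 'node:child_process';
import { readFile } from 'node:fs/promises';

interface NlmSession { notebookId: string; sources: string[] }

// NestJS side: run the Python engine as a child process.
function runPythonBridge(args: string[]): Promise<void> {
  return new Promise((resolve, reject) => {
    const proc = spawn('python3', ['nlm_bridge.py', ...args], { stdio: 'inherit' });
    proc.on('exit', (code) => (code === 0 ? resolve() : reject(new Error(`exit ${code}`))));
  });
}

async function createNotebook(urls: string[]): Promise<NlmSession> {
  await runPythonBridge(['create', ...urls]);
  // The Python side persists the authenticated session metadata here.
  return JSON.parse(await readFile('metadata.json', 'utf8')) as NlmSession;
}
```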

The Proof:
Ran a final test with two heavy links about AI and SaaS. The pipeline created the notebook, fed the AI, and started generating the factual podcast script without me touching anything, complete with an artifact ID.

ChefThi

Testing because I lost my devlog hours and they all accumulated here… I don’t even know how much time this devlog was supposed to cover. I started keeping an eye on this project because of the previous error, and now there’s a new one… but yesterday

I worked a bit more on the integration with the Python NLM bridge. Tested the system with cookies from my PC browser, because I started this project’s development in the cloud and continued it here (I think I’ll finish it the same way).

ChefThi

Testing because I lost my devlog hours and they all accumulated here… this devlog should have been around 5 hours, but yesterday I tried to post it and hackatime.hackclub.com was giving a server error (error 500), so it couldn’t make any posts or register time with WakaTime.
From 5h it went to 45h.

Changes

  • feat: implement research ingestion logic and secure controller (d100291a)
  • docs: standardize technical documentation and add NLM integration tests (65e06a5)

Main changes I made:

  • Created a new ResearchController with JWT protection for the endpoints that add sources and trigger research (a minimal sketch follows this list).
  • Updated the ResearchService to handle adding source URLs, updating project metadata, changing the status to “researching”, and starting the NotebookLM audio/video overview generation.
  • Added proper validation so only valid lists of URLs can be sent, and improved error handling for the whole flow.
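A minimal sketch of that controller, assuming standard NestJS + Passport patterns; route paths and the DTO shape are guesses, not the repo's exact code:

```ts
import { Body, Controller, Injectable, Param, Post, UseGuards } from '@nestjs/common';
import { AuthGuard } from '@nestjs/passport';
import { ArrayNotEmpty, IsUrl } from 'class-validator';

class AddSourcesDto {
  @ArrayNotEmpty()
  @IsUrl({}, { each: true }) // only valid lists of URLs get through
  urls!: string[];
}

@Injectable() // stand-in for the real service, injected via Nest DI
class ResearchService {
  async addSources(projectId: string, urls: string[]) { /* persist sources */ }
  async start(projectId: string) { /* flip status to "researching", kick off NLM */ }
}

@Controller('api/research')
@UseGuards(AuthGuard('jwt')) // every route requires a logged-in user
export class ResearchController {
  constructor(private readonly research: ResearchService) {}

  @Post(':projectId/sources')
  addSources(@Param('projectId') id: string, @Body() dto: AddSourcesDto) {
    return this.research.addSources(id, dto.urls);
  }

  @Post(':projectId/start')
  startResearch(@Param('projectId') id: string) {
    return this.research.start(id);
  }
}
```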

While working on this part, I recently changed the NotebookLM CLI tool I was planning to use. I went back to YouTube videos and GitHub to find the original repo again, and ended up discovering another one: https://github.com/jacob-bd/notebooklm-mcp-cli. This one is written in Python. Since I already know Python, I prefer it here. It also looks more complete with extra features for managing notebooks, sources, and generating studio content.

I had already been using NotebookLM for some time and I think it’s a really good tool for studying. Now that I’m studying Computer Engineering, I believe I’ll use it a lot. The video overviews it creates are especially interesting and high quality. That’s why I thought integrating these features directly into VideoLM would be cool — it should bring Google-level quality to the videos we generate. So far, the initial parts of this integration are done and working well.

ChefThi
  • refactor: implement user-based project filtering & route protection (0da0eed)
  • feat: implement research infrastructure for source ingestion (d096fa1)

Research Infrastructure and NotebookLM Integration Foundation

The project gained a new research layer with the addition of source ingestion infrastructure. This allows the system to receive and store external sources (such as URLs or text) for each project and introduces a dedicated research module.

Main changes included:

  • Added a ResearchModule and ResearchService to handle source management and research tasks.
  • Extended the Project entity to support a list of sources and a new “researching” status.
  • Implemented basic methods to add sources and start a NotebookLM research process.

Recently, while watching videos on YouTube and exploring open-source projects on GitHub, I discovered the nlm project, a command-line interface for Google’s NotebookLM. I had already been using NotebookLM for some time and found it to be an excellent tool for studying and organizing information. Now that I’m studying Computer Engineering, I expect to use it even more frequently. The video overviews produced by NotebookLM are particularly interesting and high quality.

This led to the idea of integrating NotebookLM features directly into AI Video Factory. Having NotebookLM’s capabilities — especially the ability to generate deep, well-structured audio overviews — inside the system would bring Google-level quality to the generated videos. So far, the initial parts of this integration have been implemented and are working well.

ChefThi
  • feat: implement SaaS identity foundation & hardened multipart production pipeline (ac49650)

The project took a major step toward becoming a real SaaS application with the addition of a user authentication system. This foundation allows users to create accounts, log in, and keeps their generations secure and separated.

Main changes included:

  • Implemented JWT-based authentication with registration and login endpoints.
  • Protected the AI and video generation routes so only logged-in users can access them.
  • Added support for users with basic quota tracking.
  • Hardened multipart file uploads with higher size limits and better validation to handle background music, audio tracks, and multiple images reliably.
  • Increased server timeouts and payload limits to support longer video rendering without interruptions.

These updates make the system more professional and prepare the ground for paid plans and user management in the future.

Early in the project, Docker and DevOps concepts needed significant learning and adjustment. Considerable time was also spent refining the FFmpeg configuration for reliable video assembly.

AI served as an accelerated learning companion rather than a replacement for hands-on work, as with the JWT system and learning those practices.

ChefThi
  • chore: refine core dependencies & standardize AI service provider logic (60bd97e)

Spent the last 30 minutes on core infrastructure alignment. I standardized safety thresholds across all AI providers (Gemini, Hugging Face, Pollinations) to BLOCK_NONE, ensuring creative prompts aren’t throttled by false-positive safety filters. I also refined the VideoController file interceptors to scale from 20 up to 100 simultaneous scene uploads, preparing the engine for long-form content generation instead of just short clips. Finally, I synchronized the server dependencies and generated an industrial-grade lockfile to ensure environment parity between local development and production Docker builds. The backend is now clean, consistent, and ready for the upcoming authentication layer.

Many errors came back. But this is the process…

ChefThi
  • refactor: enhance backend resilience and establish identity foundation (5481ba8)

This session focused on hardening the backend infrastructure and preparing the project for its transition into a multi-user SaaS platform.

🛠 Technical Achievements

1. Advanced Backend Resilience

We addressed critical sync issues between the frontend and the database.

  • Auto-Project Provisioning: Refactored ProjectsService to implement an “Auto-Create” logic. The backend now automatically handles new project IDs generated by the frontend, eliminating the 404 “Project Not Found” errors during assembly.
  • Video Pipeline Hardening: Updated VideoService with automated directory management and granular error catching. This ensures that intermediate assets are correctly stored and managed, preventing the unhandled 500 errors previously encountered during the “Asset Preparation” stage.

2. Identity & Data Isolation Layer

Started the foundational work for the SaaS transition.

  • User Architecture: Implemented the UserEntity and the base structure for the AuthModule (Passport/JWT).
  • Relational Mapping: Projects are now linked to user profiles via TypeORM, ensuring that video generations are securely tied to their respective owners.

⏱ Hackatime & Sidequest Note

While monitoring my progress for the 10-hour LockIn sidequest, I discovered a discrepancy in my tracked time. I realized that I needed to explicitly link this project’s activity in Hackatime to ensure that every hour of refactoring and architectural design is correctly logged and counted toward the competition milestones.

ChefThi

Fix Video Assembly Error 500
We identified that the video assembly was failing due to a synchronous request pattern and strict file upload limits.

  • Payload Scaling: Increased the image upload limit in the VideoController to support up to 100 scenes per video, preventing server-side rejection.
  • Memory Optimization: Switched from direct stream piping to a background worker response. The backend now acknowledges the request immediately, preventing the browser from timing out.

Since the video rendering now happens in the background, the frontend was updated to stay in sync without locking the UI.

  • Async Tracking: Refactored App.tsx and ffmpegService.ts to implement a polling mechanism. The frontend now queries the /status endpoint every 10 seconds to monitor background progress (see the sketch after this list).
  • URL Resolution: Added logic to resolve relative video paths from the backend, ensuring the final .mp4 is correctly displayed in the ResultView once ready.
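A sketch of that polling loop from the frontend's point of view; the /status response shape is an assumption based on this devlog:

```ts
// Poll the backend every 10 s until the background render reports a URL.
async function pollAssemblyStatus(projectId: string, baseUrl: string): Promise<string> {
  for (;;) {
    const res = await fetch(`${baseUrl}/api/projects/${projectId}/status`);
    const status: { state: string; videoUrl?: string } = await res.json();

    if (status.state === 'done' && status.videoUrl) {
      // Resolve relative paths returned by the backend against the API host.
      return new URL(status.videoUrl, baseUrl).toString();
    }
    if (status.state === 'failed') throw new Error('Render failed');

    await new Promise((r) => setTimeout(r, 10_000));
  }
}
```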
ChefThi
  • feat: add projectId tracking and assembly status polling (3fb800d)

The pipeline was improved with better project tracking and asynchronous video assembly. Each generation now receives a unique project identifier, which helps organize and monitor the entire process from start to finish.

Main changes:

  • Added projectId tracking throughout the frontend and backend.
  • Changed video assembly to run in the background instead of blocking the interface.
  • Implemented status polling so the user interface automatically checks when the final video is ready.
  • Improved error handling and added retry logic for image generation to make the flow more stable.

These updates make the system feel smoother and more professional, especially when generating longer videos. The frontend no longer waits locked during the FFmpeg processing step.

Early in the project, Docker and DevOps concepts needed significant learning and adjustment. Considerable time was also spent refining the FFmpeg configuration for reliable video assembly.

The system got this error… ❌ Error in stage Asset Preparation: Backend error: 404 {"message":"Project proj_1775505149988 not found","error":"Not Found","statusCode":404}. I need to fix this.

ChefThi
  • feat: integrate gemini 2.5 flash multimodal image engine & refactor video assembly controller (dc475d9)

Devlog: Gemini 2.5 Flash Image Integration and Pipeline Refactor

The project received a significant update today with the integration of Gemini 2.5 Flash as a new multimodal image generation engine. This addition strengthens the visual creation part of the pipeline and improves overall reliability.

Main changes included:

  • Added support for Gemini 2.5 Flash to generate images directly from text prompts, using a 16:9 aspect ratio and high-quality PNG output.
  • Refactored the video assembly process to run in the background and return a video URL instead of streaming the file directly.
  • Updated several parts of the code for better stability, including fixes in the FFmpeg configuration and test scripts.

These improvements build on the previous parallel rendering engine and make the system more modular and ready for future scaling.

Early in the project, Docker and DevOps concepts required a lot of learning and adjustment. Considerable time was also spent refining the FFmpeg setup to handle video assembly correctly.

Media to be attached/linked:

  • Screen recording of the full system flow, now using the new Gemini 2.5 Flash image generation.
  • Sample videos generated with the updated pipeline to show improved visuals and background processing.
ChefThi
  • feat: implement industrial-grade background rendering & static ffmpeg distribution (c60930f)

The pipeline went from “works if I don’t breathe” to something that runs unattended. No new features — just tearing down everything that could kill a long render.

What changed
Background Worker — The controller used to hang the request, run FFmpeg, and pipe the stream into the response. Close the tab = lose everything. Now it fires in background and returns JSON with projectId + future videoUrl. Video lands in server/public/videos/, served statically by NestJS. Close the browser, the server keeps going.

Static FFmpeg — Bundled ffmpeg-static and ffprobe-static. VideoService constructor sets paths via ffmpeg.setFfmpegPath(). Zero external dependency. Dockerfile still installs libfontconfig1 and libfreetype6 for text filters, but the binary is ours.

15min Timeout — Node default is 2min. Added server.setTimeout(900000) so image/audio uploads don’t get killed mid-transfer.

Clean Dockerfile — 3 stages: frontend-builder (Vite), backend-builder (TS), production (slim, artifacts only). Migrated from node:18-alpine to node:20-slim — Alpine was causing native module headaches.

Validation — BadRequestException when audio or images are missing. Before this, the error only surfaced deep inside FFmpeg as a cryptic “No such file”.
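The bootstrap glue for two of those pieces looks roughly like this; server.setTimeout(900000) matches the devlog, while the module layout is the usual NestJS default and therefore an assumption:

```ts
import { NestFactory } from '@nestjs/core';
import ffmpegPath from 'ffmpeg-static';
import ffprobeStatic from 'ffprobe-static';
import ffmpeg from 'fluent-ffmpeg';
import { AppModule } from './app.module'; // assumed standard Nest layout

async function bootstrap() {
  // Zero external dependency: point fluent-ffmpeg at the bundled binaries.
  ffmpeg.setFfmpegPath(ffmpegPath as string);
  ffmpeg.setFfprobePath(ffprobeStatic.path);

  const app = await NestFactory.create(AppModule);
  const server = await app.listen(3000);
  server.setTimeout(900_000); // 15 min: long uploads survive
}
bootstrap();
```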

WakaTime sync
Recovered lost hours today. Reinstalling the extension and switching directories broke the project identity — hours scattered across 5+ phantom entries. Fixed it by adding .wakatime-project at the repo root. Lesson: this file is the .gitignore of time tracking. It belongs in a commit.

ChefThi

New Feature: Get DB Pipeline

Background Rendering: We’ve moved away from the model where the browser would “hang” waiting for the video. Now, the backend starts a background worker. The user can close the tab, press F5, or even restart the PC; the server continues rendering the video silently.

End of 503 Timeout: I adjusted the server to a 15-minute connection limit (900000ms). Long videos are no longer interrupted by network limits.

Real Persistence: The video is no longer just a temporary “blob.” It is now physically saved in server/public/videos/ and the link is registered in SQLite.

Static Asset Service: We configured NestJS to serve these videos via a fixed URL, allowing the user to retrieve their creations at any time from the gallery.

I’ve spent the last few hours solving a critical UX problem: the loss of progress in long renders. Implementing a background worker architecture with disk persistence transformed the app from a prototype into a real production tool.

ChefThi
  • chore: update gitignore to include private tool sandboxes (6e6052a)

Minor Chore Update and Preparation for Parallel Rendering Engine

A small but necessary chore update was applied to the repository: the .gitignore file was adjusted to properly exclude private tool sandboxes and temporary workspaces. This prevents accidental commits of sensitive or environment-specific files during active development.

At this commit, the project structure already includes:

  • A dedicated devlogs/ directory for technical progress records.
  • Clear references in the README to participation in Hackatime (Flavortown), emphasizing the value of documented development steps.

No functional changes were made to the pipeline in this specific commit, but it immediately precedes the implementation of the parallel rendering engine, project metadata tracking, and auto-cleanup features.

Media to be attached:

  • Full system screen recording demonstrating the current end-to-end flow (topic → script → visuals → narration → final video assembly) using the React UI.
  • Sample videos generated by the system (short 60-second example + one longer test video) showcasing narration quality, image redundancy, smart subtitles, and background music ducking.

These recordings highlight the stability achieved after the recent industrial-grade backend refactor and prepare the ground for the upcoming parallel rendering improvements.

ChefThi
  • refactor: industrial grade backend architecture, image redundancy & resilience (9b939b0)

Backend Architecture & Resilience
(focused refactor session)

The backend architecture received a major upgrade to industrial-grade standards. Image generation now operates with full redundancy and resilience layers, eliminating single-point failures that previously caused 503 errors and rate-limit interruptions during long runs.

Key improvements implemented:

  • Refactored core services into a modular, fault-tolerant structure with multiple image providers running in parallel.
  • Added automatic fallback between Gemini (primary) and Hugging Face/OpenRouter, including token rotation and exponential backoff retries (see the sketch after this list).
  • Enhanced frontend retry logic (already stable since mid-March) to gracefully handle transient failures without breaking the user flow.
  • Pipeline now supports true parallel image generation (Turbo Mode) while maintaining sync with ffprobe-based audio/video timing.
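As promised above, a sketch of the cascade: exponential backoff inside a provider, then fall through to the next one. Provider names mirror the devlog; the call signature is illustrative:

```ts
type ImageProvider = { name: string; generate: (prompt: string) => Promise<Buffer> };

async function generateWithFallback(
  providers: ImageProvider[],
  prompt: string,
  retriesPerProvider = 3,
): Promise<Buffer> {
  for (const provider of providers) {
    for (let attempt = 0; attempt < retriesPerProvider; attempt++) {
      try {
        return await provider.generate(prompt);
      } catch {
        // 2^attempt seconds: 1 s, 2 s, 4 s... then give up on this provider.
        await new Promise((r) => setTimeout(r, 2 ** attempt * 1000));
      }
    }
  }
  throw new Error('All image providers exhausted');
}

// Usage: generateWithFallback([gemini, huggingFace, openRouter], scenePrompt)
```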

Problems addressed:

  • Previous dependency on single LLM/image endpoints led to frequent pipeline crashes on extended videos (7+ minutes).
  • Inconsistent media assembly under high load was resolved through clip-level rendering and smart ducking/subtitle synchronization.
  • Docker environment stability was further hardened with environment-variable configs and global exception handling.
    It’s kind of like that :)

Hit a 504 error during formulation and preparation, but the tests in general were good.

ChefThi

🚀 It’s Alive: Script to Video in One Click

The Factory just crossed the line from a “cool experiment” to a functional tool. The core engine is finally humming.


What’s new (and why it took a minute):

  • No more CORS headaches: I moved all media generation (Gemini & Hugging Face) to the NestJS backend. It’s cleaner, safer, and supports automatic token rotation. If an API key hits a limit, the system just swaps to the next one without breaking the flow.
  • Better Visuals (FLUX.1): Swapped generic images for FLUX.1-schnell. The pipeline now generates storyboards that actually match the script’s vibe instead of just “looking okay.”
  • Clean Narration: Integrated Gemini’s native TTS. It’s producing crystal-clear audio that’s perfectly synced with the auto-generated captions (SRT).
  • Built to last: The pipeline can now handle 7+ minute videos. I added smart batching and exponential backoff retries—so if an image service hiccups, the system fights to stay alive instead of just crashing.
ChefThi

Balancing college, buses and FFmpeg: finally shipping an end-to-end video pipeline 🎓🚌

Over the last two months AI Video Factory was my “background process”. I had just started my Computer Engineering degree, and the campus is about 10 km from home, so most days were: bus → classes → bus → quick late-night coding sessions. On top of that I was also juggling Blueprint hardware projects, so I decided to work on this in focused bursts instead of constant tiny commits.

Most of the progress happened off-Git: I kept iterating on the FFmpeg pipeline, breaking it, fixing it, and using Perplexity as a kind of “technical rubber duck” to reason about filter graphs, error messages and timing issues. I didn’t want to push half-broken experiments all the time, so I waited until things felt structurally solid before committing.
In this latest round of changes I finally wired the full end-to-end pipeline: script → images → audio → video. I refactored image and audio generation into clearer modules and fixed a couple of nasty production issues: zoompan freezing on long chains, bad subtitle timing, and 503s during long renders. The solution involved rendering clips individually, using ffprobe for real audio duration, and switching to character‑weighted subtitle timing so the pacing feels natural.
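The character-weighted idea fits in a few lines; this sketch (my own reconstruction, not the repo's code) hands each caption a slice of the real narration duration proportional to its length:

```ts
interface SrtCue { startSec: number; endSec: number; text: string }

function weightSubtitles(lines: string[], audioDurationSec: number): SrtCue[] {
  const totalChars = lines.reduce((sum, l) => sum + l.length, 0) || 1;
  let cursor = 0;
  return lines.map((text) => {
    const slice = (text.length / totalChars) * audioDurationSec;
    const cue = { startSec: cursor, endSec: cursor + slice, text };
    cursor += slice;
    return cue;
  });
}

// A 10-char line in a 100-char script over 60 s of audio gets ~6 s on screen.
```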

I also hardened the Docker environment: proper SQLite permissions, config via env vars, and better logging through a global exception filter in the NestJS backend. Now, when something explodes, it explodes with logs instead of silently failing. 😅

This devlog is basically the “catch‑up chapter” for everything that happened between classes, buses and late‑night debugging. The next step is polishing the UI and shipping a public demo link.

ChefThi

Title: Audio improvements, documentation, and timeouts
Date: 2026-01-10

Commits (hashes):
3bb12a6 ee1f5e3 d3c80a4

Summary:
I worked on three fronts right after commits 3dbbaf16 / be0f105a: smart audio (ducking), documentation/test updates, and larger server timeouts to reduce 503 errors on long renders.

What was done:

  • 3bb12a6 — Implemented Smart Ducking in the video pipeline: the mix now automatically lowers the background-music volume while the narration is active, with smooth gain curves to avoid abrupt cuts. Added unit tests covering the mixing logic and RMS-level validation to ensure ducking doesn’t degrade the speech. NOT TESTED!
  • ee1f5e3 — Updated docs and refined tests: adjusted the project status, expanded the VideoService/AIService test cases, and made small fixes to the test scripts (clearer assert messages).
  • d3c80a4 — Raised the server timeout to 15 minutes and confirmed long timeouts in the Vite proxy; the goal is to reduce 503 timeouts during large video-processing jobs.

Results:

  • Local experiments show clearer audio in outputs with BGM + narration, and transitions without artifacts.
  • Strengthened automated tests (critical coverage maintained): fewer regressions when tweaking the mix/FFmpeg.
  • Fewer timeout failures observed in long manual runs (still to be validated in CI).

Next steps:

  • Run E2E with the full pipeline in CI (docker-compose) to confirm the extended timeout is stable.
  • Measure the ducking’s impact across different BGM (multi-genre) and tune the default parameters.
  • Expose audio-level metrics (RMS/peak) in the VideoGateway for real-time monitoring.
ChefThi

Date: 2026-01-09

Commits covered (hashes):
3dbbaf16 be0f105a

Summary:
After the stability improvements and cleanup, I focused on making the project testable and runnable in a container. I added unit tests, prepared Docker images, and fixed execution problems in the containerized environment so the pipeline can run locally and in CI consistently.

Details per commit:

  • 3dbbaf16 — feat: complete unit tests and docker configuration

    • Added and fixed tests for VideoService, ProjectsService, and AiService; coverage above 60%.
    • Created docker-compose.yml and a Dockerfile for the server; added a .dockerignore.
    • Containerization structured to isolate the database (SQLite in the container) and the services, and to simplify local/CI builds.
    • Goal: allow reproducible backend runs and frontend integration via proxy.
  • be0f105a — fix: full debugging and stabilization of the Docker environment

    • Adjusted database-file permissions to avoid write errors inside the container.
    • Moved the DB configuration to environment variables (better security and flexibility).
    • Resolved an Express dependency conflict that was breaking the container.
    • Cleaned the Docker cache to reclaim space and avoid corrupted builds.

Impact:

  • The Docker environment now starts reliably and the backend runs with the same configuration expected in CI.
  • Unit tests cover the pipeline’s crucial components, reducing the risk of regressions when touching FFmpeg/AI.
  • Less friction for collaborators: with docker-compose it’s easier to replicate the environment locally.

What to test / next steps:

  • Run the full pipeline inside the container (script generation → TTS → images → assemble) to validate timeouts and resources.
  • Add E2E tests that run in CI using docker-compose.
  • Monitor disk usage on runners/containers and automate cache cleanup in pipelines.
ChefThi

Title: Pipeline Robustness and Audiovisual Sync
Commits: 75f531a, 17c3a84

Summary:
Focused on FFmpeg engine stability and audio/subtitle sync precision. We eliminated memory crashes on long videos and timing bugs, and cleaned tracking artifacts out of the repository.

What was done:

  • Per-Clip Rendering (75f531a): Broke the monolithic zoompan chain into individual steps. Images are processed as isolated MP4 clips and joined via the concat demuxer, avoiding frozen frames and memory blow-ups (see the sketch after this list).
  • Sync via ffprobe (75f531a): Implemented native audio probing in the backend. The system now gets the file’s real duration, fixing drift caused by frontend estimates.
  • Smart Subtitles (75f531a): New character-weighted algorithm. Each subtitle’s screen time is now proportional to the length of its text, resulting in fluid, natural reading.
  • Padding & Errors (75f531a): Added +5s of safety to the final clip to avoid abrupt cuts. Created a global exception filter for debugging via server.log.
  • Frontend Sync (75f531a): Real duration detection via the Audio API (goodbye 60s placeholder) and automatic inclusion of subtitles.srt in the output ZIP.
  • Cleanup (17c3a84): Normalized .gitignore and removed temporary files from Git tracking.
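As referenced in the first item, a sketch of the concat step, assuming each image has already been rendered to its own MP4 with identical codec settings (file names are placeholders):

```ts
import { writeFile } from 'node:fs/promises';
import ffmpeg from 'fluent-ffmpeg';

// Join pre-rendered clips losslessly with the concat demuxer.
async function concatClips(clips: string[], out: string): Promise<void> {
  // concat demuxer input: one "file '<path>'" line per clip, in order.
  const listFile = 'clips.txt';
  await writeFile(listFile, clips.map((c) => `file '${c}'`).join('\n'));

  await new Promise<void>((resolve, reject) => {
    ffmpeg()
      .input(listFile)
      .inputOptions(['-f concat', '-safe 0'])
      .outputOptions(['-c copy']) // no re-encode: sidesteps the zoompan memory blow-up
      .output(out)
      .on('end', () => resolve())
      .on('error', reject)
      .run();
  });
}
```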

Results:

  • No more render freezes on long sequences.
  • Subtitles perfectly synced with the narration.
  • A clean repository, focused only on productive code.

Tests:

  • Assembly with a mix of formats (PNG/JPG) and real audio; verified the MP4/SRT output and ZIP integrity.

Next steps:

  • Validate the pipeline with loads of 50+ images.
  • Expose metrics via WebSocket for real-time triage.
ChefThi

Commit d977f4e marks the transition from prototype to full-stack MVP. I implemented three architectural pillars critical for robustness and UX:

  1. Data Persistence (TypeORM + SQLite)
    Replaced browser volatility with a real database.
  • Backend: Implemented the ProjectsModule with full CRUD operations (/api/projects).
  • DB: homes.db (SQLite) managed via TypeORM with automatic schema synchronization.
  • Impact: Users can now save, list, and resume previous projects. State persists across sessions and page reloads.
  2. Real-Time Feedback (WebSockets)
    Solved the “black box” of long processes using socket.io (see the sketch after this list).
  • Architecture: A VideoGateway in NestJS emits progress events (scriptProgress, videoProgress) to the frontend.
  • UX: The user sees the exact pipeline: “Generating Images (3/10)” -> “Rendering (45%)” -> “Done”.
  • Tech: Optimized handshake with specific CORS settings to allow Vite (5173) <-> NestJS (3000) communication.
  3. AI Centralization (Backend-First)
    Moved 100% of the AI logic to the server, eliminating key exposure on the client.
  • Module: A new AiModule encapsulates geminiService.ts and the TTS/image services.
  • Flow: The frontend consumes clean REST endpoints (POST /api/ai/script), while the backend safely manages quotas, retries, and API-key rotation.
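As noted in item 2, the gateway can stay tiny; the event names match the devlog, the payload shapes are assumptions:

```ts
import { WebSocketGateway, WebSocketServer } from '@nestjs/websockets';
import { Server } from 'socket.io';

@WebSocketGateway({ cors: { origin: 'http://localhost:5173' } }) // Vite dev server
export class VideoGateway {
  @WebSocketServer()
  server!: Server;

  emitScriptProgress(projectId: string, step: string) {
    this.server.emit('scriptProgress', { projectId, step });
  }

  emitVideoProgress(projectId: string, percent: number) {
    // drives UI states like "Rendering (45%)"
    this.server.emit('videoProgress', { projectId, percent });
  }
}
```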

Stack & Metrics:

  • New deps: @nestjs/typeorm, sqlite3, @nestjs/websockets, socket.io.
  • Files: +8 main modules (ai.module.ts, video.gateway.ts, project.entity.ts).
  • Challenges overcome: fine-grained CORS configuration for WSS and TypeORM entity synchronization at runtime.
ChefThi

Commits (hashes):
a673d30 through f83c967
Since the point marked by acc5ab9 I completed a series of changes that turned the codebase into a more resilient pipeline with a better development experience. The emphasis was on three fronts: (1) DX / Dev Mode for fast tests with ZIP packages, (2) image-generation orchestration and fallback with batch processing, and (3) FFmpeg robustness and backend infrastructure.

Dev Mode was improved: ZIP uploads now extract on the client, the flow auto-starts, and local script, audio, and images load automatically to speed up testing. On the frontend I adjusted the form (default duration, background-music selection) and introduced batch processing for image generation; this parallelizes requests and applies a simple fallback when an image fails, while preserving the final order. I added timeouts and a fetchWithTimeout helper to the image-provider calls to avoid long hangs.

On the image layer, ImageGeneratorPro and the provider-rotation strategy were reinforced to reduce quota failures (Gemini → HF → StableDiffusion → Pollinations → Replicate). I also cleaned up old guides and files, reorganized .gitignore, and added tooling for reproducibility (Nix idx, rescue scripts).

The backend evolved significantly: I added a Projects module (TypeORM + SQLite) to persist projects and expanded VideoService with dynamic SRT generation, optional background-music mixing, audio-duration probing, and a more robust FFmpeg filter graph. The FFmpeg fixes continue (stream normalization, explicit mapping, PTS reset, and a bump to 30fps), along with error/cleanup improvements (temporary SRT removal, output checks). Server and proxy timeouts were extended to support long jobs.

Results: the pipeline produces more stable videos (30fps, no drops), Dev Mode enables fast iteration with local assets, and the image orchestration tolerates provider outages.

ChefThi

Title: 🚀 Hardening the Core & Subtitles
Date: 2026-01-04

Commits:

Summary:
Today I worked on stabilizing the pipeline and improving support for videos with automatic subtitles. I also refined the development environment to avoid future conflicts.

What was done:

  • Automatic Subtitling:
    • Dynamic SRT generator based on the AI-generated script and the audio timing.
    • Subtitles “burned” (hard-coded) into the video using FFmpeg, with a readable style (neon-cyan font + black borders).
  • Environment Stabilization:
    • The backend now uses ffprobe to precisely verify the audio duration before rendering.
    • Optimized the proxy and the dev-server (Vite) execution time for long tasks.
  • Local Asset Management:
    • Stopped versioning files like GEMINI.md, keeping them local only via .gitignore exclusions.

Results:

  • Videos can now be generated with readable, synchronized subtitles.
  • A more stable dev environment, optimized for local use cases.
  • Redundant files no longer clutter the main repository.

Next steps:

  • Try varied subtitle styles for readability across formats.
  • Finish support for background-audio mixing in the pipeline.
  • Other possible optimizations in the subtitle-generation flow.
ChefThi

What I shipped:

  1. Dynamic Motion (Ken Burns): Static videos are boring. I implemented complex FFmpeg filters (zoompan, crop, scale) to give automatic pan & zoom motion to every AI-generated image. It now looks like a real documentary, not a PowerPoint slide.
  2. Robust Image Orchestrator: The pipeline kept breaking when the Gemini API rate-limited. I built a Cascading Fallback system: if Gemini fails, it tries HuggingFace, then Stable Diffusion, Replicate, and finally Pollinations. The video always ships.
  3. DX (Developer Experience): Testing an AI pipeline is expensive and slow. I created a “Dev Mode” that injects local assets (ZIP) straight into the pipeline, skipping the API calls. This cut my test cycle from 2 minutes to 10 seconds.

Stack: React + NestJS + FFmpeg + Gemini 2.5 Flash.

New updates shipped:

  1. Instant ZIP Pipeline: I implemented an “Auto-Start” system. Now, when a ZIP with pre-generated assets is selected, the system detects the files, uploads them, and starts the video assembly automatically. Fewer clicks, more speed. ⚡
  2. Smart Validation Bypass: I removed the requirement for AI inputs (like the video topic) when Dev Mode is active. The system treats local assets as the “single source of truth”, clearing unnecessary fields from the interface.
  3. Local Asset Mapping: I improved the extraction logic on the backend to guarantee that, however the ZIP is structured, the pipeline correctly locates the script, audio, and storyboard.
  4. GitHub Push Protection: We had a small scare with a secret detected by GitHub, but I resolved it via git reset and history rewriting to keep the repository safe and clean. 🔒
ChefThi

What was done today:
Integration with multiple image providers (4816e90):

Added support for Gemini Imagen 3, Hugging Face, Stable Diffusion, Craiyon, and Replicate.
Created the ImageGeneratorPro component for advanced image generation.
Added new libraries and updates to helper services (pollinationsService.ts and imageService.ts).

UI improvement (6d24499):

Replaced the duration slider with numeric and preset inputs, simplifying usage.

ChefThi

Today was a crucial day in the final setup of AI Video Factory, my project for Flavortown.
I completed important configuration work to guarantee the whole pipeline structure is functional, from data input to automated video generation.
I WORKED ON A FEW THINGS BUT FORGOT TO RECORD THE PROGRESS. THIS IS ROUGHLY WHAT I DID.

I used the Gemini CLI to guide me and kept building things while I organized.

What was accomplished today:
Initial configuration and documentation (a673d30):

Adjusted the project’s foundation, making sure the backend and frontend work in harmony.
Updated README.md to include:
A complete local installation guide with Docker support.
A step-by-step guide to the automation pipeline.
Detailed documentation of the AI API endpoints (script, visual, and narration generation).

Project structure and refinement for Flavortown (7b536d71, 6eda3fba):

Organized the folder structure better and optimized the Dockerfile configuration to avoid conflicts in the execution environment.
Fixed small bugs found during Docker build tests and local runs.

Error fixes during testing (6a03c43f):

Adjusted environment variables in .env.example to ease future integrations.
Solved dependency problems related to FFmpeg and the Gemini API integration.

ChefThi

Today I advanced the structure of the AI Video Factory project for Flavortown!

Today’s wins:

Initial structure: Organized the folders for the Backend (NestJS) and Frontend (React + Vite).
Configuration: Adapted environment variables and integrated FFmpeg into the pipeline.
Documentation: Completed README.md with the pipeline diagram and instructions to run the project.
Next step: Finish the script and narration integration to generate the first video automatically!

Yesterday’s commits (December 27, 2025):
76329a9 - Revise the README file with project details and setup instructions.

What was done:
Full README.md update:
Project summary
Features and tech stack used
Step-by-step installation and configuration
The project pipeline from start to finish
API endpoint documentation

ff55797 - First push of the files

What was done:
Initial project upload:
Basic folder and file structure.
Pushed the frontend and backend skeleton.
Included files such as Dockerfile, .env.example, and .gitignore.

Today’s commit (December 28, 2025):
a673d30 - Task: initial project configuration and documentation for Flavortown
What was done:
Final adjustments to the project configuration.
Documentation improvements, adapting the project for the Flavortown contest.
Local environment preparation and explanations for external developers.

ChefThi

What was done: Yesterday was the “Big Bang” of the AI Video Factory project. I focused on laying the entire technical foundation to turn any topic into a complete YouTube video automatically.

Technical highlights from the commits:

Pushed the base files of what I hope the project becomes: the base structure of everything I’m going to build.

Documentation & Setup: I ended the day revising README.md with all the API endpoints (ideation, script, narration, assembly) and the Docker setup instructions, making sure the project is replicable and “shippable”, right in the Flavortown spirit.

Commit ff55797 (First push of the files):
Pushed the “heart” of the project.
Folder structure separating Frontend and Backend.
Environment configuration (.env.example) and container files (Dockerfile).
Commit 76329a9 (Revise README with project details):
Detailed the Pipeline Architecture.
Exposed the /api/ai/ and /api/assemble endpoints.
A complete installation guide for anyone who wants to test the “factory”.
