Netflix just released its first public AI model, because why not

Netflix released its first public AI model on Hugging Face. Everyone is releasing AI models these days, so why not Netflix?

The model removes selected objects from video footage. What sets it apart from standard inpainting tools is that it also removes the physical effects those objects caused. If a ball knocks something over and the ball is then deleted, the other object stays upright in the output. Shadows, reflections, and collisions are erased along with the object, as if it were never in the frame.

The technical details:

Base model: Fine-tuned from CogVideoX-Fun-V1.5-5b, a 5-billion-parameter video diffusion model from Alibaba
Input: Video, a text prompt describing the scene after removal, and a quadmask that marks which regions to remove, preserve, or treat as affected
Two-pass option: A second inference pass reduces artifacts and improves temporal consistency
Clip length and resolution: Up to 197 frames at 384×672
Hardware requirement: 40GB+ VRAM, so an A100 or equivalent
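
The limits above can be turned into a quick preflight check before attempting a run. This is a minimal sketch and not part of Netflix's code: the `ClipSpec` class and `preflight` function are hypothetical names, and it assumes 384×672 means height × width and that any other frame size would need resizing first.

```python
from __future__ import annotations

from dataclasses import dataclass

# Limits taken from the spec list above.
MAX_FRAMES = 197
FRAME_HEIGHT = 384   # assumption: 384 is the height
FRAME_WIDTH = 672    # assumption: 672 is the width
MIN_VRAM_GB = 40


@dataclass
class ClipSpec:
    """Hypothetical description of an input clip."""
    num_frames: int
    height: int
    width: int


def preflight(clip: ClipSpec, vram_gb: float) -> list[str]:
    """Return a list of problems; an empty list means the clip fits the stated limits."""
    problems = []
    if clip.num_frames > MAX_FRAMES:
        problems.append(
            f"clip has {clip.num_frames} frames, limit is {MAX_FRAMES}"
        )
    if (clip.height, clip.width) != (FRAME_HEIGHT, FRAME_WIDTH):
        problems.append(
            f"clip is {clip.height}x{clip.width}, "
            f"would need resizing to {FRAME_HEIGHT}x{FRAME_WIDTH}"
        )
    if vram_gb < MIN_VRAM_GB:
        problems.append(
            f"GPU has {vram_gb} GB VRAM, at least {MIN_VRAM_GB} GB is recommended"
        )
    return problems


if __name__ == "__main__":
    # A 10-second 720p clip at 24 fps on a 24 GB card fails all three checks.
    for problem in preflight(ClipSpec(240, 720, 1280), vram_gb=24.0):
        print(problem)
```

For scale, 5 billion parameters stored at 16-bit precision come to roughly 10 GB of weights on their own, so most of the 40 GB budget presumably goes to activations across the video frames.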

The model weights are available on Hugging Face under netflix/void-model. The code and inference scripts are on GitHub. It is not yet available through Hugging Face Inference Providers, so running it requires local setup.
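
For the local setup, fetching the weights could look something like the sketch below. It uses `snapshot_download` from the `huggingface_hub` package, which is a real function, but the `download_weights` wrapper and the `void-model` target directory are illustrative choices, not anything Netflix prescribes. The download is large, so nothing is fetched until the function is called.

```python
from pathlib import Path

# Repository name as given in the article.
REPO_ID = "netflix/void-model"


def download_weights(target_dir: str = "void-model") -> Path:
    """Download the full model repository and return the local path.

    The target directory name is an arbitrary choice for this sketch.
    """
    # Imported here so the script can be loaded without the package installed.
    from huggingface_hub import snapshot_download

    local_path = snapshot_download(repo_id=REPO_ID, local_dir=target_dir)
    return Path(local_path)


if __name__ == "__main__":
    print(f"Weights saved to {download_weights()}")
```

This only covers getting the files. Running inference still requires the scripts from the GitHub repository, which are not reproduced here.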

Netflix describes it as research-oriented rather than production-ready. The obvious use case is VFX and post-production work, removing objects from shots without manual frame-by-frame cleanup.

Source: Hugging Face