ComfyUI’s new Dynamic VRAM lets you run larger AI models with less RAM

RAM prices are up, AI models are getting larger, and your hardware is struggling to keep up. ComfyUI just shipped an update that helps with all three problems at once.


ComfyUI shipped Dynamic VRAM for Nvidia hardware on Windows and Linux, and it is already in the stable release. The update was built to help users on memory-constrained hardware keep running models that would otherwise require more RAM than they physically have.

Instead of loading every model weight into memory upfront, Dynamic VRAM pulls weights in only at the moment they are needed. If memory runs out, it handles the overflow gracefully instead of crashing, and it releases memory the moment another application needs it.
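The idea can be pictured with a toy sketch. This is not ComfyUI's code; the class name, the byte budget, and the least-recently-used eviction policy are all illustrative assumptions about how on-demand loading with eviction can work:

```python
from collections import OrderedDict

class LazyWeightCache:
    """Illustrative sketch only (not ComfyUI's implementation): load weights
    on first use, evict least-recently-used ones past a memory budget."""

    def __init__(self, loader, budget_bytes):
        self.loader = loader          # callable: name -> (weight, size_bytes)
        self.budget = budget_bytes
        self.cache = OrderedDict()    # name -> (weight, size_bytes)
        self.used = 0

    def get(self, name):
        if name in self.cache:
            self.cache.move_to_end(name)   # cache hit: mark as recently used
            return self.cache[name][0]
        weight, size = self.loader(name)   # pull the weight in only now
        while self.cache and self.used + size > self.budget:
            _, (_, evicted_size) = self.cache.popitem(last=False)  # drop LRU
            self.used -= evicted_size
        self.cache[name] = (weight, size)
        self.used += size
        return weight

    def release_all(self):
        """Free everything, e.g. when another application needs the memory."""
        self.cache.clear()
        self.used = 0

# Demo: a 10-byte budget forces eviction when a third 4-byte weight arrives.
def fake_loader(name):
    return (f"weights:{name}", 4)

cache = LazyWeightCache(fake_loader, budget_bytes=10)
cache.get("encoder"); cache.get("decoder")   # both loaded, 8 bytes used
cache.get("lora")                            # evicts "encoder" (least recent)
```

The real system works at the GPU/CPU memory boundary with far more machinery, but the core trade is the same: keep what fits, fetch the rest on demand, and give memory back when asked.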

The benchmarks tell the real story.

Running WAN 2.2, a video model with 56GB of total weights, on a machine with only 32GB of RAM:

  • Without Dynamic VRAM: 283.7 seconds on the first run, 277.9 seconds on the second.
  • With Dynamic VRAM: 96.3 seconds on the first run, 83.2 seconds on the second.

That is a model nearly twice the size of available RAM running more than three times faster. On 64GB RAM with FP16 weights, the improvement is even bigger — down from 193.1 seconds to 57.1 seconds.

Image generation on Flux 2 Dev shows similar gains.

  • First start: 41 seconds down to 24.4 seconds.
  • Change prompt: 33.8 seconds down to 18.8 seconds.
  • Load a LoRA: 43.4 seconds down to 19.9 seconds.
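Worked out as simple ratios, the reported figures translate into these speedups:

```python
# Speedup = time without Dynamic VRAM / time with it, from the figures above.
benchmarks = {
    "WAN 2.2 second run (32GB RAM)": (277.9, 83.2),
    "WAN 2.2 FP16 (64GB RAM)":       (193.1, 57.1),
    "Flux 2 Dev first start":        (41.0, 24.4),
    "Flux 2 Dev change prompt":      (33.8, 18.8),
    "Flux 2 Dev load a LoRA":        (43.4, 19.9),
}

for name, (before, after) in benchmarks.items():
    print(f"{name}: {before / after:.2f}x faster")
# The video model lands around 3.3-3.4x; Flux 2 Dev around 1.7-2.2x.
```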

This is part of a broader pattern. The AI industry is actively racing to make large models run on less hardware.

ComfyUI is solving the same problem from the open source side — and doing it directly in response to rising RAM prices caused by memory manufacturers redirecting supply toward data center chips.

To get it, just update your ComfyUI installation. It is enabled by default in the latest stable release.

  • Portable install: Run update/update_comfyui.bat to update to the latest git version.
  • Desktop app: Updates automatically if auto-update is enabled.
  • Recommended: PyTorch 2.10 cu130 for best performance.
  • If you run into issues: Add --disable-dynamic-vram to your startup command.
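For a manual git install, the same steps look roughly like this (assuming the standard `python main.py` launch; check your own setup before copying):

```shell
# Pull the latest stable ComfyUI code
git pull

# If Dynamic VRAM causes problems, launch with it disabled
python main.py --disable-dynamic-vram
```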

A few things worth knowing before updating.

  • Windows and Linux only: WSL support is not planned.
  • Nvidia only for now: AMD support is on the roadmap.
  • Higher VRAM usage is normal: The system keeps more in GPU memory deliberately. That is the point.

The Bottom Line: As RAM prices climb, developers keep finding creative ways to make their software run well on less capable hardware.

Source: ComfyUI

RunPod

If you need on-demand GPUs for training, fine-tuning, inference, or running open-source models, give RunPod a try.

  • Available hardware: H100, H200, A100, L40S, RTX 4090, RTX 5090, and 30+ more
  • Cost: significantly cheaper than AWS or GCP, billed per second, no contracts
  • Setup: spins up in under a minute, 30+ regions worldwide
Try RunPod →
Affiliate disclosure: We may earn a commission if you sign up via our link, at no extra cost to you.
Efficienist Newsletter

Get the core business tech news delivered straight to your inbox. We track AI, automation, SaaS, and cybersecurity so you don't have to.

Just read what you want, and be done with it.
