ComfyUI’s new Dynamic VRAM lets you run larger AI models with less RAM

RAM prices are up, AI models are getting larger, and your hardware is struggling to keep up. ComfyUI just shipped an update that helps with all three problems at once.


ComfyUI shipped Dynamic VRAM for Nvidia hardware on Windows and Linux, and it is already in the stable release. The update was built to help users on memory-constrained hardware keep running models that would otherwise require more RAM than they physically have.

Instead of loading every model weight into memory upfront, Dynamic VRAM pulls weights in only at the moment they are needed. If memory runs out, it handles the overflow gracefully instead of crashing, and it releases memory the moment another application needs it.
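The idea can be pictured with a toy sketch. This is not ComfyUI's code; the class name, the byte budget, and the least-recently-used eviction policy are all illustrative assumptions about how on-demand loading with eviction can work:

```python
from collections import OrderedDict

class LazyWeightCache:
    """Illustrative sketch only (not ComfyUI's implementation): load weights
    on first use, evict least-recently-used ones past a memory budget."""

    def __init__(self, loader, budget_bytes):
        self.loader = loader          # callable: name -> (weight, size_bytes)
        self.budget = budget_bytes
        self.cache = OrderedDict()    # name -> (weight, size_bytes)
        self.used = 0

    def get(self, name):
        if name in self.cache:
            self.cache.move_to_end(name)   # cache hit: mark as recently used
            return self.cache[name][0]
        weight, size = self.loader(name)   # pull the weight in only now
        while self.cache and self.used + size > self.budget:
            _, (_, evicted_size) = self.cache.popitem(last=False)  # drop LRU
            self.used -= evicted_size
        self.cache[name] = (weight, size)
        self.used += size
        return weight

    def release_all(self):
        """Free everything, e.g. when another application needs the memory."""
        self.cache.clear()
        self.used = 0

# Demo: a 10-byte budget forces eviction when a third 4-byte weight arrives.
def fake_loader(name):
    return (f"weights:{name}", 4)

cache = LazyWeightCache(fake_loader, budget_bytes=10)
cache.get("encoder"); cache.get("decoder")   # both loaded, 8 bytes used
cache.get("lora")                            # evicts "encoder" (least recent)
```

The real system works at the GPU/CPU memory boundary with far more machinery, but the core trade is the same: keep what fits, fetch the rest on demand, and give memory back when asked.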

The benchmarks tell the real story.

Running WAN 2.2, a video model with 56GB of total weights, on a machine with only 32GB of RAM:

  • Without Dynamic VRAM: 283.7 seconds on the first run, 277.9 seconds on the second.
  • With Dynamic VRAM: 96.3 seconds on the first run, 83.2 seconds on the second.

That is a model nearly twice the size of available RAM running more than three times faster. On 64GB RAM with FP16 weights, the improvement is even bigger — down from 193.1 seconds to 57.1 seconds.

Image generation on Flux 2 Dev shows similar gains.

  • First start: 41 seconds down to 24.4 seconds.
  • Change prompt: 33.8 seconds down to 18.8 seconds.
  • Load a LoRA: 43.4 seconds down to 19.9 seconds.
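Worked out as simple ratios, the reported figures translate into these speedups:

```python
# Speedup = time without Dynamic VRAM / time with it, from the figures above.
benchmarks = {
    "WAN 2.2 second run (32GB RAM)": (277.9, 83.2),
    "WAN 2.2 FP16 (64GB RAM)":       (193.1, 57.1),
    "Flux 2 Dev first start":        (41.0, 24.4),
    "Flux 2 Dev change prompt":      (33.8, 18.8),
    "Flux 2 Dev load a LoRA":        (43.4, 19.9),
}

for name, (before, after) in benchmarks.items():
    print(f"{name}: {before / after:.2f}x faster")
# The video model lands around 3.3-3.4x; Flux 2 Dev around 1.7-2.2x.
```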

This is part of a broader pattern. The AI industry is actively racing to make large models run on less hardware.

ComfyUI is solving the same problem from the open source side — and doing it directly in response to rising RAM prices caused by memory manufacturers redirecting supply toward data center chips.

To get it, just update your ComfyUI installation. It is enabled by default in the latest stable release.

  • Portable install: Run update/update_comfyui.bat to update to the latest git version.
  • Desktop app: Updates automatically if auto-update is enabled.
  • Recommended: PyTorch 2.10 cu130 for best performance.
  • If you run into issues: Add --disable-dynamic-vram to your startup command.
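For a manual git install, the same steps look roughly like this (assuming the standard `python main.py` launch; check your own setup before copying):

```shell
# Pull the latest stable ComfyUI code
git pull

# If Dynamic VRAM causes problems, launch with it disabled
python main.py --disable-dynamic-vram
```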

A few things worth knowing before updating.

  • Windows and Linux only: WSL support is not planned.
  • Nvidia only for now: AMD support is on the roadmap.
  • Higher VRAM usage is normal: The system keeps more in GPU memory deliberately. That is the point.

The Bottom Line: As RAM prices climb, developers keep finding creative ways to make their software run well on less capable hardware.

Source: ComfyUI

RunPod

If you need on-demand GPUs for training, fine-tuning, inference, or running open-source models, give RunPod a try.

  • Available hardware: H100, H200, A100, L40S, RTX 4090, RTX 5090, and 30+ more
  • Cost: significantly cheaper than AWS or GCP, billed per second, no contracts
  • Setup: spins up in under a minute, 30+ regions worldwide
Try RunPod →
Affiliate disclosure: We may earn a commission if you sign up via our link, at no extra cost to you.
Efficienist Newsletter

Get the core business tech news delivered straight to your inbox. We track AI, automation, SaaS, and cybersecurity so you don't have to.

Just read what you want, and be done with it.
