GPU Compatibility

ComfyUI Studio supports NVIDIA GPUs from Volta (2017) through Blackwell (2024). The Docker image build arguments must be set correctly for your target GPU to ensure CUDA compatibility, correct PyTorch wheels, and appropriate attention optimizations.

Full Compatibility Table

Blackwell (SM 100)

| GPU | VRAM | CUDA_VERSION | PYTORCH_INDEX | SageAttention | FlashAttention |
|-----|------|--------------|---------------|---------------|----------------|
| B200 | 192 GB | 12.8.1 | cu128 | v2 (FP8) | v3 |
| B100 | 80 GB | 12.8.1 | cu128 | v2 (FP8) | v3 |

Hopper (SM 90)

| GPU | VRAM | CUDA_VERSION | PYTORCH_INDEX | SageAttention | FlashAttention |
|-----|------|--------------|---------------|---------------|----------------|
| H200 | 141 GB | 12.8.1 | cu128 | v2 (FP8) | v3 |
| H100 | 80 GB | 12.8.1 | cu128 | v2 (FP8) | v3 |
| H100 NVL | 94 GB | 12.8.1 | cu128 | v2 (FP8) | v3 |

Ampere -- Datacenter (SM 80 / SM 86)

| GPU | VRAM | SM | CUDA_VERSION | PYTORCH_INDEX | SageAttention | FlashAttention |
|-----|------|----|--------------|---------------|---------------|----------------|
| A100 (80 GB) | 80 GB | 80 | 12.4.1 | cu124 | v1 | v2 |
| A100 (40 GB) | 40 GB | 80 | 12.4.1 | cu124 | v1 | v2 |
| A6000 | 48 GB | 86 | 12.4.1 | cu124 | v1 | v2 |
| A5000 | 24 GB | 86 | 12.4.1 | cu124 | v1 | v2 |
| A4000 | 16 GB | 86 | 12.4.1 | cu124 | v1 | v2 |

Ada Lovelace -- Datacenter (SM 89)

| GPU | VRAM | CUDA_VERSION | PYTORCH_INDEX | SageAttention | FlashAttention |
|-----|------|--------------|---------------|---------------|----------------|
| L40S | 48 GB | 12.4.1 | cu124 | v1 | v2 |
| L40 | 48 GB | 12.4.1 | cu124 | v1 | v2 |
| L4 | 24 GB | 12.4.1 | cu124 | v1 | v2 |

Ada Lovelace -- Consumer (SM 89)

| GPU | VRAM | CUDA_VERSION | PYTORCH_INDEX | SageAttention | FlashAttention |
|-----|------|--------------|---------------|---------------|----------------|
| RTX 4090 | 24 GB | 12.4.1 | cu124 | v1 | v2 |
| RTX 4080 SUPER | 16 GB | 12.4.1 | cu124 | v1 | v2 |
| RTX 4080 | 16 GB | 12.4.1 | cu124 | v1 | v2 |
| RTX 4070 Ti | 12 GB | 12.4.1 | cu124 | v1 | v2 |
| RTX 4070 | 12 GB | 12.4.1 | cu124 | v1 | v2 |
| RTX 4060 Ti | 16 GB | 12.4.1 | cu124 | v1 | v2 |
| RTX 4060 | 8 GB | 12.4.1 | cu124 | v1 | v2 |

Ampere -- Consumer (SM 86)

| GPU | VRAM | CUDA_VERSION | PYTORCH_INDEX | SageAttention | FlashAttention |
|-----|------|--------------|---------------|---------------|----------------|
| RTX 3090 Ti | 24 GB | 12.4.1 | cu124 | v1 | v2 |
| RTX 3090 | 24 GB | 12.4.1 | cu124 | v1 | v2 |
| RTX 3080 Ti | 12 GB | 12.4.1 | cu124 | v1 | v2 |
| RTX 3080 | 10 GB | 12.4.1 | cu124 | v1 | v2 |
| RTX 3070 Ti | 8 GB | 12.4.1 | cu124 | v1 | v2 |
| RTX 3070 | 8 GB | 12.4.1 | cu124 | v1 | v2 |
| RTX 3060 | 12 GB | 12.4.1 | cu124 | v1 | v2 |

Turing (SM 75)

| GPU | VRAM | CUDA_VERSION | PYTORCH_INDEX | SageAttention | FlashAttention |
|-----|------|--------------|---------------|---------------|----------------|
| RTX 2080 Ti | 11 GB | 12.1.1 | cu121 | -- | -- |
| RTX 2080 SUPER | 8 GB | 12.1.1 | cu121 | -- | -- |
| RTX 2080 | 8 GB | 12.1.1 | cu121 | -- | -- |
| RTX 2070 | 8 GB | 12.1.1 | cu121 | -- | -- |
| T4 | 16 GB | 12.1.1 | cu121 | -- | -- |

Volta (SM 70)

| GPU | VRAM | CUDA_VERSION | PYTORCH_INDEX | SageAttention | FlashAttention |
|-----|------|--------------|---------------|---------------|----------------|
| V100 (32 GB) | 32 GB | 12.1.1 | cu121 | -- | -- |
| V100 (16 GB) | 16 GB | 12.1.1 | cu121 | -- | -- |

Attention Optimization Summary

SageAttention

SageAttention replaces the default attention computation with optimized CUDA kernels, typically yielding 2-3x faster attention and a noticeable end-to-end speedup in generation.

| Compute Capability | Version | Details |
|--------------------|---------|---------|
| SM 90+ (Hopper, Blackwell) | v2 | FP8 attention kernels; highest performance, uses 8-bit floating point |
| SM 80-89 (Ampere, Ada Lovelace) | v1 | Optimized attention kernels; significant speedup over default |
| SM 75 and below (Turing, Volta) | Not supported | Set ENABLE_SAGE_ATTENTION=false |

FlashAttention

FlashAttention provides memory-efficient fused attention, reducing VRAM usage and enabling larger batch sizes or higher resolutions.

| Compute Capability | Version | Details |
|--------------------|---------|---------|
| SM 90+ (Hopper, Blackwell) | v3 | Latest version, optimized for Hopper architecture |
| SM 80-89 (Ampere, Ada Lovelace) | v2 | Memory-efficient fused kernels |
| SM 75 and below (Turing, Volta) | Not supported | Set ENABLE_FLASH_ATTENTION=false |
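The version selection in the two tables above can be expressed as a small lookup keyed on compute capability. This helper is an illustrative sketch, not part of the image's build scripts; the function name and return shape are assumptions:

```python
def select_attention_versions(sm: int) -> dict:
    """Map a compute capability (e.g. 86 for SM 86) to the attention
    backend versions listed in the compatibility tables. None means
    the backend must be disabled for that GPU generation."""
    if sm >= 90:            # Hopper, Blackwell
        return {"sage": "v2 (FP8)", "flash": "v3"}
    if 80 <= sm <= 89:      # Ampere, Ada Lovelace
        return {"sage": "v1", "flash": "v2"}
    # Turing, Volta (SM 75 and below): disable both
    return {"sage": None, "flash": None}

# Example: an RTX 4090 reports SM 89
print(select_attention_versions(89))  # {'sage': 'v1', 'flash': 'v2'}
```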

Universal Optimizations

These work on all GPUs regardless of generation:

| Optimization | Supported GPUs | Notes |
|--------------|----------------|-------|
| xformers | All listed GPUs | Always installed in the image. Memory-efficient attention and other optimized operations. |
| torch.compile | All listed GPUs | Built into PyTorch. Most effective on Ampere+ but works everywhere. |
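At runtime you can probe which optional backends are actually importable before enabling them. A minimal sketch, assuming the usual PyPI module names (`xformers`, `flash_attn`, `sageattention`); adjust if your image packages them differently:

```python
import importlib.util

def available_backends() -> dict:
    """Report which optional attention backends are installed,
    without fully importing them."""
    candidates = {
        "xformers": "xformers",
        "flash_attention": "flash_attn",
        "sage_attention": "sageattention",
    }
    return {name: importlib.util.find_spec(module) is not None
            for name, module in candidates.items()}

print(available_backends())
```

On a Turing/Volta build (both attention flags disabled), you would expect only `xformers` to report `True`.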

Build Argument Mapping

Three sets of build arguments cover all supported GPUs:

Blackwell / Hopper

```
docker build -f docker/production/Dockerfile -t comfyui-studio .
```

All defaults. No --build-arg flags needed.

Ampere / Ada Lovelace

```
docker build -f docker/production/Dockerfile -t comfyui-studio \
  --build-arg CUDA_VERSION=12.4.1 \
  --build-arg PYTORCH_INDEX=cu124 \
  .
```

Turing / Volta

```
docker build -f docker/production/Dockerfile -t comfyui-studio \
  --build-arg CUDA_VERSION=12.1.1 \
  --build-arg PYTORCH_INDEX=cu121 \
  --build-arg ENABLE_SAGE_ATTENTION=false \
  --build-arg ENABLE_FLASH_ATTENTION=false \
  .
```

VRAM Recommendations

While ComfyUI Studio runs on any GPU listed above, VRAM determines what you can generate:

| VRAM | Capabilities |
|------|--------------|
| 8 GB | SD 1.5 image generation, small LoRAs, limited SDXL |
| 12 GB | SDXL image generation, most LoRAs, basic video (short clips) |
| 16 GB | Comfortable SDXL, CogVideoX I2V, quantized Flux |
| 24 GB | Full Flux, WAN 2.1 video, multiple LoRAs, LTX Video |
| 40+ GB | HunyuanVideo, full-precision large models, long video generation |
| 80+ GB | Multiple models loaded simultaneously, large batch sizes |

Use the configurator

Rather than looking up your GPU in these tables, run `bash docker/configure.sh`. It automatically selects the correct build arguments for your GPU and explains what each setting does.