Supported Models
Complete table of all 126 models in the catalog, organized by feature area. All models are from HuggingFace unless noted otherwise.
WAN 2.1 / 2.2 (29 models)
WAN is Alibaba's open-source video generation architecture. The catalog includes both the original 2.1 release and the 2.2 release, which uses dual-pass sampling with separate high-noise and low-noise models.
Diffusion Models (13)
| Model | File | Precision | Size | Notes |
| --- | --- | --- | --- | --- |
| WAN 2.1 I2V 480p 14B | wan2.1_i2v_480p_14B_fp16.safetensors | FP16 | 28.6 GB | Original I2V model |
| WAN 2.1 T2V 14B | wan2.1_t2v_14B_bf16.safetensors | BF16 | 26.6 GB | Text-to-video |
| WAN 2.1 T2V 1.3B | wan2.1_t2v_1.3B_bf16.safetensors | BF16 | 2.6 GB | Small text-to-video |
| WAN 2.2 I2V High Noise 14B | wan2.2_i2v_high_noise_14B_fp8_scaled.safetensors | FP8 | 13.3 GB | Comfy-Org repack |
| WAN 2.2 I2V Low Noise 14B | wan2.2_i2v_low_noise_14B_fp8_scaled.safetensors | FP8 | 13.3 GB | Comfy-Org repack |
| WAN 2.2 I2V High Noise 14B | wan2.2_i2v_high_noise_14B_fp16.safetensors | FP16 | 28.6 GB | Comfy-Org repack |
| WAN 2.2 I2V Low Noise 14B | wan2.2_i2v_low_noise_14B_fp16.safetensors | FP16 | 28.6 GB | Comfy-Org repack |
| WAN 2.2 I2V High Noise 14B Kijai | Wan2_2-I2V-A14B-HIGH_fp8_e4m3fn_scaled_KJ.safetensors | FP8 | 15.0 GB | Kijai repack, dest: diffusion_models/WanVideo/2_2 |
| WAN 2.2 I2V Low Noise 14B Kijai | Wan2_2-I2V-A14B-LOW_fp8_e4m3fn_scaled_KJ.safetensors | FP8 | 15.0 GB | Kijai repack, dest: diffusion_models/WanVideo/2_2 |
| WAN 2.2 I2V High Noise LightX2V | wan2.2_i2v_A14b_high_noise_lightx2v.safetensors | FP16 | 28.6 GB | LightX2V official |
| WAN 2.2 I2V Low Noise LightX2V | wan2.2_i2v_A14b_low_noise_lightx2v.safetensors | FP16 | 28.6 GB | LightX2V official |
| WAN 2.2 I2V High Noise 14B | wan2.2_i2v_high_noise_14B_Q8_0.gguf | GGUF Q8 | 14.6 GB | Quantized |
| WAN 2.2 I2V Low Noise 14B | wan2.2_i2v_low_noise_14B_Q8_0.gguf | GGUF Q8 | 14.6 GB | Quantized |
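Since WAN 2.2 samples in two passes, a workflow presumably needs both the high-noise and the low-noise file of a given variant, so the download footprint is the sum of the two. The sketch below works that out from the sizes in the table above. The dictionary layout, the variant keys, and the function names are illustrative only and are not part of the catalog.

```python
# Sizes in GB for the WAN 2.2 I2V 14B variants, copied from the table above.
# Assumption: each variant needs both its high-noise and low-noise file.
WAN22_I2V_SIZES_GB = {
    "fp8_scaled": {"high": 13.3, "low": 13.3},
    "fp16": {"high": 28.6, "low": 28.6},
    "kijai_fp8": {"high": 15.0, "low": 15.0},
    "lightx2v": {"high": 28.6, "low": 28.6},
    "gguf_q8": {"high": 14.6, "low": 14.6},
}


def pair_footprint_gb(variant: str) -> float:
    """Return the combined size of the high- and low-noise files."""
    sizes = WAN22_I2V_SIZES_GB[variant]
    return round(sizes["high"] + sizes["low"], 1)


def variants_within(budget_gb: float) -> list[str]:
    """List variants whose high/low pair fits within the disk budget."""
    return [v for v in WAN22_I2V_SIZES_GB if pair_footprint_gb(v) <= budget_gb]


if __name__ == "__main__":
    for name in WAN22_I2V_SIZES_GB:
        print(f"{name}: {pair_footprint_gb(name)} GB")
```

Under these assumptions, the FP8 and GGUF Q8 pairs need roughly half the disk space of the FP16 pairs.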
VAE (3)
| Model | File | Precision | Size |
| --- | --- | --- | --- |
| WAN 2.1 VAE | wan_2.1_vae.safetensors | FP16 | 0.5 GB |
| WAN 2.1 VAE | Wan2_1_VAE_fp32.safetensors | FP32 | 0.5 GB |
| WAN 2.1 VAE | Wan2_1_VAE_bf16.safetensors | BF16 | 0.2 GB |
Text Encoders (4)
| Model | File | Precision | Size |
| --- | --- | --- | --- |
| UMT5-XXL | umt5_xxl_fp8_e4m3fn_scaled.safetensors | FP8 | 4.8 GB |
| UMT5-XXL | wan21UMT5XxlFP32_fp32.safetensors | FP32 | 11.0 GB |
| UMT5-XXL | umt5-xxl-enc-bf16.safetensors | BF16 | 11.4 GB |
| UMT5-XXL | umt5_xxl_fp16.safetensors | FP16 | 9.5 GB |
CLIP Vision (1)
| Model | File | Size |
| --- | --- | --- |
| CLIP Vision H | clip_vision_h.safetensors | 1.7 GB |
LoRAs (8)
| Model | File | Size | Pair |
| --- | --- | --- | --- |
| LightX2V 4-step High | wan2.2_i2v_lightx2v_4steps_lora_v1_high_noise.safetensors | 1.1 GB | -- |
| LightX2V 4-step Low | wan2.2_i2v_lightx2v_4steps_lora_v1_low_noise.safetensors | 1.1 GB | -- |
| LightX2V CFG+Step Distill (Kijai) | lightx2v_I2V_14B_480p_cfg_step_distill_rank64_bf16.safetensors | 0.7 GB | -- |
| Seko-V1 High | seko_v1_i2v_high_noise_model.safetensors | 1.2 GB | -- |
| Seko-V1 Low | seko_v1_i2v_low_noise_model.safetensors | 1.2 GB | -- |
| LightX2V Unified | Wan21_I2V_14B_lightx2v_cfg_step_distill_lora_rank64.safetensors | 0.7 GB | lightx2v-unified (both) |
| SVI v2 PRO High | SVI_v2_PRO_Wan2.2-I2V-A14B_HIGH_lora_rank_128_fp16.safetensors | 2.3 GB | svi-v2-pro (high) |
| SVI v2 PRO Low | SVI_v2_PRO_Wan2.2-I2V-A14B_LOW_lora_rank_128_fp16.safetensors | 2.3 GB | svi-v2-pro (low) |
All WAN LoRAs have base_model: "wan-i2v-14b". See LoRA Pairing for details on dual-pass sampling.
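The Pair column can be read as a grouping key: LoRAs that share a pair id belong together, each with a role of high, low, or both. The following is a rough sketch of that grouping, assuming each entry exposes its pair as a string in the form shown in the table. The tuple layout, the regular expression, and both helper functions are hypothetical, and the catalog's real schema may differ. Treating a "both" LoRA as usable in either pass is also an assumption.

```python
import re
from collections import defaultdict

# (file, pair) tuples copied from the LoRA table; "--" means no pair.
LORAS = [
    ("wan2.2_i2v_lightx2v_4steps_lora_v1_high_noise.safetensors", "--"),
    ("Wan21_I2V_14B_lightx2v_cfg_step_distill_lora_rank64.safetensors",
     "lightx2v-unified (both)"),
    ("SVI_v2_PRO_Wan2.2-I2V-A14B_HIGH_lora_rank_128_fp16.safetensors",
     "svi-v2-pro (high)"),
    ("SVI_v2_PRO_Wan2.2-I2V-A14B_LOW_lora_rank_128_fp16.safetensors",
     "svi-v2-pro (low)"),
]

# Matches strings such as "svi-v2-pro (high)"; rejects the "--" placeholder.
PAIR_PATTERN = re.compile(r"^(?P<pair_id>[\w.-]+) \((?P<role>high|low|both)\)$")


def group_by_pair(loras):
    """Map pair id -> {role: file} for entries that declare a pair."""
    groups = defaultdict(dict)
    for file_name, pair in loras:
        match = PAIR_PATTERN.match(pair)
        if match:  # unpaired entries ("--") are skipped
            groups[match["pair_id"]][match["role"]] = file_name
    return dict(groups)


def file_for_pass(groups, pair_id, noise_pass):
    """Pick the file for the 'high' or 'low' pass; fall back to a 'both' LoRA."""
    roles = groups[pair_id]
    return roles.get(noise_pass) or roles.get("both")


if __name__ == "__main__":
    groups = group_by_pair(LORAS)
    print(file_for_pass(groups, "svi-v2-pro", "low"))
```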
Flux (15 models)
Black Forest Labs' Flux architecture for high-quality image generation.
Checkpoints (2)
| Model | File | Precision | Size |
| --- | --- | --- | --- |
| FLUX.1-dev FP8 | flux1-dev-fp8.safetensors | FP8 | 11.6 GB |
| FLUX.1-schnell FP8 | flux1-schnell-fp8.safetensors | FP8 | 11.6 GB |
Diffusion Models (6)
| Model | File | Precision | Size | Notes |
| --- | --- | --- | --- | --- |
| FLUX.1-dev | flux1-dev.safetensors | FP16 | 23.8 GB | |
| FLUX.1-schnell | flux1-schnell.safetensors | FP16 | 23.8 GB | Fast generation |
| FLUX.1-Fill-dev | flux1-fill-dev.safetensors | FP16 | 23.8 GB | Inpainting |
| FLUX.1-Redux-dev | flux1-redux-dev.safetensors | FP16 | 0.2 GB | Style transfer |
| FLUX.1-Depth-dev | flux1-depth-dev.safetensors | FP16 | 23.8 GB | Depth ControlNet |
| FLUX.1-Canny-dev | flux1-canny-dev.safetensors | FP16 | 23.8 GB | Canny ControlNet |
VAE (1)
| Model | File | Size |
| --- | --- | --- |
| FLUX.1 VAE | ae.safetensors | 0.3 GB |
Text Encoders (3)
| Model | File | Precision | Size |
| --- | --- | --- | --- |
| CLIP-L | clip_l.safetensors | -- | 0.2 GB |
| T5-XXL | t5xxl_fp16.safetensors | FP16 | 9.5 GB |
| T5-XXL | t5xxl_fp8_e4m3fn.safetensors | FP8 | 4.8 GB |
CLIP Vision (1)
| Model | File | Size |
| --- | --- | --- |
| SigCLIP Vision 384 | sigclip_vision_patch14_384.safetensors | 0.9 GB |
LoRAs (2)
| Model | File | Size |
| --- | --- | --- |
| FLUX.1-Depth-dev LoRA | flux1-depth-dev-lora.safetensors | 0.4 GB |
| FLUX.1-Canny-dev LoRA | flux1-canny-dev-lora.safetensors | 0.4 GB |
HunyuanVideo (12 models)
Tencent's open-source video generation model. Supports both text-to-video and image-to-video.
Diffusion Models (8)
| Model | File | Precision | Size | Notes |
| --- | --- | --- | --- | --- |
| HunyuanVideo T2V 720p | hunyuan_video_720_cfgdistill_fp8_e4m3fn.safetensors | FP8 | 12.3 GB | |
| HunyuanVideo T2V 720p | hunyuan_video_720_cfgdistill_bf16.safetensors | BF16 | 23.9 GB | |
| HunyuanVideo I2V 720p | hunyuan_video_I2V_720_fixed_fp8_e4m3fn.safetensors | FP8 | 12.3 GB | |
| HunyuanVideo I2V 720p | hunyuan_video_I2V_720_fixed_bf16.safetensors | BF16 | 23.9 GB | |
| HunyuanVideo FramePack I2V | FramePackI2V_HY_fp8_e4m3fn.safetensors | FP8 | 15.2 GB | FramePack architecture |
| HunyuanVideo Custom 720p | hunyuan_video_custom_720p_fp8_scaled.safetensors | FP8 Scaled | 12.3 GB | |
| HunyuanVideo I2V | hunyuan_video_I2V-Q4_K_S.gguf | GGUF Q4 | 7.2 GB | Quantized |
| HunyuanVideo I2V | hunyuan_video_I2V-Q8_0.gguf | GGUF Q8 | 13.0 GB | Quantized |
VAE (2)
| Model | File | Precision | Size |
| --- | --- | --- | --- |
| HunyuanVideo VAE | hunyuan_video_vae_bf16.safetensors | BF16 | 0.5 GB |
| HunyuanVideo VAE | hunyuan_video_vae_fp32.safetensors | FP32 | 0.9 GB |
Text Encoders (2)
| Model | File | Precision | Size |
| --- | --- | --- | --- |
| LLaVA-LLaMA3 | llava_llama3_fp8_scaled.safetensors | FP8 | 8.5 GB |
| LLaVA-LLaMA3 | llava_llama3_fp16.safetensors | FP16 | 15.0 GB |
CogVideoX (7 models)
Zhipu's open-source video generation model, available in 5B parameter variants.
Diffusion Models (6)
| Model | File | Precision | Size | Notes |
| --- | --- | --- | --- | --- |
| CogVideoX 1.0 5B I2V | CogVideoX_1_0_5b_I2V_bf16.safetensors | BF16 | 10.5 GB | |
| CogVideoX 1.5 5B I2V | CogVideoX_1_5_5b_I2V_bf16.safetensors | BF16 | 10.4 GB | |
| CogVideoX 1.5 5B T2V | CogVideoX_1_5_5b_T2V_bf16.safetensors | BF16 | 10.4 GB | Text-to-video |
| CogVideoX Fun 1.1 5B Control | CogVideoX_Fun_1_1_5b_Control_fp8_e4m3fn.safetensors | FP8 | 5.2 GB | ControlNet variant |
| CogVideoX 5B I2V | CogVideoX_5b_I2V_GGUF_Q4_0.safetensors | GGUF Q4 | 3.3 GB | Quantized |
| CogVideoX 1.5 5B I2V | CogVideoX_5b_1_5_I2V_GGUF_Q4_0.safetensors | GGUF Q4 | 4.1 GB | Quantized |
VAE (1)
| Model | File | Precision | Size |
| --- | --- | --- | --- |
| CogVideoX VAE | cogvideox_vae_bf16.safetensors | BF16 | 0.4 GB |
AnimateDiff (16 models)
AnimateDiff motion modules and camera LoRAs for animating Stable Diffusion generations.
Motion Modules (7)
| Model | File | Base | Size | Notes |
| --- | --- | --- | --- | --- |
| AnimateDiff v3 | v3_sd15_mm.ckpt | SD 1.5 | 1.6 GB | Latest |
| AnimateDiff v2 | mm_sd_v15_v2.ckpt | SD 1.5 | 1.7 GB | |
| AnimateDiff v1.5 | mm_sd_v15.ckpt | SD 1.5 | 1.6 GB | |
| AnimateDiff v1.4 | mm_sd_v14.ckpt | SD 1.5 | 1.6 GB | |
| AnimateDiff Lightning 4-step | animatediff_lightning_4step_comfyui.safetensors | SD 1.5 | 0.9 GB | Fast |
| AnimateDiff Lightning 8-step | animatediff_lightning_8step_comfyui.safetensors | SD 1.5 | 0.9 GB | Fast |
| AnimateDiff SDXL v1.0 Beta | mm_sdxl_v10_beta.ckpt | SDXL 1.0 | 0.9 GB | SDXL support |
Adapter (1)
| Model | File | Size |
| --- | --- | --- |
| AnimateDiff v3 Adapter | v3_sd15_adapter.ckpt | 0.1 GB |
Camera LoRAs (6)
| Model | File | Size |
| --- | --- | --- |
| Camera ZoomIn | v2_lora_ZoomIn.ckpt | 0.1 GB |
| Camera ZoomOut | v2_lora_ZoomOut.ckpt | 0.1 GB |
| Camera PanLeft | v2_lora_PanLeft.ckpt | 0.1 GB |
| Camera PanRight | v2_lora_PanRight.ckpt | 0.1 GB |
| Camera TiltUp | v2_lora_TiltUp.ckpt | 0.1 GB |
| Camera TiltDown | v2_lora_TiltDown.ckpt | 0.1 GB |
SparseCtrl (2)
| Model | File | Size |
| --- | --- | --- |
| SparseCtrl RGB | v3_sd15_sparsectrl_rgb.ckpt | 1.9 GB |
| SparseCtrl Scribble | v3_sd15_sparsectrl_scribble.ckpt | 1.9 GB |
LTX Video (8 models)
Lightricks' video generation models, from the 2B v0.9 series through the 19B LTX-2 and 22B LTX-2.3.
Diffusion Models (6)
| Model | File | Precision | Size | Notes |
| --- | --- | --- | --- | --- |
| LTX-Video 2B v0.9.1 | ltx-video-2b-v0.9.1.safetensors | BF16 | 5.3 GB | |
| LTX-Video 2B v0.9.5 | ltx-video-2b-v0.9.5.safetensors | BF16 | 5.9 GB | T2V + I2V |
| LTX-2 19B Dev | ltx-2-19b-dev-fp8.safetensors | FP8 | 25.2 GB | |
| LTX-2 19B Distilled | ltx-2-19b-distilled-fp8.safetensors | FP8 | 25.2 GB | Step-distilled |
| LTX-2.3 22B Dev | ltx-2.3-22b-dev-fp8.safetensors | FP8 | 27.1 GB | |
| LTX-2.3 22B Distilled | ltx-2.3-22b-distilled-fp8.safetensors | FP8 | 27.5 GB | Step-distilled |
Upsamplers (2)
| Model | File | Size | Notes |
| --- | --- | --- | --- |
| LTX-2.3 Spatial Upscaler 2x | ltx-2.3-spatial-upscaler-x2-1.0.safetensors | 0.9 GB | Resolution upscale |
| LTX-2.3 Temporal Upscaler 2x | ltx-2.3-temporal-upscaler-x2-1.0.safetensors | 0.2 GB | Frame rate upscale |
SDXL (12 models)
Stability AI's SDXL 1.0 architecture for high-resolution image generation.
Diffusion Models (2)
| Model | File | Precision | Size | Notes |
| --- | --- | --- | --- | --- |
| SDXL Base 1.0 | sd_xl_base_1.0.safetensors | FP16 | 6.9 GB | |
| SDXL Turbo 1.0 | sd_xl_turbo_1.0.safetensors | FP16 | 6.9 GB | Fast (1-4 steps) |
VAE (1)
| Model | File | Size |
| --- | --- | --- |
| SDXL VAE (FP16 Fix) | sdxl_vae.safetensors | 0.3 GB |
ControlNet (4)
| Model | File | Size | Notes |
| --- | --- | --- | --- |
| ControlNet Union ProMax | diffusion_pytorch_model_promax.safetensors | 2.3 GB | 12 modes in one |
| ControlNet Depth Mid | diffusers_xl_depth_mid.safetensors | 0.5 GB | |
| ControlNet Canny Mid | diffusers_xl_canny_mid.safetensors | 0.5 GB | |
| ControlNet OpenPose | diffusion_pytorch_model.safetensors | 2.3 GB | |
IP-Adapter (4)
| Model | File | Size | Notes |
| --- | --- | --- | --- |
| IP-Adapter Plus | ip-adapter-plus_sdxl_vit-h.safetensors | 0.8 GB | Style transfer |
| IP-Adapter Plus Face | ip-adapter-plus-face_sdxl_vit-h.safetensors | 0.8 GB | Face transfer |
| IP-Adapter FaceID Plus V2 | ip-adapter-faceid-plusv2_sdxl.bin | 0.8 GB | FaceID |
| IP-Adapter | ip-adapter_sdxl_vit-h.safetensors | 0.7 GB | Base IP-Adapter |
Note: The FaceID Plus V2 model also has a companion LoRA (ip-adapter-faceid-plusv2_sdxl_lora.safetensors, 0.3 GB) in the LoRAs category.
SD 1.5 (16 models)
Stability AI's Stable Diffusion 1.5, one of the most widely used base models for image generation.
Checkpoint (1)
| Model | File | Precision | Size |
| --- | --- | --- | --- |
| Stable Diffusion v1.5 | v1-5-pruned-emaonly.safetensors | EMA FP16 | 4.0 GB |
VAE (1)
| Model | File | Size |
| --- | --- | --- |
| VAE ft-mse-840000 | vae-ft-mse-840000-ema-pruned.safetensors | 0.3 GB |
Text Encoder (1)
| Model | File | Size |
| --- | --- | --- |
| CLIP ViT-L/14 | clip-vit-large-patch14.safetensors | 0.6 GB |
ControlNet (5)
| Model | File | Size |
| --- | --- | --- |
| ControlNet v1.1 Depth | control_v11f1p_sd15_depth_fp16.safetensors | 1.4 GB |
| ControlNet v1.1 Canny | control_v11f1p_sd15_canny_fp16.safetensors | 1.4 GB |
| ControlNet v1.1 OpenPose | control_v11f1p_sd15_openpose_fp16.safetensors | 1.4 GB |
| ControlNet v1.1 Lineart | control_v11p_sd15_lineart.pth | 1.4 GB |
| ControlNet v1.1 Tile | control_v11f1e_sd15_tile.pth | 1.4 GB |
IP-Adapter (6)
| Model | File | Size | Notes |
| --- | --- | --- | --- |
| IP-Adapter Plus | ip-adapter-plus_sd15.safetensors | 0.1 GB | Style transfer |
| IP-Adapter Plus Face | ip-adapter-plus-face_sd15.safetensors | 0.1 GB | |
| IP-Adapter Full Face | ip-adapter-full-face_sd15.safetensors | < 0.1 GB | |
| IP-Adapter | ip-adapter_sd15.safetensors | < 0.1 GB | Base |
| IP-Adapter Light v1.1 | ip-adapter_sd15_light_v11.bin | < 0.1 GB | Lightweight |
| IP-Adapter ViT-G | ip-adapter_sd15_vit-G.safetensors | < 0.1 GB | |
Embeddings (2) -- from CivitAI
| Model | File | Size |
| --- | --- | --- |
| EasyNegative | easynegative.safetensors | < 0.1 GB |
| veryBadImageNegative v1.3 | verybadimagenegative_v1.3.pt | < 0.1 GB |
These are the only two models in the entire catalog sourced from CivitAI rather than HuggingFace.
Shared Components
CLIP Vision (1)
Shared across SD 1.5 and SDXL IP-Adapter workflows:
| Model | File | Size |
| --- | --- | --- |
| CLIP Vision H ViT-H/14 | CLIP-ViT-H-14-laion2B-s32B-b79K.safetensors | 2.4 GB |
Segmentation (4 models)
SAM2.1 (Segment Anything Model 2.1) from Meta, in four sizes. All FP16. Used for masking and region selection.
| Model | File | Size |
| --- | --- | --- |
| SAM 2.1 Hiera Large | sam2.1_hiera_large-fp16.safetensors | 0.4 GB |
| SAM 2.1 Hiera Base Plus | sam2.1_hiera_base_plus-fp16.safetensors | 0.2 GB |
| SAM 2.1 Hiera Small | sam2.1_hiera_small-fp16.safetensors | 0.1 GB |
| SAM 2.1 Hiera Tiny | sam2.1_hiera_tiny-fp16.safetensors | 0.1 GB |
Face Swap (1 model)
| Model | File | Size | Format |
| --- | --- | --- | --- |
| InsightFace inswapper_128 | inswapper_128.onnx | 0.5 GB | ONNX |
Upscale (6 models)
Four architecture-agnostic upscalers (empty base_model) plus two LTX-specific upsamplers:
| Model | File | Size | Type |
| --- | --- | --- | --- |
| 4x Foolhardy Remacri | 4x_foolhardy_Remacri.pth | 0.1 GB | General |
| 4x UltraSharp | 4x-UltraSharp.pth | 0.1 GB | Sharpening |
| RealESRGAN x4 | RealESRGAN_x4plus.pth | 0.1 GB | General |
| RealESRGAN x4 Anime | RealESRGAN_x4plus_anime_6B.pth | 0.1 GB | Anime |
| LTX-2.3 Spatial Upscaler 2x | ltx-2.3-spatial-upscaler-x2-1.0.safetensors | 0.9 GB | LTX spatial |
| LTX-2.3 Temporal Upscaler 2x | ltx-2.3-temporal-upscaler-x2-1.0.safetensors | 0.2 GB | LTX temporal |
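The split between architecture-agnostic and LTX-specific upscalers can be expressed as a filter on `base_model`. This sketch assumes each upscaler entry carries a `base_model` string that is empty for architecture-agnostic models. The value `"ltx"`, the dictionary layout, and the function name are placeholders, since the catalog's actual identifiers are not shown on this page.

```python
# Upscaler entries from the table above. The "ltx" base_model value is a
# placeholder; only the empty string for generic upscalers comes from the text.
UPSCALERS = [
    {"file": "4x_foolhardy_Remacri.pth", "base_model": ""},
    {"file": "4x-UltraSharp.pth", "base_model": ""},
    {"file": "RealESRGAN_x4plus.pth", "base_model": ""},
    {"file": "RealESRGAN_x4plus_anime_6B.pth", "base_model": ""},
    {"file": "ltx-2.3-spatial-upscaler-x2-1.0.safetensors", "base_model": "ltx"},
    {"file": "ltx-2.3-temporal-upscaler-x2-1.0.safetensors", "base_model": "ltx"},
]


def usable_upscalers(entries, architecture):
    """Return files that are generic (empty base_model) or match the architecture."""
    return [e["file"] for e in entries if e["base_model"] in ("", architecture)]


if __name__ == "__main__":
    print(len(usable_upscalers(UPSCALERS, "sdxl")))  # generic upscalers only
    print(len(usable_upscalers(UPSCALERS, "ltx")))   # generic plus LTX-specific
```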
Detection (1 model)
| Model | File | Size |
| --- | --- | --- |
| Face YOLOv8m | face_yolov8m.pt | 0.1 GB |
Ultralytics YOLOv8 medium model, trained for face detection. Used by face-fix and FaceID workflows.
Summary by Category
| Category | Count | Typical Size Range |
| --- | --- | --- |
| Diffusion Models | 41 | 0.2 -- 28.6 GB |
| AnimateDiff | 16 | 0.1 -- 1.9 GB |
| LoRAs | 11 | 0.3 -- 2.3 GB |
| IP-Adapter | 10 | < 0.1 -- 0.8 GB |
| Text Encoders | 10 | 0.2 -- 15.0 GB |
| VAE | 9 | 0.2 -- 0.9 GB |
| ControlNet | 9 | 0.5 -- 2.3 GB |
| Upscalers | 6 | 0.1 -- 0.9 GB |
| Segmentation | 4 | 0.1 -- 0.4 GB |
| Checkpoints | 3 | 4.0 -- 11.6 GB |
| CLIP Vision | 3 | 0.9 -- 2.4 GB |
| Embeddings | 2 | < 0.1 GB |
| Detectors | 1 | 0.1 GB |
| Face Swap | 1 | 0.5 GB |
| Total | 126 | |
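As a quick consistency check, the per-category counts in the summary should add up to the stated total. The snippet below copies the counts from the table. The dictionary and function names are illustrative only.

```python
# Per-category counts copied from the summary table above.
CATEGORY_COUNTS = {
    "Diffusion Models": 41,
    "AnimateDiff": 16,
    "LoRAs": 11,
    "IP-Adapter": 10,
    "Text Encoders": 10,
    "VAE": 9,
    "ControlNet": 9,
    "Upscalers": 6,
    "Segmentation": 4,
    "Checkpoints": 3,
    "CLIP Vision": 3,
    "Embeddings": 2,
    "Detectors": 1,
    "Face Swap": 1,
}

STATED_TOTAL = 126  # the "Total" row of the summary table


def total_models(counts):
    """Sum the per-category model counts."""
    return sum(counts.values())


if __name__ == "__main__":
    print(total_models(CATEGORY_COUNTS))  # prints 126
```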