ltx-2.3-22b-distilled-1.1_transformer_only_fp8_scaled.safetensors
LTX 2.3 Distilled 1.1 FP8 (Kijai)
FP8-quantized build of the distilled v1.1 model, published by Kijai. Best for 16 GB VRAM. Run it at 8 steps with CFG=1.
Released 2026-04-13 · Source: Kijai/LTX2.3_comfy (HuggingFace) — v1.1 release improved fast-motion stability and character consistency over v1.0.
Download ltx-2.3-22b-distilled-1.1_transformer_only_fp8_scaled.safetensors
Direct HuggingFace download. 25.2 GB · Free.
No 16GB GPU? Try ltx-2.3-22b-distilled-1.1_transformer_only_fp8_scaled.safetensors online — free generation included
Skip the 25.2 GB download and ComfyUI setup. Generate a 6-second video using this exact model in your browser, ~30 seconds.
Technical details
FP8 scaled quantization stores transformer weights as 8-bit floats, with per-channel scaling tables stored alongside them. On hardware with native FP8 matmul (RTX 40-series Ada, H100, RTX 50-series Blackwell), this gives near-BF16 quality at roughly half the VRAM, plus a speedup from the hardware-accelerated matmul. Older GPUs have no native FP8 path, so the math either falls back to slow emulation or fails outright; use the MXFP8 variant there instead.
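To make the "scaled" part concrete, here is a minimal pure-Python sketch of per-channel scaled quantization. It is an illustration only: it assumes the E4M3 FP8 layout (largest finite value 448, 3 stored mantissa bits) and one scale per weight row, neither of which is stated for this file, and every function name is hypothetical. Real loaders do this on the GPU with tensor libraries.

```python
import math

FP8_E4M3_MAX = 448.0  # largest finite E4M3 value (assumed format)


def round_to_e4m3(x: float) -> float:
    """Simulate rounding x to the nearest FP8 E4M3 value."""
    if x == 0.0:
        return 0.0
    x = max(-FP8_E4M3_MAX, min(FP8_E4M3_MAX, x))  # saturate out-of-range values
    _, exp = math.frexp(x)    # x = m * 2**exp with 0.5 <= |m| < 1
    exp = max(exp, -5)        # below 2**-6 the format is subnormal (fixed step)
    step = 2.0 ** (exp - 4)   # 4 significant bits: 1 implicit + 3 stored
    return round(x / step) * step


def quantize_per_channel(weights):
    """Quantize each row with its own scale so the row maximum maps to 448."""
    quantized, scales = [], []
    for row in weights:
        amax = max(abs(v) for v in row)
        scale = amax / FP8_E4M3_MAX if amax > 0 else 1.0
        scales.append(scale)
        quantized.append([round_to_e4m3(v / scale) for v in row])
    return quantized, scales


def dequantize(quantized, scales):
    """Recover approximate weights: FP8 value times the row's scale."""
    return [[v * s for v in row] for row, s in zip(quantized, scales)]


if __name__ == "__main__":
    w = [[0.01, -0.02, 0.005], [3.0, -1.5, 0.75]]
    q, s = quantize_per_channel(w)
    print(dequantize(q, s))
```

The scale table is what lets rows with very different magnitudes share the same narrow 8-bit range; without it, the small-valued first row in the example would lose most of its precision.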
'transformer_only' means this file contains only the DiT/transformer weights — not the VAE, not the text encoder. You will not be able to run a workflow with just this file; pair it with taeltx2_3.safetensors (VAE) and a Gemma 3 12B text encoder (FP4 mixed on 16 GB, FP8 scaled on 24 GB, BF16 on 32 GB+).
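The text-encoder pairing above can be written down as a small lookup. This is a sketch of the stated VRAM tiers only; the function name is hypothetical and the returned strings are precision labels, not filenames.

```python
def pick_text_encoder_precision(vram_gb: float) -> str:
    """Map available VRAM to the Gemma 3 12B precision suggested above."""
    if vram_gb >= 32:
        return "BF16"
    if vram_gb >= 24:
        return "FP8 scaled"
    if vram_gb >= 16:
        return "FP4 mixed"
    # The page lists 16 GB as the minimum for this transformer file.
    raise ValueError(f"{vram_gb} GB is below the 16 GB minimum")


if __name__ == "__main__":
    for gb in (16, 24, 32):
        print(gb, "GB ->", pick_text_encoder_precision(gb))
```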
Distilled v1.1 runs at 8 steps with CFG=1, which is roughly 4× faster than running the dev model with the standard 30-step sampler (30 / 8 ≈ 3.75× fewer steps). The v1.1 weights handle fast camera motion and character consistency noticeably better than the v1.0 distilled weights.
When to choose ltx-2.3-22b-distilled-1.1_transformer_only_fp8_scaled.safetensors
Default choice for 16 GB cards (RTX 4070 Ti SUPER, RTX 4080) and the fastest path on 24 GB (RTX 4090). You get the latest distilled quality at FP8 size, with hardware-accelerated matmul.
Pick the MXFP8 block-32 variant of the same file instead if you are on RTX 30-series — it stores the same weights in a format your GPU's tensor cores can handle without falling back to a slow emulation path.
Pick the BF16 transformer-only variant instead if you have 32 GB+ VRAM and want maximum quality, or if you plan to apply LoRAs that the FP8 path does not accept cleanly.
Will this run on my GPU?
Minimum: 16GB VRAM. Headroom up to: 24GB.
⚠ FP8 scaled matmul requires RTX 40-series or newer (Ada Lovelace architecture). RTX 30xx cannot run this format — use the MXFP8 block-32 or BF16 variant instead.
Recommendation: Best choice for 16GB VRAM. Latest v1.1 FP8 distilled. Requires RTX 40xx+ for fp8 matmuls.
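The hardware rule above can be expressed as a compute-capability check. This is a sketch, not part of any official tooling: Ada Lovelace GPUs report CUDA compute capability 8.9 and consumer Ampere (RTX 30-series) reports 8.6, so "8.9 or higher" is used here as a stand-in for "RTX 40-series or newer". The function names are hypothetical.

```python
FP8_FILE = "ltx-2.3-22b-distilled-1.1_transformer_only_fp8_scaled.safetensors"
MXFP8_FILE = "ltx-2.3-22b-distilled-1.1_transformer_only_mxfp8_block32.safetensors"


def has_native_fp8(major: int, minor: int) -> bool:
    """True for compute capability 8.9 (Ada Lovelace) and anything newer."""
    return (major, minor) >= (8, 9)


def recommended_file(major: int, minor: int) -> str:
    """Pick the FP8 scaled file when FP8 matmul is native, else the MXFP8 one."""
    return FP8_FILE if has_native_fp8(major, minor) else MXFP8_FILE


if __name__ == "__main__":
    print(recommended_file(8, 6))  # RTX 30-series
    print(recommended_file(8, 9))  # RTX 40-series
```

If PyTorch is installed, `torch.cuda.get_device_capability(0)` returns the `(major, minor)` pair for the first GPU, which can be passed straight into `recommended_file`.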
How to use ltx-2.3-22b-distilled-1.1_transformer_only_fp8_scaled.safetensors
- Download the file from HuggingFace.
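- Place it in ComfyUI/models/checkpoints/. If your workflow loads the file through a diffusion-model loader node rather than a checkpoint loader, put it in ComfyUI/models/diffusion_models/ instead.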
- Restart ComfyUI (or refresh the model list from the menu).
- Load a compatible workflow — see below.
Compatible official workflows:
- LTX-2.3_T2V_I2V_Single_Stage_Distilled_Full.json: T2V / I2V Single Stage Distilled
- LTX-2.3_T2V_I2V_Two_Stage_Distilled.json: T2V / I2V Two Stage Distilled
- LTX-2.3_ICLoRA_Union_Control_Distilled.json: ICLoRA Union Control Distilled
- LTX-2.3_ICLoRA_Motion_Track_Distilled.json: ICLoRA Motion Track Distilled
- LTX-2.3_ICLoRA_HDR_Distilled.json: ICLoRA HDR Distilled
ComfyUI says it can't find ltx-2.3-22b-distilled-1.1_transformer_only_fp8_scaled.safetensors?
Some published workflow JSONs reference this file under a custom subdirectory. If ComfyUI shows a "cannot find model" error and your workflow references one of these path-prefixed variants:
- ltxvideo\v2\ltx-2.3-22b-distilled-1.1_transformer_only_fp8_scaled.safetensors
- ltx23\ltx-2.3-22b-distilled-1.1_transformer_only_fp8_scaled.safetensors
- diffusion_models/ltx-2.3-22b-distilled-1.1_transformer_only_fp8_scaled.safetensors
The prefix before the slash or backslash is a subdirectory the workflow author used. The actual file is the same ltx-2.3-22b-distilled-1.1_transformer_only_fp8_scaled.safetensors — you have two fixes:
- Create the matching subdirectory inside ComfyUI/models/checkpoints/ and place the file there. Example: if the workflow references ltxvideo\v2\ltx-2.3-22b-distilled-1.1_transformer_only_fp8_scaled.safetensors, create ComfyUI/models/checkpoints/ltxvideo/v2/ and put the file inside it.
- Or open the workflow JSON in a text editor and replace the prefixed string with just ltx-2.3-22b-distilled-1.1_transformer_only_fp8_scaled.safetensors. ComfyUI then resolves it directly from ComfyUI/models/checkpoints/.
On Windows the separator is \, on macOS/Linux it is / — they refer to the same nested folder regardless of platform.
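The second fix (removing the prefix from the workflow JSON) can also be scripted. The sketch below makes no assumption about the workflow schema: it walks the whole JSON tree and rewrites any string whose last path component is the model filename. The function names are hypothetical, and the script writes a `.bak` copy before overwriting the workflow file.

```python
import json
import re
from pathlib import Path

MODEL = "ltx-2.3-22b-distilled-1.1_transformer_only_fp8_scaled.safetensors"


def strip_prefix(value, filename=MODEL):
    """Recursively replace 'some\\sub/dir/<filename>' strings with '<filename>'."""
    if isinstance(value, str):
        last = re.split(r"[\\/]", value)[-1]  # split on either separator
        return filename if last == filename else value
    if isinstance(value, list):
        return [strip_prefix(v, filename) for v in value]
    if isinstance(value, dict):
        return {k: strip_prefix(v, filename) for k, v in value.items()}
    return value


def fix_workflow(path, filename=MODEL):
    """Rewrite a workflow JSON in place, keeping a .bak copy of the original."""
    p = Path(path)
    original = p.read_text(encoding="utf-8")
    p.with_name(p.name + ".bak").write_text(original, encoding="utf-8")
    fixed = strip_prefix(json.loads(original), filename)
    p.write_text(json.dumps(fixed, indent=2), encoding="utf-8")


if __name__ == "__main__":
    sample = {"widgets_values": ["ltxvideo\\v2\\" + MODEL, 8]}
    print(strip_prefix(sample))
```

Strings that merely contain a slash but point at a different file are left untouched, so other model references in the same workflow keep their own subdirectories.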
Common issues
RuntimeError: fp8 matmul not supported / 'unsupported dtype' on RTX 3090 / RTX 3060
RTX 30-series GPUs lack native FP8 tensor cores, so the runtime cannot dispatch the FP8 path. Fix: Switch to ltx-2.3-22b-distilled-1.1_transformer_only_mxfp8_block32.safetensors. It holds the same weights in MXFP8, a format that runs on Ampere/Ada tensor cores, and has the same VRAM footprint.
OOM at first inference on 16 GB
The full pipeline (transformer + Gemma text encoder + VAE + activations) exceeds 16 GB if you use the BF16 Gemma, and activations can spike to several GB at higher resolutions. Fix: Use gemma_3_12B_it_fp4_mixed.safetensors as the text encoder. Enable model offload in ComfyUI's manager if you still OOM at 768p. Drop resolution to 576p as a last resort.
First generation is very slow, later ones are fast
This is torch.compile warming up on the first call: the model is being compiled for your specific GPU and input shape. Fix: None needed; this is expected. Keep ComfyUI running and reuse the loaded model, and subsequent generations skip compilation. If you change the resolution or aspect ratio, expect one more slow warmup.
Workflow references '_transformer_only_fp8_scaled.safetensors' but file is named slightly differently
Older workflow JSONs reference the v1.0 filename (without '-1.1' in the name) and need to be updated to use v1.1. Fix: Open the workflow JSON in a text editor and replace the v1.0 filename string with 'ltx-2.3-22b-distilled-1.1_transformer_only_fp8_scaled.safetensors'. Alternatively, edit the Load node in ComfyUI and reselect the v1.1 file from the dropdown.
ComfyUI doesn't see the file after I downloaded it
Make sure the file is in ComfyUI/models/checkpoints/ (not a subfolder). Restart ComfyUI fully, because the menu refresh sometimes misses new files. The filename must match exactly: ltx-2.3-22b-distilled-1.1_transformer_only_fp8_scaled.safetensors.
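The placement check above can be automated. The sketch below reports whether the file sits directly in models/checkpoints/ and lists any copies found elsewhere under models/ (including ones whose name differs only in letter case). The function name and the `comfy_root` argument are hypothetical; point it at your own ComfyUI folder.

```python
from pathlib import Path

MODEL = "ltx-2.3-22b-distilled-1.1_transformer_only_fp8_scaled.safetensors"


def locate_model(comfy_root, filename=MODEL):
    """Return (in_expected_place, other_copies) for the given ComfyUI folder."""
    models_dir = Path(comfy_root) / "models"
    if not models_dir.is_dir():
        return False, []
    expected = models_dir / "checkpoints" / filename
    # Compare names case-insensitively to catch near-miss filenames too.
    others = sorted(
        p for p in models_dir.rglob("*.safetensors")
        if p.name.lower() == filename.lower() and p != expected
    )
    return expected.is_file(), others


if __name__ == "__main__":
    ok, others = locate_model("ComfyUI")
    print("in models/checkpoints/:", ok)
    for p in others:
        print("also found at:", p)
```

If the first value is False but the list is non-empty, the file was downloaded to the wrong folder or under a slightly different name, which matches the two causes described above.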
I get a CUDA error mentioning fp8 / scaled / matmul
FP8 scaled matmuls require an RTX 40-series GPU or newer (Ada Lovelace architecture). RTX 30-series and older cannot run FP8 weights at native precision. Use the BF16 variant instead, or the MXFP8 block-32 alternative.
CUDA out of memory error when loading the model
ltx-2.3-22b-distilled-1.1_transformer_only_fp8_scaled.safetensors needs ~16GB VRAM minimum. If you're hitting OOM:
- Enable Sequential Offloading in ComfyUI settings.
- Lower the resolution (768×512 instead of 1280×704); both dimensions must be divisible by 32.
- Reduce the frame count (65 frames instead of 161); it must be of the form 8n+1.
- Use a smaller variant; see Related models below.
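The two size constraints above (dimensions divisible by 32, frame count of the form 8n+1) are easy to get wrong when lowering settings by hand. Here is a small sketch that rounds a requested value down to the nearest valid one; the helper names are hypothetical, and rounding down rather than to nearest is a choice made here because the goal is to reduce memory use.

```python
def snap_dimension(pixels: int, multiple: int = 32) -> int:
    """Round a width or height down to a multiple of 32 (at least 32)."""
    return max(multiple, (pixels // multiple) * multiple)


def snap_frame_count(frames: int) -> int:
    """Round a frame count down to the nearest value of the form 8n+1."""
    return max(1, ((frames - 1) // 8) * 8 + 1)


def is_valid_request(width: int, height: int, frames: int) -> bool:
    """Check all three constraints at once."""
    return width % 32 == 0 and height % 32 == 0 and frames % 8 == 1


if __name__ == "__main__":
    print(snap_dimension(720), snap_dimension(1280))  # 704 1280
    print(snap_frame_count(100), snap_frame_count(65))  # 97 65
    print(is_valid_request(768, 512, 65))  # True
```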
Related models
ltx-2.3-22b-distilled-1.1_transformer_only_mxfp8_block32.safetensors
ltx-2.3-22b-distilled_transformer_only_fp8_input_scaled_v3.safetensors
ltx-2.3-22b-dev_transformer_only_fp8_input_scaled.safetensors
ltx-2.3-22b-dev_transformer_only_fp8_scaled.safetensors
ltx-2.3-22b-dev_transformer_only_mxfp8_block32.safetensors
ltx-2.3-22b-distilled_transformer_only_fp8_input_scaled.safetensors