LTX 2.3 LoRA Training Deep Dive: Style, Motion & IC-LoRA Control
A practical deep dive into training custom LoRAs on LTX 2.3 using the official ltx-trainer. Covers dataset rules, rank settings, the three LoRA types (style, motion, IC-LoRA), and what breaks when upgrading from LTX 2.0.
Text prompts can only take you so far. When you need a specific motion pattern, a consistent visual style, or structural control over camera geometry, LoRA training is the answer. LTX 2.3 ships with an official trainer — and once you understand the constraints, it's quietly practical.
This guide covers what actually matters: dataset rules, the three LoRA types, baseline settings, and what breaks when upgrading from LTX 2.0.
The Official Trainer
The LTX-Video GitHub repository is organized as a monorepo with three packages:
- ltx-core — model implementation
- ltx-pipelines — generation workflows
- ltx-trainer — LoRA and IC-LoRA fine-tuning
Two hard constraints apply to everything you train:
- Resolution: width and height must be divisible by 32
- Frame count: must follow the 8n+1 rule — 1, 9, 17, 25, 33... frames
These aren't soft guidelines. The trainer will error or silently pad your data if you ignore them. Resize and trim your dataset before you start.
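Both constraints are easy to verify programmatically before you start a run. A minimal sketch in plain Python (these helpers are illustrative, not part of ltx-trainer):

```python
def is_valid_resolution(width: int, height: int) -> bool:
    """Both dimensions must be divisible by 32."""
    return width % 32 == 0 and height % 32 == 0

def is_valid_frame_count(n: int) -> bool:
    """Frame count must satisfy the 8n+1 rule: 1, 9, 17, 25, ..."""
    return n >= 1 and (n - 1) % 8 == 0

# 768x512 at 33 frames passes; 1920x1080 fails (1080 % 32 != 0),
# and a 16-frame clip fails the 8n+1 rule.
```

Running this over every file in your dataset before training is cheaper than discovering padding artifacts after a few hundred steps.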
Three LoRA Types
Style LoRAs — appearance, texture, color
Style LoRAs teach LTX 2.3 a visual aesthetic: color grading, lighting mood, texture treatment. Image-only datasets work fine here — still frames are simpler to curate than video clips, and the model learns identity without needing to process motion.
Dataset size: 20–50 images is enough for most styles. For highly specific subjects (a particular product, a person's face), push to 80–120.
When to use: brand consistency, product photography style, artistic look development.
Motion LoRAs — movement, transformation
Motion LoRAs focus on how things move rather than how they look: camera dolly-ins, object rotation, transformation sequences. These require short coherent video clips, not stills.
Dataset: 15–30 second clips at consistent framing. Multi-scene segments hurt more than they help — the model needs to isolate the motion pattern, not learn scene variety.
Caveat: motion LoRA training is less settled than style. Expect more retries and more variable results. Start with style LoRAs to validate your pipeline before attempting motion.
IC-LoRAs — structural control
IC-LoRA is fundamentally different. Instead of teaching a new aesthetic or motion, it conditions generation on reference signals — depth maps, pose skeletons, canny edge detections. This enables video-to-video control on top of the text-to-video base.
Three control modes:
- Canny — edge preservation, good for maintaining object outlines
- Depth — camera geometry and spatial layout control
- Pose — human motion transfer from reference video
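Each mode conditions on a per-frame control signal extracted from a reference video. As a rough illustration of what a canny-style signal looks like, here is a crude gradient-magnitude edge map in pure NumPy (a stand-in for a real canny pass such as OpenCV's `cv2.Canny`; the threshold is an arbitrary choice for the sketch):

```python
import numpy as np

def edge_map(gray: np.ndarray, thresh: float = 0.2) -> np.ndarray:
    """Crude edge map from a grayscale frame: gradient magnitude,
    thresholded relative to its maximum. Illustrative only."""
    gy, gx = np.gradient(gray.astype(np.float32))
    mag = np.hypot(gx, gy)
    return (mag > thresh * mag.max()).astype(np.uint8) * 255
```

In practice you would run a proper edge detector over every frame of the reference clip and feed the resulting video in as the control input.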
IC-LoRA is the most powerful of the three types, but also the most demanding to set up correctly. The official IC-LoRA workflows are documented in the ComfyUI-LTXVideo repository.
Dataset Preparation
Frame count (8n+1 rule)
Batch-process your clips to a valid frame count before training. 17 frames (2×8+1) is a practical default for short clips. The trainer won't always tell you clearly when it's padding — verify your dataset dimensions upfront.
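When batch-trimming, the natural target is the largest valid count that fits inside each clip. A small helper (hypothetical, not part of the trainer):

```python
def snap_frame_count(n: int) -> int:
    """Largest 8k+1 value <= n — the trim target for a clip of n frames."""
    if n < 1:
        raise ValueError("need at least one frame")
    return ((n - 1) // 8) * 8 + 1

# A 24 fps, one-second clip (24 frames) trims to 17 frames;
# a 30-frame clip trims to 25; 17 is already valid.
```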
Resolution (32px rule)
Common safe resolutions:
- 512×512, 512×768, 768×512
- 768×432, 1024×576
Avoid 1080p or other non-divisible sizes. The trainer may pad silently, which wastes compute and introduces edge artifacts.
Video vs image datasets
| Use case | Dataset type |
|---|---|
| Style / identity LoRA | Images only (simpler, faster) |
| Motion / effect LoRA | Short video clips (coherent framing) |
| IC-LoRA | Paired input+control signal videos |
Start with images. Add video clips only when motion is the actual goal.
Baseline Training Settings
Rank
Rank 32 is the correct default for LTX 2.3. It gives enough capacity without making the LoRA too rigid. Rank 64 rarely helps unless your dataset is large (100+ diverse samples) — the extra capacity goes unused with small datasets.
Learning rate
Start at 1e-4. Drop to 5e-5 if you see oversaturation or style collapse in early checkpoints. Increase to 2e-4 only if the model isn't picking up the target style after 500+ steps.
Steps
| Dataset size | Recommended steps |
|---|---|
| 20–50 images | 500–1000 |
| 50–120 images | 1000–2000 |
| Video clips (motion) | 1500–3000 |
Save checkpoints every 200–300 steps and test inference at each — training longer isn't always better.
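The checkpoint cadence above can be generated mechanically, which makes it easy to script the save-and-test loop (function name and interval are illustrative):

```python
def checkpoint_steps(total_steps: int, every: int = 250) -> list[int]:
    """Steps at which to save a checkpoint and run test inference,
    always including the final step."""
    steps = list(range(every, total_steps + 1, every))
    if not steps or steps[-1] != total_steps:
        steps.append(total_steps)
    return steps

# 1000 training steps at a 250-step cadence: [250, 500, 750, 1000]
```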
Upgrading from LTX 2.0: What Breaks
If you have existing LTX 2.0 LoRAs, they will not work with LTX 2.3. The VAE architecture changed, the latent space changed, and the model size jumped from ~10–14B to 22B parameters. There is no migration path — you need to retrain.
Other things that change when upgrading:
- Seeds are not reproducible across versions. Same seed diverges after ~10–12 steps.
- Prompts resolve more literally in 2.3 — positional hints ("left/right", "foreground") are followed more strictly. Looser, artistic prompts that relied on LTX 2.0's flexibility may need reworking.
- Default contrast/saturation is higher — dial guidance down ~0.5–1.0 if your neutral presets look punchy.
- Sampler step sweet spot shifted — from ~28–32 steps (LTX 2.0) to ~22–26 steps (LTX 2.3) for equivalent detail.
Applying LoRAs in ComfyUI
Once trained, apply your LoRA via the standard LoRA Loader node in ComfyUI. Recommended strength values:
- Style LoRA: 0.6–0.8 (higher can oversaturate)
- Motion LoRA: 0.5–0.7 (motion bleeds into unrelated content above 0.8)
- IC-LoRA: 0.7–1.0 (structural control benefits from higher strength)
You can stack multiple LoRAs, but keep total combined strength below ~1.5 to avoid artifacts.
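The stacking rule is simple enough to check before queuing a generation. A tiny helper (hypothetical; the ~1.5 cap is the guideline from this article, not a hard limit enforced by ComfyUI):

```python
def stack_ok(strengths: list[float], cap: float = 1.5) -> bool:
    """True if the combined LoRA strength stays under the artifact cap."""
    return sum(strengths) < cap

# A style LoRA at 0.7 plus a motion LoRA at 0.6 totals 1.3 — fine.
# Two LoRAs at 0.8 each (1.6) would exceed the cap.
```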
Quick Reference
| Setting | Value |
|---|---|
| Resolution | Divisible by 32 |
| Frame count | 8n+1 (17, 25, 33...) |
| Default rank | 32 |
| Learning rate | 1e-4 |
| Style dataset | 20–50 images |
| Motion dataset | Short coherent video clips |
| LTX 2.0 LoRA compatibility | None — retrain required |
Download the LTX 2.3 checkpoints needed for training on our Models page, or load the ICLoRA workflow directly from the Workflows page.