HunyuanVideo Text-to-Video Generation in ComfyUI
Comprehensive guide to using Tencent's HunyuanVideo model in ComfyUI for high-quality text-to-video generation with detailed workflow examples.
By ltx workflow
Editor's Note: This showcase demonstrates HunyuanVideo's capabilities in ComfyUI, providing a complete workflow for text-to-video generation.
HunyuanVideo Text-to-Video Workflow Guide and Examples
This tutorial explains how to use Tencent's HunyuanVideo model in ComfyUI for text-to-video generation, from installing the model files to understanding the workflow nodes.
1. Install or Update ComfyUI to the Latest Version
Install ComfyUI, or update an existing installation to the latest version. Older versions do not include the 'EmptyHunyuanLatentVideo' node that this workflow depends on.
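One way to confirm that the node is available is to ask a running ComfyUI server for its node registry. The sketch below assumes the server listens on the default local address http://127.0.0.1:8188 and exposes the /object_info endpoint; the helper names missing_nodes and fetch_object_info are hypothetical and exist only for this illustration.

```python
import json
from urllib.request import urlopen

# Node class names this workflow depends on (taken from the step above).
REQUIRED_NODES = ("EmptyHunyuanLatentVideo",)


def missing_nodes(object_info, required=REQUIRED_NODES):
    """Return the required node names that are absent from a node registry mapping."""
    return [name for name in required if name not in object_info]


def fetch_object_info(base_url="http://127.0.0.1:8188"):
    """Fetch the node registry from a running ComfyUI server.

    The address is an assumption (a local server on the default port);
    change it to match your own setup.
    """
    with urlopen(f"{base_url}/object_info", timeout=10) as response:
        return json.load(response)
```

With the server running, missing_nodes(fetch_object_info()) should return an empty list. If it returns 'EmptyHunyuanLatentVideo', the installation needs updating.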
2. Model Download and Installation
HunyuanVideo requires the following model files:
2.1 Main Model File
Download from HunyuanVideo Main Model Download Page
2.2 Text Encoder Files
Download from HunyuanVideo Text Encoder Download Page
2.3 VAE Model File
Download from HunyuanVideo VAE Download Page
Model Directory Structure Reference
ComfyUI/
├── models/
│   ├── diffusion_models/
│   │   └── hunyuan_video_t2v_720p_bf16.safetensors   # Main model file
│   ├── text_encoders/
│   │   ├── clip_l.safetensors                        # CLIP text encoder
│   │   └── llava_llama3_fp8_scaled.safetensors       # LLaVA text encoder
│   └── vae/
│       └── hunyuan_video_vae_bf16.safetensors        # VAE model file
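A misplaced file is a common reason a loader node shows an empty dropdown, so it can help to check the layout before opening the workflow. The following is a minimal sketch that looks for the four files listed above; the function name find_missing_models is hypothetical, and the ComfyUI root path is whatever directory you installed into.

```python
from pathlib import Path

# Expected files per models/ subdirectory, copied from the directory reference above.
EXPECTED_FILES = {
    "diffusion_models": ["hunyuan_video_t2v_720p_bf16.safetensors"],
    "text_encoders": [
        "clip_l.safetensors",
        "llava_llama3_fp8_scaled.safetensors",
    ],
    "vae": ["hunyuan_video_vae_bf16.safetensors"],
}


def find_missing_models(comfyui_root):
    """Return the paths of expected model files that do not exist under comfyui_root."""
    models_dir = Path(comfyui_root) / "models"
    missing = []
    for subdir, filenames in EXPECTED_FILES.items():
        for filename in filenames:
            path = models_dir / subdir / filename
            if not path.is_file():
                missing.append(path)
    return missing
```

Calling find_missing_models("ComfyUI") from the parent directory of the installation returns an empty list when every file is in place, and otherwise lists the paths that still need a download.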
3. Workflow File Download
Workflow file source: HunyuanVideo Workflow Download
Basic Video Generation Workflow
HunyuanVideo supports multiple resolution settings optimized for different use cases.
4. Workflow Node Explanation
4.1 Model Loading Nodes
UNETLoader
- Purpose: Load the main model file
- Parameters:
  - Model: hunyuan_video_t2v_720p_bf16.safetensors
  - Weight Type: default (choose one of the fp8 types if you run out of memory)
DualCLIPLoader
- Purpose: Load the two text encoder models
- Parameters:
  - CLIP 1: clip_l.safetensors
  - CLIP 2: llava_llama3_fp8_scaled.safetensors
  - Type: hunyuan_video
VAELoader
- Purpose: Load VAE model for video decoding
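For readers who drive ComfyUI from a script, the three loader nodes above can be written as a fragment of an API-format workflow. This is a sketch under assumptions: the input keys (unet_name, weight_dtype, clip_name1, clip_name2, type, vae_name) and the fp8_e4m3fn option are how these nodes usually appear in an exported API workflow and should be checked against your own export. The node IDs and the build_loader_nodes helper are hypothetical.

```python
import json


def build_loader_nodes(low_memory=False):
    """Build the three loader nodes as an API-format workflow fragment.

    Input key names are assumptions; compare them with a workflow exported
    from your own ComfyUI installation before relying on them.
    """
    # Assumed fp8 option name; "default" keeps the weights in their stored precision.
    weight_dtype = "fp8_e4m3fn" if low_memory else "default"
    return {
        "1": {
            "class_type": "UNETLoader",
            "inputs": {
                "unet_name": "hunyuan_video_t2v_720p_bf16.safetensors",
                "weight_dtype": weight_dtype,
            },
        },
        "2": {
            "class_type": "DualCLIPLoader",
            "inputs": {
                "clip_name1": "clip_l.safetensors",
                "clip_name2": "llava_llama3_fp8_scaled.safetensors",
                "type": "hunyuan_video",
            },
        },
        "3": {
            "class_type": "VAELoader",
            "inputs": {"vae_name": "hunyuan_video_vae_bf16.safetensors"},
        },
    }


# Print the fragment so it can be compared with an exported workflow.
print(json.dumps(build_loader_nodes(), indent=2))
```

The fragment covers only model loading. A complete workflow also needs the prompt encoding, latent, sampling, and decoding nodes from the downloaded workflow file.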
4.2 Generation Parameters
HunyuanVideo provides fine-grained control over:
- Resolution settings (720p optimized)
- Frame count and duration
- Sampling steps and CFG scale
- Prompt guidance strength
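Frame count and duration are linked through the playback frame rate, which the following sketch makes explicit. It rests on two assumptions that this guide does not state: that valid frame counts have the form 4k + 1 (1, 5, 9, ...), and that clips are saved at 24 frames per second. Both should be checked against the values in your own workflow. The function names are hypothetical.

```python
def snap_frame_count(requested):
    """Round a requested frame count up to the nearest value of the form 4*k + 1.

    The 4*k + 1 rule is an assumption about the video latent node, not a
    value taken from this guide.
    """
    if requested < 1:
        raise ValueError("frame count must be at least 1")
    k = -(-(requested - 1) // 4)  # ceiling division of (requested - 1) by 4
    return 4 * k + 1


def clip_duration_seconds(frames, fps=24):
    """Return the clip length in seconds for a frame count at a given frame rate.

    The default of 24 fps is an assumption; use the rate set in your save node.
    """
    return frames / fps


frames = snap_frame_count(72)
print(frames, round(clip_duration_seconds(frames), 2))
```

Under these assumptions, a request for about three seconds of video (72 frames) snaps to 73 frames. Longer clips need proportionally more memory and sampling time, so it is usually worth testing a prompt on a short clip first.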
Key Features
- High-Quality Output: HunyuanVideo produces broadcast-quality video with smooth motion and consistent styling.
- Flexible Resolution: Supports various aspect ratios and resolutions optimized for different platforms.
- Advanced Text Understanding: Leverages dual text encoders (CLIP + LLaVA) for superior prompt comprehension.
- Efficient Processing: Optimized architecture enables faster generation compared to earlier models.