
HunyuanVideo Text-to-Video Generation in ComfyUI

Comprehensive guide to using Tencent's HunyuanVideo model in ComfyUI for high-quality text-to-video generation with detailed workflow examples.

By ltx workflow

Editor's Note: This showcase demonstrates HunyuanVideo's capabilities in ComfyUI, providing a complete workflow for text-to-video generation.

HunyuanVideo Text-to-Video Workflow Guide and Examples

This tutorial walks through using Tencent's HunyuanVideo model in ComfyUI for text-to-video generation: updating ComfyUI, downloading the model files, loading the workflow, and understanding its nodes.

1. Install or Update ComfyUI to the Latest Version

Install ComfyUI, or update an existing installation, to the latest version. Older versions do not include the 'EmptyHunyuanLatentVideo' node that this workflow depends on.
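
One way to confirm that the update took effect is to ask a running ComfyUI server whether the node is registered. The following is a minimal sketch, assuming ComfyUI is running locally on 127.0.0.1:8188 and serves its node registry as JSON at the `/object_info` route. The helper names `has_node` and `fetch_object_info` are hypothetical and not part of ComfyUI.

```python
import json
import urllib.request

REQUIRED_NODE = "EmptyHunyuanLatentVideo"


def has_node(object_info: dict, node_name: str = REQUIRED_NODE) -> bool:
    """Return True if node_name is a key of the node registry mapping."""
    return node_name in object_info


def fetch_object_info(base_url: str = "http://127.0.0.1:8188") -> dict:
    """Fetch the node registry from a running server.

    The address and the /object_info route are assumptions about the
    local setup; adjust base_url if ComfyUI listens elsewhere.
    """
    with urllib.request.urlopen(f"{base_url}/object_info", timeout=10) as resp:
        return json.load(resp)
```

With the server running, `has_node(fetch_object_info())` should return True after a successful update. If it returns False, the installation is probably still on an older version.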

2. Model Download and Installation

HunyuanVideo requires the following model files:

2.1 Main Model File

Download from HunyuanVideo Main Model Download Page

2.2 Text Encoder Files

Download from HunyuanVideo Text Encoder Download Page

2.3 VAE Model File

Download from HunyuanVideo VAE Download Page

Model Directory Structure Reference

ComfyUI/
├── models/
│   ├── diffusion_models/
│   │   └── hunyuan_video_t2v_720p_bf16.safetensors  # Main model file
│   ├── text_encoders/
│   │   ├── clip_l.safetensors                       # CLIP text encoder
│   │   └── llava_llama3_fp8_scaled.safetensors      # LLaVA text encoder
│   └── vae/
│       └── hunyuan_video_vae_bf16.safetensors       # VAE model file
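
Misplaced files are a common reason a loader node shows an empty dropdown, so it can help to check the layout before opening the workflow. The sketch below uses the relative paths from the listing above. The function name `missing_model_files` is hypothetical, and the ComfyUI root folder is whatever path your installation uses.

```python
from pathlib import Path

# Relative paths copied from the directory structure reference.
EXPECTED_FILES = [
    "models/diffusion_models/hunyuan_video_t2v_720p_bf16.safetensors",
    "models/text_encoders/clip_l.safetensors",
    "models/text_encoders/llava_llama3_fp8_scaled.safetensors",
    "models/vae/hunyuan_video_vae_bf16.safetensors",
]


def missing_model_files(comfy_root: str) -> list[str]:
    """Return the expected relative paths that are not files under comfy_root."""
    root = Path(comfy_root)
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]
```

Calling `missing_model_files("ComfyUI")` from the parent folder returns an empty list when all four files are in place, and otherwise lists the paths that still need to be downloaded or moved.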

3. Workflow File Download

Workflow file source: HunyuanVideo Workflow Download

Basic Video Generation Workflow

HunyuanVideo supports multiple resolution settings optimized for different use cases.

4. Workflow Node Explanation

4.1 Model Loading Nodes

UNETLoader

Purpose: Load the main model file
Parameters:
- Model: hunyuan_video_t2v_720p_bf16.safetensors
- Weight Type: default (choose one of the fp8 types if GPU memory is insufficient)

DualCLIPLoader

Purpose: Load text encoder models
Parameters:
- CLIP 1: clip_l.safetensors
- CLIP 2: llava_llama3_fp8_scaled.safetensors
- Type: hunyuan_video

VAELoader

Purpose: Load the VAE model used to decode latents into video frames
Parameters:
- VAE: hunyuan_video_vae_bf16.safetensors
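
For readers who drive ComfyUI through exported API-format workflows rather than the graph editor, the three loaders above might look roughly like the following. This is a sketch, not the downloaded workflow file. The node ids are arbitrary, and the input key names (`unet_name`, `weight_dtype`, `clip_name1`, `clip_name2`, `type`, `vae_name`) and the `fp8_e4m3fn` value are assumptions to verify against your own export. The helper `use_fp8_weights` is hypothetical.

```python
import json

# Loader nodes in API (JSON) workflow form; file names come from the
# parameters listed above, key names are assumptions.
loader_nodes = {
    "1": {
        "class_type": "UNETLoader",
        "inputs": {
            "unet_name": "hunyuan_video_t2v_720p_bf16.safetensors",
            "weight_dtype": "default",
        },
    },
    "2": {
        "class_type": "DualCLIPLoader",
        "inputs": {
            "clip_name1": "clip_l.safetensors",
            "clip_name2": "llava_llama3_fp8_scaled.safetensors",
            "type": "hunyuan_video",
        },
    },
    "3": {
        "class_type": "VAELoader",
        "inputs": {"vae_name": "hunyuan_video_vae_bf16.safetensors"},
    },
}


def use_fp8_weights(nodes: dict, dtype: str = "fp8_e4m3fn") -> dict:
    """Return a deep copy with every UNETLoader switched to an fp8 weight type."""
    patched = json.loads(json.dumps(nodes))  # cheap deep copy of plain JSON data
    for node in patched.values():
        if node["class_type"] == "UNETLoader":
            node["inputs"]["weight_dtype"] = dtype
    return patched
```

Keeping the fp8 switch in a separate function leaves the default configuration untouched, so the same base definition can serve both high-memory and low-memory machines.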

4.2 Generation Parameters

HunyuanVideo provides fine-grained control over:

- Resolution settings (720p optimized)
- Frame count and duration
- Sampling steps and CFG scale
- Prompt guidance strength
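
To get a feel for how resolution and frame count interact, the sketch below estimates the clip duration and the latent tensor shape for a given setting. Every constant in it is an assumption rather than something stated in this guide: 8x spatial compression, 4x temporal compression, 16 latent channels, and 24 frames per second playback. Both function names are hypothetical.

```python
def latent_video_shape(width, height, length, batch_size=1,
                       channels=16, spatial=8, temporal=4):
    """Latent shape (batch, channels, frames, height, width).

    The compression factors and channel count are assumed defaults;
    check them against the model you are running.
    """
    latent_frames = (length - 1) // temporal + 1
    return (batch_size, channels, latent_frames,
            height // spatial, width // spatial)


def duration_seconds(length, fps=24):
    """Clip duration for `length` frames at an assumed playback rate."""
    return length / fps
```

Under these assumptions, frame counts of the form 4k + 1 (for example 73 or 129) map cleanly onto latent frames, and raising either the resolution or the frame count grows the latent, and with it the memory needed, proportionally.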

Key Features

High-Quality Output: HunyuanVideo produces broadcast-quality video with smooth motion and consistent styling.

Flexible Resolution: Supports various aspect ratios and resolutions optimized for different platforms.

Advanced Text Understanding: Leverages dual text encoders (CLIP + LLaVA) for superior prompt comprehension.

Efficient Processing: Optimized architecture enables faster generation compared to earlier models.
