LTX 2.3 22B GGUF(16 GB VRAM): One-Click ComfyUI Installer

Name: LTX 2.3 22B GGUF(16 GB VRAM): One-Click ComfyUI Installer
Brand: The Local Lab
SKU: 63

The latest LTX 2.3 22B model is here — bringing major improvements over LTX-2, especially in video stability, motion smoothness, and frame coherence. This release refines text-to-video and image-to-video generation with enhanced precision and significantly better temporal consistency.

One of the biggest upgrades comes from a three-sampler ComfyUI workflow I found on Reddit, which replaces the usual single sampler. Two of the samplers perform a 6x upscale across phases, improving detail and output quality while keeping generation times reasonable. I adjusted the workflow so it runs the GGUF models instead of the full diffusion models, which lets it run on devices with less than 24 GB of VRAM.

To make things even simpler, I've built a one-click Windows installer that automatically downloads, installs, and configures everything locally. This means no complicated setup — just run the installer, and you’re ready to generate cinematic AI videos directly from your PC.

Preloaded Models Within the Installer (Low VRAM)

ltx-2.3_text_projection_bf16.safetensors (ComfyUI\models\clip)
gemma-3-12b-it-UD-Q4_K_XL.gguf (ComfyUI\models\clip)
LTX23_audio_vae_bf16.safetensors (ComfyUI\models\vae)
LTX23_video_vae_bf16.safetensors (ComfyUI\models\vae)
ltx-2.3-22b-dev-Q3_K_S.gguf (ComfyUI\models\unet)
ltx-2.3-spatial-upscaler-x2-1.0.safetensors Upscale model (ComfyUI\models\latent_upscale_models)
ltx-2.3-22b-distilled-lora-dynamic_fro09_avg_rank_105_bf16.safetensors LoRA (ComfyUI\models\loras)
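
If you want to confirm that every file in the list above landed in the right folder, a small script along these lines could help. This is only a sketch, not part of the installer: the file names and folders are copied from the list, while the function name find_missing_models and the location of your ComfyUI folder are assumptions you would adapt to your own install.

```python
from pathlib import Path

# File names and folders copied from the preloaded-model list above.
# Paths are relative to the ComfyUI folder, whose location is an assumption
# and must be supplied by the caller.
EXPECTED_MODELS = {
    "models/clip": [
        "ltx-2.3_text_projection_bf16.safetensors",
        "gemma-3-12b-it-UD-Q4_K_XL.gguf",
    ],
    "models/vae": [
        "LTX23_audio_vae_bf16.safetensors",
        "LTX23_video_vae_bf16.safetensors",
    ],
    "models/unet": [
        "ltx-2.3-22b-dev-Q3_K_S.gguf",
    ],
    "models/latent_upscale_models": [
        "ltx-2.3-spatial-upscaler-x2-1.0.safetensors",
    ],
    "models/loras": [
        "ltx-2.3-22b-distilled-lora-dynamic_fro09_avg_rank_105_bf16.safetensors",
    ],
}


def find_missing_models(comfy_root):
    """Return the expected model paths that do not exist under comfy_root."""
    root = Path(comfy_root)
    missing = []
    for folder, names in EXPECTED_MODELS.items():
        for name in names:
            path = root / folder / name
            if not path.is_file():
                missing.append(path)
    return missing
```

Calling find_missing_models with the path to your ComfyUI folder returns an empty list when all seven files are present, or the paths of the files that still need to be downloaded.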

The standard LTX 2.3 22B checkpoint models (BF16, FP8) aren’t bundled within the installer, but you can easily grab them directly from Kijai’s official Hugging Face repository:
LTX 2.3 22B Diffusion Models - https://huggingface.co/Kijai/LTX2.3_comfy/tree/main/diffusion_models

Speed

You can generate 10-second 1280×896 text-to-video clips in around 10–15 minutes using an RTX A6000 (16GB VRAM). For image-to-video, expect similar results in roughly 15 minutes.

The workflow intelligently adapts to your system’s specs, allowing faster frame synthesis and 24fps+ output on high-end GPUs. The first pass begins at lower resolution and automatically upscales 6x in the final two sampling phases for ultra-detailed final renders.

System Requirements

NVIDIA RTX 30XX / 40XX / 50XX GPU (FP16 supported)
CUDA-compatible GPU (minimum 16GB VRAM, 24GB+ recommended)
Windows OS
Minimum 50GB free storage
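
Before running the installer, you can roughly check the storage and VRAM figures above with a short script. This is a sketch under assumptions: the 50 GB and 16 GB thresholds come from the list, but the function names are hypothetical, "GB" is treated as 1024^3 bytes, and the VRAM figure is something you supply yourself (for example from your GPU's spec sheet), since the script does not query the GPU.

```python
import shutil

# Thresholds taken from the system requirements list above.
MIN_FREE_GB = 50
MIN_VRAM_GB = 16


def free_disk_gb(path="."):
    """Free space on the drive that holds `path`, in GB (1024**3 bytes)."""
    return shutil.disk_usage(path).free / 1024**3


def check_requirements(free_gb, vram_gb):
    """Return a list of human-readable problems; empty means both checks pass."""
    problems = []
    if free_gb < MIN_FREE_GB:
        problems.append(
            f"Only {free_gb:.1f} GB free; at least {MIN_FREE_GB} GB is needed."
        )
    if vram_gb < MIN_VRAM_GB:
        problems.append(
            f"Only {vram_gb} GB VRAM; at least {MIN_VRAM_GB} GB is needed."
        )
    return problems
```

For example, check_requirements(free_disk_gb("."), 16) checks the drive you run it from against a 16 GB card and returns an empty list if both minimums are met.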

What’s Included

Portable ComfyUI Windows Installer (pre-configured for LTX 2.3 text-to-video and image-to-video generation)
Automated model downloads and ComfyUI nodes setup
Optimized long-sequence video workflows for flexible generation
Designed for both beginners and advanced AI video creators

Usage Notes

Works seamlessly with LTX-2.3 22B GGUF (optimized for low VRAM).
Use rich, detailed text prompts to achieve more coherent motion and storytelling.
For image-to-video, start with high-resolution input images for best fidelity.
To tweak or expand the workflow, open the subgraph prompt node and use the upper-right arrow to adjust settings.

$4.00

The Local Lab Shop