top of page
LTX 2.3 22B GGUF(16 GB VRAM): One-Click ComfyUI Installer

LTX 2.3 22B GGUF(16 GB VRAM): One-Click ComfyUI Installer

The latest LTX 2.3 22B model is here — bringing major improvements over LTX-2, especially in video stability, motion smoothness, and frame coherence. This release refines text-to-video and image-to-video generation with enhanced precision and significantly better temporal consistency.

 

One of the biggest upgrades comes from a new three-sampler ComfyUI workflow I discovered on reddit — instead of just one. Two of the samplers perform an advanced 6x upscale across phases, dramatically improving detail and output quality while keeping generation times efficient. I made a few adjustments to the workflow so its runs the GGUF models instead of the diffusion models so it can actually be run on devices with less then 24 GB VRAM.

 

To make things even simpler, I've built a one-click Windows installer that automatically downloads, installs, and configures everything locally. This means no complicated setup — just run the installer, and you’re ready to generate cinematic AI videos directly from your PC.

 

Preloaded Models Within the Installer (Low VRAM)

  • ltx-2.3_text_projection_bf16.safetensors (ComfyUI\models\clip)
    Hugging Face Link

  • gemma-3-12b-it-UD-Q4_K_XL.gguf (ComfyUI\models\clip)
    Hugging Face Link

  • LTX23_audio_vae_bf16.safetensors (ComfyUI\models\vae)
    Hugging Face Link

  • LTX23_video_vae_bf16.safetensors (ComfyUI\models\vae)
    Hugging Face Link

  • ltx-2.3-22b-dev-Q3_K_S.gguf (ComfyUI\models\unet)
    Hugging Face Link

  • ltx-2.3-spatial-upscaler-x2-1.0.safetensors Upscale model (ComfyUI\models\latent_upscale_models)
    Hugging Face Link

  • ltx-2.3-22b-distilled-lora-dynamic_fro09_avg_rank_105_bf16.safetensors LoRA (ComfyUI\models\loras)
    Hugging Face Link

 

The standard LTX 2.3 22B checkpoint models (BF16, FP8) aren’t bundled within the installer, but you can easily grab them directly from Kijai’s official Hugging Face repository:
LTX-2 19B Diffusion Models - https://huggingface.co/Kijai/LTX2.3_comfy/tree/main/diffusion_models

 

Speed

You can generate 10-second 1280×896 text-to-video clips in around 10–15 minutes using an RTX A6000 (16GB VRAM). For image-to-video, expect similar results in roughly 15 minutes.

 

The workflow intelligently adapts to your system’s specs, allowing faster frame synthesis and 24fps+ output on high-end GPUs. The first pass begins at lower resolution and automatically upscales 6x in the final two sampling phases for ultra-detailed final renders.

 

System Requirements

  • NVIDIA RTX 30XX / 40XX / 50XX GPU (FP16 supported)

  • CUDA-compatible GPU (minimum 16GB VRAM, 24GB+ recommended)

  • Windows OS

  • Minimum 50GB free storage

 

What’s Included

  • Portable ComfyUI Windows Installer (pre-configured for LTX 2.3 text-to-video and image-to-video generation)

  • Automated model downloads and ComfyUI nodes setup

  • Optimized long-sequence video workflows for flexible generation

  • Designed for both beginners and advanced AI video creators

 

Usage Notes

  • Works seamlessly with LTX-2.3 22B GGUF (optimized for low VRAM).

  • Use rich, detailed text prompts to achieve more coherent motion and storytelling.

  • For image-to-video, start with high-resolution input images for best fidelity.

  • To tweak or expand the workflow, open the subgraph prompt node and use the upper-right arrow to adjust settings.

    $4.00Price
    Quantity
      bottom of page