AOpen

Wan 2.2

Alibaba (Tongyi Lab)

Mixture-of-Experts (MoE) architecture with 27B total parameters, of which only 14B are active per step. Trained on 65% more images and 83% more video than Wan 2.1. Reported to outperform leading closed-source models on Wan-Bench 2.0.

AOpen

LTX-2

Lightricks

The predecessor to 2.3. 19B parameters, native 4K with audio, and 30 camera moves. Still excellent and widely deployed.

Pick Wan 2.2 if…

You want cinematic style control, speech-to-video, or consumer GPU deployment (TI2V-5B).

Pick LTX-2 if…

You want production video pipelines, camera-controlled generation, or depth/pose-driven workflows.

Specifications

Spec            | Wan 2.2                                        | LTX-2
Maker           | Alibaba (Tongyi Lab)                           | Lightricks
Source Type     | Open Source                                    | Open Source
License         | Apache 2.0                                     | Apache 2.0 (<$10M rev)
Architecture    | DiT + MoE (2-expert: high-noise + low-noise)   | DiT + 3D Causal VAE
Parameters      | 27B total (14B active per step, 2x14B experts) | 19B (14B video + 5B audio)
Max Resolution  | 720p                                           | 4K
Max Duration    | 10-15s                                         | 20s
FPS             | 24                                             | Up to 50
Native Audio    | No                                             | Yes
ComfyUI Support | Yes                                            | Yes
Fine-tunable    | Yes                                            | Yes
Min VRAM        | 8GB (small) / 24GB (full)                      | 12GB+ (distilled) / 24GB+ (full)
Cost / Second   | Self-host                                      | $0.04
Inputs          | T2V (A14B), I2V (A14B), TI2V (5B), S2V (14B)   | T2V, I2V, V2V, Audio-to-Video, Depth, OpenPose, Camera Control
On Floyo        | Yes                                            | Yes
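
To make the Min VRAM and Cost / Second rows easier to apply, here is a rough Python sketch that checks which variants fit a given GPU and estimates a clip's cost at the listed LTX-2 rate. The function names and the dictionary layout are illustrative assumptions, the VRAM figures are treated as lower bounds only, and real memory use will vary with resolution, clip length, and quantization.

```python
# Minimum VRAM in GB, copied from the Specifications section.
# Variant labels ("small", "distilled", "full") follow that section's wording.
MIN_VRAM_GB = {
    ("Wan 2.2", "small"): 8,
    ("Wan 2.2", "full"): 24,
    ("LTX-2", "distilled"): 12,
    ("LTX-2", "full"): 24,
}


def variants_that_fit(vram_gb: float) -> list[tuple[str, str]]:
    """Return (model, variant) pairs whose listed minimum VRAM is met."""
    return sorted(key for key, need in MIN_VRAM_GB.items() if vram_gb >= need)


def ltx2_clip_cost(seconds: float, rate_per_second: float = 0.04) -> float:
    """Estimate cost in USD for a clip at the listed $0.04 per second rate."""
    if seconds < 0:
        raise ValueError("seconds must be non-negative")
    return round(seconds * rate_per_second, 2)


if __name__ == "__main__":
    print(variants_that_fit(12))
    print(ltx2_clip_cost(20))
```

Under these assumptions, a 12GB card clears the listed minimum for the small Wan 2.2 variant and the distilled LTX-2 variant, but not for either full model.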

Strengths & Trade-offs

Wan 2.2

Strengths

  • First MoE in video diffusion
  • 27B total but only 14B active per step
  • High-noise expert for layout, low-noise expert for detail
  • 65.6% more images and 83.2% more video training data than 2.1
  • Cinematic aesthetic control (lighting, composition, contrast, color tone)

Trade-offs

  • 720p cap
  • MoE needs careful threshold tuning (SNR-based)
  • No native audio in base model (S2V is separate)
  • Newer ecosystem than 2.1
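
The two-expert split and its SNR-based threshold can be pictured as a simple switch over the denoising schedule. The sketch below is illustrative only and is not Wan 2.2's actual code: `pick_expert`, `schedule`, the normalized noise level `t`, and the default boundary of 0.875 are all assumptions chosen for the example.

```python
def pick_expert(t: float, boundary: float = 0.875) -> str:
    """Choose the expert for one denoising step.

    t is a normalized noise level in [0, 1], where 1.0 means pure noise.
    The boundary is a hypothetical stand-in for the SNR-based threshold.
    """
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must be in [0, 1]")
    # Early, noisy steps settle layout; later steps refine detail.
    return "high_noise" if t >= boundary else "low_noise"


def schedule(num_steps: int, boundary: float = 0.875) -> list[tuple[float, str]]:
    """List (t, expert) pairs for evenly spaced steps from t=1.0 downward."""
    if num_steps <= 0:
        raise ValueError("num_steps must be positive")
    levels = [1.0 - i / num_steps for i in range(num_steps)]
    return [(t, pick_expert(t, boundary)) for t in levels]


if __name__ == "__main__":
    for t, expert in schedule(8):
        print(f"t={t:.3f} -> {expert}")
```

Only one expert runs at any given step, which is why the active parameter count stays at 14B. Moving the boundary shifts how many steps each expert handles, which is the tuning burden this trade-off refers to.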

Best For

  • Self-hosted production
  • Cinematic style control
  • Speech-to-video
  • Consumer GPU deployment (TI2V-5B)

LTX-2

Strengths

  • 19B params (14B video + 5B audio)
  • Native 4K at 50fps
  • First open model with unified audio-video
  • 30 cinematic camera moves
  • Depth-aware generation

Trade-offs

  • Superseded by 2.3 on detail and audio quality
  • LoRAs not compatible with 2.3
  • Texture drift every 8-10 frames
  • In-scene text issues

Best For

  • Production video pipelines
  • Camera-controlled generation
  • Depth/pose-driven workflows
  • Budget 4K content