Open

LTX-2

Lightricks

The predecessor to LTX 2.3: 19B parameters, native 4K with audio, and 30 camera moves. Still excellent and widely deployed.

Open

Wan 2.1

Alibaba (Tongyi Lab)

The foundation of the Wan family. The 1.3B variant runs on virtually any consumer GPU, and Wan 2.1 was the first open model to beat closed-source rivals across benchmarks.

Pick LTX-2 if…

You want production video pipelines, camera-controlled generation, or depth/pose-driven workflows.

Pick Wan 2.1 if…

You want consumer GPU workflows, academic research, or Chinese + English text-in-video.

Specifications

Spec            | LTX-2                              | Wan 2.1
Maker           | Lightricks                         | Alibaba (Tongyi Lab)
Source Type     | Open Source                        | Open Source
License         | Apache 2.0 (<$10M rev)             | Apache 2.0
Architecture    | DiT + 3D Causal VAE                | Flow Matching DiT + 3D Causal VAE
Parameters      | 19B (14B video + 5B audio)         | 14B (also 1.3B variant)
Max Resolution  | 4K                                 | 720p
Max Duration    | 20s                                | 5s
FPS             | Up to 50                           | 24
Native Audio    | Yes                                | No
ComfyUI Support | Yes                                | Yes
Fine-tunable    | Yes                                | Yes
Min VRAM        | 12GB+ (distilled) / 24GB+ (full)   | 8.19GB (1.3B) / 24GB+ (14B)
Cost / Second   | $0.04                              | Self-host
Inputs          | T2V, I2V, V2V, Audio-to-Video, Depth, OpenPose, Camera Control | T2V (14B/1.3B), I2V (14B), FLF2V, VACE, V2A
On Floyo        | Yes                                | Yes
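The Min VRAM rows track the raw weight footprint of each parameter count. A back-of-envelope sketch (fp16/bf16 weights only; activations, text encoder, and VAE are ignored, so real requirements are higher):

```python
def weight_gib(params_billions: float, bytes_per_param: int = 2) -> float:
    """Rough weight-only memory in GiB for fp16/bf16 parameters."""
    return params_billions * 1e9 * bytes_per_param / 2**30

# Wan 2.1 1.3B: ~2.4 GiB of weights, which is why it fits 8 GB cards.
print(round(weight_gib(1.3), 1))   # 2.4
# Wan 2.1 14B: ~26 GiB, so near 24 GB it needs quantization or offloading.
print(round(weight_gib(14), 1))    # 26.1
# LTX-2 19B (video + audio towers): ~35 GiB before distillation/quantization.
print(round(weight_gib(19), 1))    # 35.4
```

This is only the lower bound; it explains the gap between the 8.19GB figure for the 1.3B model and the 24GB+ guidance for the full-size checkpoints.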

Strengths & Trade-offs

LTX-2

Strengths

  • 19B parameters (14B video + 5B audio)
  • Native 4K at 50 fps
  • First open model with unified audio-video generation
  • 30 cinematic camera moves
  • Depth-aware generation

Trade-offs

  • Superseded by 2.3 on detail and audio quality
  • LoRAs not compatible with 2.3
  • Texture drift every 8-10 frames
  • In-scene text rendering issues

Best For

  • Production video pipelines
  • Camera-controlled generation
  • Depth/pose-driven workflows
  • Budget 4K content

Wan 2.1

Strengths

  • SOTA open-source at launch
  • 1.3B model runs on almost any consumer GPU (8.19GB VRAM)
  • First video model with Chinese + English text generation
  • Wan-VAE encodes unlimited-length 1080p video
  • T2V, I2V, video editing, T2I, and V2A all supported

Trade-offs

  • 720p max resolution
  • 5s max duration
  • Limited quality from the 1.3B variant
  • No native audio generation
  • Superseded by 2.2 on quality

Best For

  • Budget local deployment
  • Consumer GPU workflows
  • Academic research
  • Chinese + English text-in-video
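For local use, the workflows above typically run the 1.3B text-to-video checkpoint through Hugging Face diffusers. A minimal sketch, assuming the public `Wan-AI/Wan2.1-T2V-1.3B-Diffusers` release and its `WanPipeline`/`AutoencoderKLWan` classes (verify names against your installed diffusers version; a CUDA GPU is required to actually generate):

```python
MODEL_ID = "Wan-AI/Wan2.1-T2V-1.3B-Diffusers"  # assumed 1.3B T2V repo id

def generate(prompt: str, out_path: str = "wan_t2v.mp4") -> str:
    """Generate a short clip with Wan 2.1 1.3B. Requires a CUDA GPU."""
    import torch
    from diffusers import AutoencoderKLWan, WanPipeline
    from diffusers.utils import export_to_video

    # fp32 VAE avoids decode artifacts; bf16 transformer keeps VRAM low.
    vae = AutoencoderKLWan.from_pretrained(
        MODEL_ID, subfolder="vae", torch_dtype=torch.float32
    )
    pipe = WanPipeline.from_pretrained(
        MODEL_ID, vae=vae, torch_dtype=torch.bfloat16
    ).to("cuda")

    frames = pipe(
        prompt=prompt,
        height=480, width=832,   # 480p-class output suits the 1.3B model
        num_frames=81,           # roughly a 5 s clip
        guidance_scale=5.0,
    ).frames[0]
    export_to_video(frames, out_path, fps=16)  # fps of the saved file
    return out_path
```

The same script runs unchanged in a hosted ComfyUI/diffusers environment; only the checkpoint download and VRAM budget change between the 1.3B and 14B variants.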

Run these models on Floyo

Browser-based ComfyUI. No setup, no GPU required.

Open Floyo →