Floyo Model Compare — Find the right AI video model

A · Open

Wan 2.1

Alibaba (Tongyi Lab)

The foundation of the Wan model family. The 1.3B variant runs on most consumer GPUs (8.19GB VRAM). At launch, it was the first open model to outscore closed-source models across benchmarks.

A · Open

LTX-2

Lightricks

The predecessor to LTX-2.3. 19B parameters, native 4K with audio, and 30 camera moves. Still excellent and widely deployed.

Pick Wan 2.1 if…

You want consumer GPU workflows, academic research, or Chinese + English text-in-video.

Pick LTX-2 if…

You want production video pipelines, camera-controlled generation, or depth/pose-driven workflows.

Specifications

Maker

Alibaba (Tongyi Lab)

Lightricks

Source Type

Open Source

License

Apache 2.0

Apache 2.0 (under $10M revenue)

Architecture

Flow Matching DiT + 3D Causal VAE

DiT + 3D Causal VAE

Parameters

14B (also 1.3B variant)

19B (14B video + 5B audio)

Max Resolution

720p

Max Duration

20s

FPS

Up to 50

Native Audio

Yes

ComfyUI Support

Yes

Fine-tunable

Yes

Min VRAM

8.19GB (1.3B) / 24GB+ (14B)

12GB+ (distilled) / 24GB+ (full)

Cost / Second

Self-host

$0.04

Inputs

T2V (14B/1.3B), I2V (14B), FLF2V, VACE, V2A

T2V, I2V, V2V, Audio-to-Video, Depth, OpenPose, Camera Control

On Floyo

Yes
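
As a rough illustration of how the Min VRAM and Cost / Second rows might be used when choosing a variant, the Python sketch below hard-codes the figures from the table. The function and constant names are hypothetical, the "+" figures are treated as plain minimums, and the $0.04 per second rate is assumed to be the LTX-2 figure (the second value in that row).

```python
from typing import List

# Minimum VRAM in GB, copied from the Min VRAM row.
# Assumption: "12GB+" and "24GB+" are treated as plain minimums.
MIN_VRAM_GB = {
    "Wan 2.1 1.3B": 8.19,
    "Wan 2.1 14B": 24.0,
    "LTX-2 distilled": 12.0,
    "LTX-2 full": 24.0,
}

# Assumption: the $0.04 in the Cost / Second row is the LTX-2 rate.
LTX2_COST_PER_SECOND_USD = 0.04


def variants_that_fit(vram_gb: float) -> List[str]:
    """Return the variants whose listed minimum VRAM is within vram_gb."""
    return [name for name, need in MIN_VRAM_GB.items() if vram_gb >= need]


def ltx2_clip_cost(seconds: float) -> float:
    """Estimate the cost in USD of an LTX-2 clip at the assumed rate."""
    return round(seconds * LTX2_COST_PER_SECOND_USD, 2)


if __name__ == "__main__":
    print(variants_that_fit(12))
    print(ltx2_clip_cost(20))
```

With these figures, `variants_that_fit(12)` keeps only the Wan 2.1 1.3B and LTX-2 distilled entries, and a 20-second LTX-2 clip at the assumed rate comes to $0.80.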

Strengths & Trade-offs

Wan 2.1

Strengths

+SOTA open-source model at launch
+1.3B variant runs on most consumer GPUs (8.19GB VRAM)
+First video model to generate both Chinese and English text
+Wan-VAE encodes 1080p video of unlimited length
+Supports T2V, I2V, video editing, T2I, and V2A

Trade-offs

-720p max resolution
-5s max duration
-Limited quality from the 1.3B variant
-No native audio generation
-Superseded by Wan 2.2 on quality

Best For

→Budget local deployment
→Consumer GPU workflows
→Academic research
→Chinese + English text-in-video

LTX-2

Strengths

+19B parameters (14B video + 5B audio)
+Native 4K at 50fps
+First open model with unified audio-video generation
+30 cinematic camera moves
+Depth-aware generation

Trade-offs

-Superseded by LTX-2.3 on detail and audio quality
-LoRAs are not compatible with LTX-2.3
-Texture drift every 8-10 frames
-Problems rendering in-scene text

Best For

→Production video pipelines
→Camera-controlled generation
→Depth/pose-driven workflows
→Budget 4K content

Run these models on Floyo

Browser-based ComfyUI. No setup, no GPU required.

T2V · LTX-2

LTX 2 19B Fast Text to Video

3.6k runs
