Floyo Model Compare — Find the right AI video model

A · Open

Wan 2.2

Alibaba (Tongyi Lab)

Mixture-of-Experts (MoE) architecture with 27B total parameters but only 14B active per step. Trained on 65.6% more images and 83.2% more video than Wan 2.1. Outperforms leading closed-source models on Wan-Bench 2.0.

A · Open

SkyReels V2

Skywork AI

First open model for infinite-length video. Best human rendering in open source.

B · Open

HunyuanVideo 1.5

Tencent

Punches above its weight at 8.3B parameters. Avatar and Custom variants add versatility.

Pick Wan 2.2 if…

You want cinematic style control, speech-to-video, or consumer GPU deployment (TI2V-5B).

Pick SkyReels V2 if…

You want long-form AI films, human-centric content, or short films and ads.

Pick HunyuanVideo 1.5 if…

You want avatar generation, custom characters, or research.

Specifications

Maker

Wan 2.2: Alibaba (Tongyi Lab)
SkyReels V2: Skywork AI
HunyuanVideo 1.5: Tencent

Source Type

Open Source

License

Wan 2.2: Apache 2.0
SkyReels V2: Open Source
HunyuanVideo 1.5: Tencent Open (check terms)

Architecture

Wan 2.2: DiT + MoE (2-expert: high-noise + low-noise)
SkyReels V2: AR Diffusion-Forcing
HunyuanVideo 1.5: DiT + 3D Causal VAE

Parameters

Wan 2.2: 27B total (14B active per step, 2x14B experts)
SkyReels V2: 14B
HunyuanVideo 1.5: 8.3B

Max Resolution

720p

Max Duration

Wan 2.2: 10-15s
SkyReels V2: 30s+ (infinite)
HunyuanVideo 1.5: 6-10s

FPS

Native Audio

ComfyUI Support

Yes

Fine-tunable

Yes

Min VRAM

Wan 2.2: 8GB (small) / 24GB (full)
SkyReels V2: 40GB+ (A100/H100)
HunyuanVideo 1.5: 24GB (RTX 4090)

Cost / Second

Self-host

$0.06

Inputs

Wan 2.2: T2V (A14B), I2V (A14B), TI2V (5B), S2V (14B)
SkyReels V2: T2V, I2V
HunyuanVideo 1.5: T2V, I2V, Avatar

On Floyo

Yes
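
To make the Min VRAM and Inputs rows easier to act on, here is a minimal sketch that filters the listed models by available GPU memory and required input mode. The `ModelSpec` type and `candidates` helper are hypothetical names introduced for illustration. The sketch assumes the "8GB (small)" figure refers to the TI2V-5B variant and the "24GB (full)" figure to the A14B variants; S2V is left out because its VRAM requirement is not listed separately.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelSpec:
    name: str
    min_vram_gb: int          # minimum VRAM figure from the table above
    inputs: frozenset         # input modes listed for this variant


# Values transcribed from the Specifications table. Splitting Wan 2.2 into
# two rows (small vs full) is an assumption made for this sketch.
SPECS = [
    ModelSpec("Wan 2.2 TI2V-5B", 8, frozenset({"TI2V"})),
    ModelSpec("Wan 2.2 A14B", 24, frozenset({"T2V", "I2V"})),
    ModelSpec("SkyReels V2", 40, frozenset({"T2V", "I2V"})),
    ModelSpec("HunyuanVideo 1.5", 24, frozenset({"T2V", "I2V", "Avatar"})),
]


def candidates(vram_gb: int, needed_input: str) -> list:
    """Return names of models that fit in vram_gb and accept needed_input."""
    return [
        m.name
        for m in SPECS
        if m.min_vram_gb <= vram_gb and needed_input in m.inputs
    ]


print(candidates(24, "I2V"))
print(candidates(8, "TI2V"))
```

With a 24GB card and an image-to-video task, this returns the Wan 2.2 A14B and HunyuanVideo 1.5 entries; SkyReels V2 only appears once 40GB or more is available.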

Strengths & Trade-offs

Wan 2.2

Strengths

+First MoE in video diffusion
+27B total but only 14B active per step
+high-noise expert for layout + low-noise for detail
+65.6% more images and 83.2% more video training data vs 2.1
+cinematic aesthetic control (lighting, composition, contrast, color tone)

Trade-offs

-720p cap
-MoE needs careful threshold tuning (SNR-based)
-no native audio in base model (S2V is separate)
-newer ecosystem than 2.1
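
The two-expert split and its SNR-based threshold can be illustrated with a small routing sketch. This is not Wan 2.2's actual code: the linear noising formula, the threshold value of 1.0, and the function names are all assumptions chosen for illustration. It only shows the idea that each denoising step runs exactly one expert, which is why 14B of the 27B parameters are active per step.

```python
def snr(t: float) -> float:
    """Signal-to-noise ratio under an assumed x_t = (1 - t) * x0 + t * noise."""
    return ((1.0 - t) / t) ** 2


def pick_expert(t: float, snr_threshold: float = 1.0) -> str:
    """Route one denoising step to a single expert (hypothetical threshold)."""
    # Low SNR means the sample is still mostly noise: the layout stage.
    return "high_noise" if snr(t) < snr_threshold else "low_noise"


def schedule(num_steps: int, snr_threshold: float = 1.0) -> list:
    """Expert used at each step, going from noisy (t near 1) to clean (t near 0)."""
    # Midpoint timesteps avoid t == 0 and t == 1, where snr() is undefined.
    ts = [1.0 - (i + 0.5) / num_steps for i in range(num_steps)]
    return [pick_expert(t, snr_threshold) for t in ts]


plan = schedule(8)
print(plan)
print(schedule(8, snr_threshold=4.0).count("high_noise"))
```

Raising the threshold hands more of the schedule to the high-noise expert, so layout is refined for longer before the detail expert takes over. That is the kind of tuning the trade-off above refers to.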

Best For

→Self-hosted production
→cinematic style control
→speech-to-video
→consumer GPU deployment (TI2V-5B)

SkyReels V2

Strengths

+First infinite-length open model
+10M+ film/TV training
+best open human faces

Trade-offs

-Heavy GPU requirements (40GB+)
-no ComfyUI
-newer community
-720p cap

Best For

→Long-form AI films
→human-centric content
→short films and ads
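
One way to picture autoregressive long-video generation is as a series of overlapping chunks, where each new chunk is conditioned on the last few frames of the previous one. The sketch below only does the frame bookkeeping for that scheme. The `chunk_plan` function and the example numbers (257 total frames, 97-frame chunks, 17-frame overlap) are hypothetical and do not describe SkyReels V2's actual interface.

```python
def chunk_plan(total_frames: int, chunk_frames: int, overlap: int) -> list:
    """Return (start, end) frame ranges, end exclusive, covering total_frames.

    Each chunk after the first reuses `overlap` frames from the previous
    chunk as conditioning, so it adds chunk_frames - overlap new frames.
    """
    if total_frames <= 0:
        raise ValueError("total_frames must be positive")
    if not 0 <= overlap < chunk_frames:
        raise ValueError("overlap must be non-negative and smaller than chunk_frames")
    ranges = []
    start = 0
    while True:
        end = min(start + chunk_frames, total_frames)
        ranges.append((start, end))
        if end >= total_frames:
            return ranges
        start = end - overlap  # step back so the next chunk sees prior frames


print(chunk_plan(257, 97, 17))
```

Under these assumed numbers, three passes cover 257 frames: the first contributes 97 frames and each later pass contributes 80 new ones. Output length grows with the number of passes, not with the size of any single pass.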

HunyuanVideo 1.5

Strengths

+Efficient 8.3B params
+~75s on RTX 4090
+Avatar + Custom variants

Trade-offs

-Short output
-720p max
-no audio

Best For

→Avatar generation
→custom characters
→research
→consumer GPU

Run these models on Floyo

Browser-based ComfyUI. No setup, no GPU required.

Preprocessing · Wan 2.2

Wan 2.2 Animate Preprocess (Kijai)

V2V · Wan 2.2

Wan 2.2 + Qwen V2V Restyle

T2V · Wan 2.2

Wan 2.2 T2V with UnifiedRew

Character · Wan 2.2

Wan 2.2 Animate Character Replacement

Audio/Foley · HunyuanVideo 1.5

HunyuanVideo Foley (Lifelike Audio)

I2V · HunyuanVideo 1.5

HunyuanVideo 1.5 Image to Video
