Use case: Video Generation

About this leaderboard
Scores story-driven video models on motion consistency, frame coherence, lip sync, text rendering, temporal alignment, and safety filters across varied shot types and prompts.
We stress-test models with curated simulations that blend structured benchmarks and
open-ended prompts. Each table captures a distinct slice of the use case, and models are
compared consistently across shared metrics to surface leaders, trade-offs, and surprising
strengths.
| Rank | Model | Elo Rating | Trust Skill Rating |
|------|-------|------------|--------------------|
| 1 | Veo 3 (w/o audio) | 1267.99 | 1178.14 |
| 2 | Veo 2 | 1135.46 | 1065.5 |
| 3 | Runway Gen 3 | 1038.65 | 991.65 |
| 4 | Tencent | 1022.33 | 948.93 |
| 5 | Luma Ray 2 | 972.56 | 968.74 |
| 6 | Luma Dreamachine | 847.24 | 765.93 |
| 7 | Pika 1.5 | 715.76 | 695.04 |
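
To give a rough sense of what gaps in the Elo Rating column might mean, the sketch below converts a rating difference into an expected head-to-head preference rate. It assumes the conventional Elo logistic formula with a scale of 400; the leaderboard does not state which Elo variant or scale it uses, so the results are illustrative only. The function name `elo_expected_score` and the `ELO` dictionary are hypothetical helpers introduced for this example.

```python
def elo_expected_score(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Expected score of A against B under the conventional Elo logistic model.

    ASSUMPTION: base-10 logistic with scale=400. The leaderboard does not
    document its Elo parameters, so this is not necessarily its method.
    """
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / scale))


# Elo ratings copied from the leaderboard table (subset, for illustration).
ELO = {
    "Veo 3 (w/o audio)": 1267.99,
    "Veo 2": 1135.46,
    "Pika 1.5": 715.76,
}

if __name__ == "__main__":
    top, second, last = "Veo 3 (w/o audio)", "Veo 2", "Pika 1.5"
    for a, b in [(top, second), (top, last)]:
        p = elo_expected_score(ELO[a], ELO[b])
        print(f"{a} vs {b}: expected score for {a} = {p:.3f}")
```

Under these assumptions, equal ratings give an expected score of 0.5, and the expected scores of the two sides always sum to 1. The Trust Skill Rating column is a separate metric and is not modeled here.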