Text-to-video tests with three generative models
A set of text-to-video render tests was run with the same prompt on Sora 2, Kling 3.0, and Seedance 2.0, in that order.
Test setup and creative brief
The prompt described an ultra-realistic cinematic machine where hundreds of polished steel marbles traverse a handcrafted musical mechanism.
Materials were specified as tactile and premium, including birch wood, brushed brass, polished steel, felt dampers, rubber belts, and machined gears.
Camera direction asked for a slow macro push-in following a lead marble through layered interactions, then widening to a semi-wide shot of synchronized pathways.
Visual and physical constraints
Physics fidelity was requested: realistic weight, friction, momentum, collision, and slightly imperfect motion without exaggerated or cartoon-like behavior.
Lighting guidance combined warm workshop and stage characteristics with volumetric dust, rich shadows, bright steel glints, and natural reflections on varnished wood.
Audio and motion design
Sound design prioritized crisp mechanical percussion, resonant vibraphone tones, delicate bells, light drum taps, gear whirs, and subtle room resonance.
The brief called for smooth, hypnotic, and precise motion, with visible cause-and-effect mechanics and synchronized timing across multiple marble lanes.
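For anyone who wants to rerun a comparison like this, it helps to keep the brief as structured data so that every model receives the identical text. The following is a minimal sketch only: the `Brief` class, its field names, and `to_prompt` are hypothetical, the wording paraphrases the brief described above rather than quoting the exact prompt used, and no model API is called.

```python
from dataclasses import dataclass

# Models in the order the tests were run.
MODELS = ["Sora 2", "Kling 3.0", "Seedance 2.0"]


@dataclass(frozen=True)
class Brief:
    """Hypothetical container for the sections of the creative brief."""
    subject: str
    materials: tuple
    camera: str
    physics: str
    lighting: str
    audio: str
    motion: str

    def to_prompt(self) -> str:
        # One line per section, so every model sees the same text.
        parts = [
            self.subject,
            "Materials: " + ", ".join(self.materials) + ".",
            "Camera: " + self.camera,
            "Physics: " + self.physics,
            "Lighting: " + self.lighting,
            "Audio: " + self.audio,
            "Motion: " + self.motion,
        ]
        return "\n".join(parts)


# Paraphrased from the brief above; not the exact prompt text.
BRIEF = Brief(
    subject=("Ultra-realistic cinematic machine: hundreds of polished steel "
             "marbles traverse a handcrafted musical mechanism."),
    materials=("birch wood", "brushed brass", "polished steel",
               "felt dampers", "rubber belts", "machined gears"),
    camera=("slow macro push-in following a lead marble, then widening to a "
            "semi-wide shot of synchronized pathways."),
    physics=("realistic weight, friction, momentum, and collision; slightly "
             "imperfect motion, nothing cartoon-like."),
    lighting=("warm workshop and stage light, volumetric dust, rich shadows, "
              "bright steel glints, reflections on varnished wood."),
    audio=("crisp mechanical percussion, vibraphone tones, delicate bells, "
           "light drum taps, gear whirs, subtle room resonance."),
    motion=("smooth, hypnotic, precise; visible cause and effect; "
            "synchronized timing across marble lanes."),
)

if __name__ == "__main__":
    prompt = BRIEF.to_prompt()
    for model in MODELS:
        # Submitting the prompt is left out; each tool has its own interface.
        print(f"[{model}] prompt of {len(prompt)} characters ready")
```

Keeping the sections separate also makes it easy to note afterwards which part of the brief each model handled well or poorly.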
Model outputs
- Sora 2 produced dense, textured surfaces and rich reflections, while handling close macro detail with consistent tactile materials.
- Kling 3.0 prioritized broader cinematic framing and atmospheric lighting, at times simplifying micro-physics of collisions and marble friction.
- Seedance 2.0 emphasized smooth choreography and synchronized pathways, sometimes at the expense of micro-level material realism and contact dynamics.
Across models, the main trade-off was between fine-grained physical accuracy and cinematic composition, with each system favoring a different part of the brief.
In these tests, Sora 2 was strongest on material realism, Kling 3.0 on atmospheric lighting and framing, and Seedance 2.0 on synchronized mechanical choreography.

