Built by Shengshu Technology with Tsinghua University, Vidu is made to speed up video creation for all sorts of uses: film, animation, ads, you name it.
Vidu AI works in three ways:
Text to Video. Type in a description and get a matching video.
Image to Video. Feed it a picture and watch it move.
Reference to Video. Keeps characters, objects, and backgrounds consistent using a reference.
One of its big claims? Speed. Vidu AI says it can spit out a lower-resolution video in just 10 seconds. And it's not just about being fast: it understands meaning quite well, so the results actually match what you asked for in your prompt. Plus, it's designed to make motion look natural, with no stiff, robotic movements.
The latest update from Vidu, which brings multi-element videos powered by the Q1 model, is on another level. The crispness, fidelity, and detail preservation are amazing and a big leap from the company's previous quality.
Use Vidu invite code 1bc88288 for bonus credits + new user perks!
Educators and Trainers
Creative Professionals
Content Creators
Media and Film Makers
Marketing and Branding Specialists
Developers and Tech Creators
Small Business Owners
Entertainment and Performance Artists
Professional Content Creators
This list may not be exhaustive as new models keep dropping and are added to platforms all the time.
Prompt:
A photorealistic eye-level timelapse of a minimalist bathroom with a wall featuring a mosaic text "AIcreators.tools" and a raw grey concrete floor, camera static except the final shot. Step 1 (Arrangement): Fixed Camera. The worker rapidly spreads natural coastal base across the floor evenly and neatly with his hands across the entire surface. The whole floor ends up being covered evenly with a neat layer of sea stuff. Step 2 (The Pour): Fixed Camera. The worker begins at the far end of the room and pours buckets of thick, transparent epoxy resin in a continuous controlled flood, slowly covering the surface and encapsulating the sea stuff in a deep, crystal-clear liquid layer. As he pours, the worker is seen from his back deliberately walking backwards toward the camera/exit. The resin magnifies textures and creates the illusion of coastal water. Step 3 (Finishing): Fixed Camera. Using a high-speed floor buffer, the worker polishes the resin surface until it reaches a flawless, glass-like finish, enhancing depth, light refraction, and oceanic translucency. Step 4 (Furnishing): Fixed Camera. Fast-paced placement of a sleek blue vanity with a large mirror near the wall with a mosaic "AIcreators.tools" text, and a modern freestanding blue bathtub in the center of the room.
Closing: The camera slowly pans across the floor, revealing a breathtaking ocean-floor effect — layered shells and coral preserved beneath a glossy, water-clear surface, with soft ambient light shimmering across the encapsulated marine textures like sunlight filtering through sea water.
Woman is walking forward while subtly fixing her hair and smiling, slightly leans forward to look straight into the camera, waves and says "Hi, I'm just testing Vidu's new Q 3 video model!" Footsteps sound. Suddenly, in the end, confetti explosion happens and she jerks and gasps in surprise looking around with exaggerated amazement
Generated on February 1, 2026:
Followed this simple prompt well but, ironically, slightly garbled the words "Vidu Q3".
A cute, high-quality miniature figurine Christmas ornament inspired by the attached subject / pet reference image, hanging from a beautifully decorated Christmas tree.
The subject is clearly a small premium toy-like figurine, not a real animal — slightly stylized proportions (subtle chibi influence: gently rounded head, simplified paws, softened edges), while accurately preserving the subject’s unique facial features, markings, ear shape, and expression so the likeness remains instantly recognizable.
The figurine is made from clearly artificial materials — painted resin / polymer clay / molded plastic, with visible handcrafted texture, tiny brush strokes, soft seams, and a smooth satin finish.
Fur is sculpted, not real: simplified grooves and embossed shapes instead of individual hairs.
The ornament hangs by a small metallic hook attached to the figurine’s head but concealed with a small red ribbon bow, making it unmistakably a decorative object.
Scale is obvious: the figurine is palm-sized, surrounded by oversized pine needles, fairy lights, and glass baubles to reinforce its miniature nature.
Cinematic macro product photography, shallow depth of field with warm festive bokeh.
Soft studio-style holiday lighting — warm key light, gentle fill, subtle rim light to outline the figurine’s silhouette.
Shot at ornament eye-level, 85mm macro lens look, ultra-clean focus on the figurine while the background tree softly blurs.
Explicit constraints:
– not a real animal
– not lifelike fur or skin
– no biological realism
– clearly a toy, figurine, or collectible ornament
Style: cute but premium, Pixar-adjacent holiday décor, collectible toy photography
Mood: cozy, magical, wholesome, festive
Detail level: high, but intentionally stylized
@image1 This subject @image2 presented as a premium collectible figurine on a designer’s desk. Scene: • Center: a realistic 1/6-scale figurine of the subject on a clear round stand, natural museum pose one foot slightly forward. Keep their current outfit, hair, and accessories exactly as in the source. • Right: an upright glossy retail box showing the same subject as box art matching outfit and look, with brand AI Creators Tools; clean typography and a small authenticity sticker. • Left: a widescreen monitor displaying the subject as a grayscale digital sculpt/turntable in a 3D app UI that clearly matches the figurine. • Desk props: keyboard, mouse pad, a couple of notes; tidy and minimal. Lighting & look: • Bright natural studio daylight from windows, soft shadows, subtle tabletop reflections. • Photoreal materials plastics, crisp print on the box, no duplicates or mismatches between figure, box, and monitor. Style tags: photoreal, product photography, studio lighting, sharp focus, clean composition
So yes, uploading an additional clear headshot of your character improves the clarity of facial features. The box is now too small again (because the model is huge).
@image1 This subject presented as a premium collectible figurine on a designer’s desk. Scene: • Center: a realistic 1/6-scale figurine of the subject on a clear round stand, natural museum pose one foot slightly forward. Keep their current outfit, hair, and accessories exactly as in the source. • Right: an upright glossy retail box showing the same subject as box art matching outfit and look, with brand AI Creators Tools; clean typography and a small authenticity sticker. • Left: a widescreen monitor displaying the subject as a grayscale digital sculpt/turntable in a 3D app UI that clearly matches the figurine. • Desk props: keyboard, mouse pad, a couple of notes; tidy and minimal. Lighting & look: • Bright natural studio daylight from windows, soft shadows, subtle tabletop reflections. • Photoreal materials plastics, crisp print on the box, no duplicates or mismatches between figure, box, and monitor. Style tags: photoreal, product photography, studio lighting, sharp focus, clean composition
An improvement over Q1: the box isn't much smaller than the figurine itself. Her facial features aren't very crisp; an additional reference image with a close-up portrait of her would likely help.
@image1 A couple sits at a small white iron table outside café. They hold hands and look at each other. The shot stays steady with a light film-like grain. It starts focused on the couple, then shifts to the back. A man with a suitcase walks into view. His face shows shock. That changes the mood fast. The street has striped awnings, café chairs, and a busy but quiet flow of people. You hear street sounds, some footsteps, and soft clinks of dishes. No traffic noise or music. While all this happens, the woman says, “He’s in New York till Friday, darling.” The man says, “So I can have you all to myself.”
@image1 is walking forward on a street from @image2, then slightly leans forward to look straight into the camera, waves and says: "Hi, I'm just testing Vidu's new video model!"
Soundscape: Footsteps, sea waves.
Generated on October 22, 2025:
Test with 2 images and speech/sound in the text prompt. The resemblance, the text on the t-shirt, and the fabric quality are all preserved. But the model refuses to speak the full sentence; this is the 2nd attempt, with the same result.
A muscular man stands confidently, arms crossed, wearing a bold purple sweatshirt emblazoned with "PUNCH ME". Text "WW.AICREATORS.TOOLS" on the wall behind him.
Suddenly, a fist enters frame and strikes his head near his cheek from the left.
The impact unfolds, a shockwave running through his body.
The camera captures his expression shifts—surprise shifting into resilience, his eyes blazing with intensity.
Ambient light catches the sweat glistening on his skin, crafting a gritty, dynamic atmosphere. Kinetic, dynamic, cinematic.
I tried many variations of this prompt, but this is the closest I could get to a realistic punch. In most attempts, his head doesn't move at all on impact and the fist barely touches him.
Realistic, preserves likeness. The main issue is text. Note that only newly introduced text is a problem; the model correctly copies text already present in the reference image.