LTX-2 is blowing up! The world's first 4K video generation model with synchronized audio and video, now supported in ComfyUI!

"AI is no longer just generating images, it's starting to direct movies."

Just today, AI video reached a landmark: Lightricks officially released LTX-2, a generative model that can produce a 20-second, 4K-resolution, 50fps, narrative-grade video with full sound effects and lip sync in a single pass.

More to the point: it is already available in ComfyUI. It supports text and image input, produces a clip in seconds, and runs locally!

If Sora is the "trailer for the future," then LTX-2 is a creative tool that actually works: it makes "write a paragraph → get a short movie" a reality.

🎬 What is LTX-2? It's not just video generation, it's "director-level creation"!

LTX-2 comes from the well-known creative software company Lightricks (the team behind Facetune and Videoleap). It is currently the first video model to synchronize picture and sound in a single diffusion process.

Core breakthroughs:

  • Synchronized audio and video generation: Characters' mouths match their voices, explosion sounds line up with the flash, and footsteps match the walking rhythm;
  • Native 4K / 50fps output: Exceeds the traditional 24fps standard for film and television, with no flickering or structural breakup of the image;
  • Multi-modal inputs: Plain text, images, and sketches can all drive generation;
  • Fine-grained director control: You can specify the camera path, object movement, lighting style, and editing tempo;
  • Full open-source plan: Model weights, code, and benchmarks are scheduled to be open-sourced in late November 2025;
  • Local operation: Deployable on an RTX 4090 or Mac Studio, with no cloud dependency.

It's not an "AI animation toy"; it's a professional-grade tool that can be used directly in commercials, sketches, and movie previews.
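
To put the headline figures (4K, 50fps, 20 seconds) in perspective, here is a back-of-the-envelope calculation. It assumes "4K" means 3840×2160 UHD and that frames are uncompressed 8-bit RGB; the article states neither, so treat both as assumptions. The result describes the size of the decoded output, not the model's memory footprint.

```python
# Assumptions (not stated in the article):
#   "4K" = 3840x2160 UHD, 3 bytes per pixel (uncompressed 8-bit RGB).
WIDTH, HEIGHT = 3840, 2160
BYTES_PER_PIXEL = 3


def clip_stats(seconds: int, fps: int) -> dict:
    """Return the frame count and raw (uncompressed) size of a clip."""
    frames = seconds * fps
    bytes_per_frame = WIDTH * HEIGHT * BYTES_PER_PIXEL
    return {
        "frames": frames,
        "pixels_per_frame": WIDTH * HEIGHT,
        "raw_gb": frames * bytes_per_frame / 1e9,  # decimal gigabytes
    }


stats = clip_stats(seconds=20, fps=50)
print(stats["frames"])            # 1000
print(stats["pixels_per_frame"])  # 8294400
print(round(stats["raw_gb"], 2))  # 24.88
```

In other words, a single 20-second clip at these settings is 1000 frames of roughly 8.3 million pixels each, which is why single-pass generation at this scale is notable.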

🧪 Real-World Use Cases: What can LTX-2 really do?

LTX-2 is not just a "video generator", but an "AI director" that truly understands the relationship between camera language, pacing, mood and sound. Below are five representative test cases. Each was generated by LTX-2 in a single pass from the prompt shown, with visuals, action, dialogue, sound effects, and camera movement all synchronized.

✅ Case 1: Escape Through the Night Streets of New York - Cinematic Tension at Full Stretch

Prompt:

The man says silently: "We need to run." The camera zooms in on his mouth then immediately screams: "NOW!". The camera zooms back out, he turns around, and starts running away, the camera tracks his run in hand held style. The camera cranes up and show him run into the distance down the street at a busy new york night.

Result:

  • The opening is silent as the camera slowly pushes in on the man's lips, his breath trembling slightly;
  • The moment "NOW!" erupts, the sound and camera snap back as streetlights flicker and traffic roars;
  • The handheld tracking shot of the run is natural and fluid, with footsteps that match the rhythm of his gasps;
  • The final crane shot rises as the man's silhouette fades away through the neon streets of New York, with layers of ambient sound (sirens, horns, crowds) in the background.

This is not an "AI animation", it's film-quality footage that can be used directly in the opening scene of an action movie.

✅ Case 2: Monster Truck Rampage - Motion Blur Meets Camera Tracking

Prompt:

an action packed, cinematic shot of a monster truck driving fast towards the camera, the truck passes the camera as it pans left to follow the truck's reckless drive. The truck then drifts and turns around, then drives back towards the camera until seen in extreme close up.

Result:

  • The truck rushes head-on, tires kicking up dust, and the camera shakes violently to simulate a handheld shot;
  • The image naturally produces motion blur and depth-of-field changes as the vehicle passes by;
  • The drift and turn are smooth, with the engine roar synchronized to the screech of the tires;
  • The final extreme close-up settles on the truck's headlights, reflecting distorted light and shadows, and the sound cuts off abruptly to create dramatic tension.

LTX-2's understanding of "speed" and "physical feedback" is close to the level of professional special effects teams.

✅ Case 3: Daytime Talk Show - Emotional Tension and Precise Editing Pace

Prompt:

INT. DAYTIME TALK SHOW SET - AFTERNOON
Soft studio lighting glows across a warm-toned set. The audience murmurs faintly as the camera pans to reveal three guests seated on a couch - a middle-aged couple and the show's host sitting across from them.
The host leans forward, voice steady but probing.
Host: "When did you first notice that your daughter, Missy, started to spiral?"
The woman's face crumples; she takes a shaky breath and begins to cry. Her husband places a comforting hand on her shoulder, looking down before turning back toward the host.
Father (quietly, with guilt): "We... we don't know what we did wrong."
The studio falls silent for a moment. The camera cuts to the host, who looks gravely into the lens.
Host (to camera): "Let's take a look at a short piece our team prepared - chronicling Missy's downward path."
The lights dim slightly as the camera pushes in on the mother's tear-streaked face. The studio monitors flicker to life, beginning to play the segment as the audience holds its breath.

Result:

  • The image reproduces the classic daytime talk show lighting and color palette, with warm yellow soft light creating a somber atmosphere;
  • The mother's crying shows delicate facial micro-expressions, and the husband's hand movements look natural;
  • The host turns to the camera and breaks the "fourth wall", speaking in a calm tone and looking directly at the audience;
  • As the camera pushes closer to the mother's tearful face, the background sound fades, leaving only the sound of breathing;
  • Most importantly: when the host says "Let's take a look...", LTX-2 automatically generates a "film within a film" transition. The screen lights up, the picture switches, and the audience holds its breath, all in one pass with no post-production splicing.

It's not just "video generation" anymore; it's automatic construction of narrative structure.

✅ Case 4: Absurd Family Drama - Deadpan Humor and Visual Contrast at Full Stretch

Prompt:

A warm sunny backyard. The camera starts in a tight cinematic close-up of a woman and a man in their 30s, facing each other with serious expressions. The woman, emotional and dramatic, says softly, "That's it... Dad's lost it. And we've lost Dad."
The man exhales, slightly annoyed: "Stop being so dramatic, Jess."
He glances aside, then mutters defensively, "He's just having fun."
The camera slowly pans right, revealing the grandfather in the garden wearing enormous butterfly wings, waving his arms in the air like he's trying to take off.
He shouts, "Wheeeew!" as he flaps his wings with full commitment.
The woman covers her face, on the verge of tears. The tone is deadpan, absurd, and quietly tragic.

Result:

  • The opening scene has the pair facing each other in a somber mood as the camera slowly pans right;
  • The grandfather enters the frame wearing huge butterfly wings, his movements exaggerated but rhythmically precise;
  • As he shouts "Wheeeew!", the flapping wings stir a slight breeze and the leaves tremble;
  • The daughter covering her face and the son rolling his eyes come across as real and natural;
  • A soft guitar score plays underneath, creating a deadpan-comedy contrast with the absurdity of the images.

LTX-2 manages to capture the subtle tone of "absurd yet tragic" - the hardest part of high comedy.

✅ Case 5: Pixar Style Oven Theater - Anthropomorphization + Dramatization + Synchronization of Audio and Video

Prompt:

Static camera from inside the oven, looking outward through the slightly fogged glass door. Warm golden light glows around freshly baked cookies. The baker's face fills the frame, eyes wide with focus, his breath fogging the glass as he leans in. Subtle reflections move across the glass as steam rises.
Baker (whispering dramatically): "Today... I achieve perfection."
He leans even closer, nose nearly touching the glass.
"Golden edges. Soft center. The gods themselves will smell these cookies and weep."
Baker: "Wait-"
(beat)
"Did I... forget the chocolate chips?"
Cut to side view - coworker pops into frame, chewing casually.
Coworker (mouth full): "Nope. You forgot the sugar."
Quick zoom back to the baker's horrified face, pressed against the oven door, as cookies deflate behind the glass. Steam drifts upward in slow motion.
pixar style acting and timing

Result:

  • The shot is taken from inside the oven looking out, with realistic details of glass fog, steam, and reflections;
  • The baker's expression is exaggerated but not overdone, and his eyes shift from fervor to devastation;
  • "The gods themselves will smell these cookies and weep" is delivered over a solemn score;
  • A coworker suddenly enters the frame and says with his mouth full: "Nope. You forgot the sugar." Mouth shape, chewing sound, and swallowing action are perfectly synchronized;
  • The cookies collapse and steam rises in slow motion, with a heartbreaking "ding" sound effect. The Pixar-style timing is accurately reproduced.

After the video was released in the community, netizens called it "the most healing and heart-wrenching AI short film of the year".

🛠️ How does it work? ComfyUI is a one-click process.

LTX-2 has been integrated into ComfyUI as an official partner node, with a very low barrier to use:

The steps are as follows:

  1. Update ComfyUI to the latest version (make sure the video module is supported);
  2. Search for "LTX-2" in the template library;
  3. Select the mode:
    • Fast Mode: 6-10 seconds of video, good for quick previews;
    • Pro Mode: High-quality output, suitable for commercials and short films;
  4. Enter a prompt, for example:
    a dancer under neon light, cinematic, 4K, 50fps
  5. Set the parameters: resolution (720p to 4K), frame rate (up to 50fps), duration (6, 8, or 10 seconds);
  6. Click Run. The clip comes out in about ten seconds.

Advanced users can also combine ControlNet and VHS (Video Helper Suite) nodes to build more complex workflows such as multi-shot stitching and style transfer.
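
The parameter ranges in the steps above can be written down as a small pre-flight check, so a bad setting is caught before a run is queued. This is only a sketch: `ClipSettings` and the constants are hypothetical names made up for illustration, not ComfyUI or LTX-2 identifiers, and treating "4K" as 2160 lines is an assumption.

```python
from dataclasses import dataclass

# Hypothetical names for illustration only; NOT ComfyUI or LTX-2 identifiers.
# The limits mirror the ranges listed in the steps above.
ALLOWED_MODES = {"fast", "pro"}
MIN_HEIGHT, MAX_HEIGHT = 720, 2160   # "720p to 4K"; 4K = 2160 lines is an assumption
MAX_FPS = 50
ALLOWED_DURATIONS = {6, 8, 10}       # seconds


@dataclass(frozen=True)
class ClipSettings:
    prompt: str
    mode: str = "fast"
    height: int = 720
    fps: int = 50
    duration: int = 6

    def problems(self) -> list:
        """Return human-readable problems; an empty list means the settings are in range."""
        issues = []
        if not self.prompt.strip():
            issues.append("prompt is empty")
        if self.mode not in ALLOWED_MODES:
            issues.append(f"mode must be one of {sorted(ALLOWED_MODES)}")
        if not MIN_HEIGHT <= self.height <= MAX_HEIGHT:
            issues.append(f"height must be between {MIN_HEIGHT} and {MAX_HEIGHT}")
        if not 0 < self.fps <= MAX_FPS:
            issues.append(f"fps must be between 1 and {MAX_FPS}")
        if self.duration not in ALLOWED_DURATIONS:
            issues.append(f"duration must be one of {sorted(ALLOWED_DURATIONS)} seconds")
        return issues


ok = ClipSettings(
    prompt="a dancer under neon light, cinematic, 4K, 50fps",
    mode="pro", height=2160, fps=50, duration=10,
)
print(ok.problems())   # []

bad = ClipSettings(prompt="a dancer", fps=60, duration=20)
for issue in bad.problems():
    print(issue)       # reports the fps and duration problems
```

Returning a list of problems rather than raising on the first one lets all out-of-range settings be reported at once.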

⚖️ Strengths and Limitations

✅ Strengths:

  • Audio-video synchronization: An industry first that does away with post-dubbing;
  • Fast inference: A 10-second video is generated in just over ten seconds;
  • Physical realism: Skin, metal, and fabric textures are realistic;
  • Director-level control: Shots, tempo, and style are all adjustable;
  • Open source + local operation: Privacy and security with no platform lock-in.

⚠️ Limitations:

  • Duration limit: Currently up to 10 seconds (the official platform supports 20 seconds);
  • Audio is "reference grade": Suitable for ambient sound effects, but not yet a substitute for a professional soundtrack;
  • High video memory requirements: 4K output requires an RTX 4090-class GPU;
  • Prompt sensitivity: Vague descriptions easily drift off target, so prompts need to be precise.
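
One way to keep prompts precise is to write them the way the test cases above do: a setting, then an ordered list of shot beats (camera move, action, optional dialogue), then a style note. The sketch below assembles such a prompt from structured pieces. `Beat` and `build_prompt` are hypothetical helpers for illustration and are not part of LTX-2 or ComfyUI; the output is plain text to paste into the prompt field.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Beat:
    """One shot beat: a camera move, what happens, and an optional spoken line."""
    camera: str
    action: str
    speaker: Optional[str] = None
    line: Optional[str] = None


def build_prompt(setting: str, beats: List[Beat], style: str = "") -> str:
    """Join the setting, the beats in order, and an optional style note, one per line."""
    parts = [setting.strip()]
    for beat in beats:
        parts.append(f"{beat.camera.strip()} {beat.action.strip()}")
        if beat.speaker and beat.line:
            parts.append(f'{beat.speaker}: "{beat.line}"')
    if style.strip():
        parts.append(style.strip())
    return "\n".join(parts)


prompt = build_prompt(
    setting="A warm sunny backyard.",
    beats=[
        Beat(
            camera="The camera slowly pans right,",
            action="revealing the grandfather in the garden wearing enormous butterfly wings.",
            speaker="Grandfather",
            line="Wheeeew!",
        ),
    ],
    style="The tone is deadpan, absurd, and quietly tragic.",
)
print(prompt)
```

Forcing every beat to name both a camera move and an action makes it harder to leave a shot underspecified, which is where vague prompts tend to go wrong.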

🔗 How to try it?

  • Try it online: https://ltx.video/
  • ComfyUI node: Search for "LTX-2" in the template library
  • Supported platforms: Fal, Replicate, RunDiffusion, ComfyUI
  • Open-source plan: Model weights and code to be released in late November 2025

🎥 Final Thoughts

In the past, we said "AI-generated video" was a gimmick;
Now, LTX-2 shows that AI can take part in the real creative process.

It may not be perfect, but the direction is unmistakable:
Creativity is no longer held back by technical barriers, and an idea can become moving images in seconds.

If you want to "direct" your own 4K movie,
now is the best time to do it.

Open ComfyUI and type in your first prompt.
The world will move for you.
