What is visual dubbing?

Visual dubbing is localization that changes both the audio and the picture. The dialogue is replaced with a new-language performance and the speaker's on-screen lip movement is re-animated to match it, so the film looks as if it had been shot in the new language. Traditional dubbing changes only the audio, which is why dubbed content normally shows mismatched lips.

What is the difference between dubbing and lipsync?

Dubbing is the replacement of a video's dialogue with a new recording. Lipsync, in the AI sense, is the visual technique that re-animates the speaker's mouth to match new audio. Dubbing without lipsync gives you translated audio over mismatched lips. Dubbing plus AI lipsync gives you visual dubbing, where picture and sound agree.

Is voice cloning the same as revoicing?

No. Revoicing is the act of replacing a voice track with any new performance. Voice cloning is a way of producing that performance in a specific person's voice, with their consent, usually by having a native voice actor drive the timing and emotion and re-rendering the result in the cloned voice. Revoicing can use a clone, a different voice actor, or a synthetic voice.

Visual Dubbing vs Lipsync vs Revoicing: What Each Term Means

Visual dubbing changes both the audio and the picture: the dialogue is replaced with a new-language performance, and the speaker's lips are re-animated so the film looks as if it had been shot in that language. Traditional dubbing changes the audio only. AI lipsync is the technique that makes the picture change possible, and revoicing and voice cloning describe how the new audio gets made. The terms are mixed up constantly in briefs and vendor pitches, so here is what each one means and how they fit together.

Table of contents:

The five terms, defined
Traditional dubbing and where it falls short
AI lipsync: changing the picture
Revoicing and voice cloning: changing the sound
How a full visual dubbing pipeline fits together
Which approach fits your project

The five terms, defined

Traditional dubbing: new-language dialogue recorded by voice actors replaces the original audio. The picture is untouched, so lips no longer match the words.
AI lipsync: machine-learning models re-animate a speaker's mouth frame by frame to match new audio, preserving identity, expression and head movement.
Visual dubbing: dubbing plus AI lipsync. Both sound and picture change, so the film appears to have been shot in the new language. Also called lipsync dubbing; our product name for it is Lipsmatch.
Revoicing: replacing the voice track with a new performance, in the same or a different language, without changing the picture.
Voice cloning: producing speech in a specific person's voice from a trained model of that voice, with consent, usually driven by another actor's recorded performance.
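
The relationship between the terms can be written down as a small data model. This is a minimal sketch for illustration only: the `Technique` class, its two flags and the `combine` helper are hypothetical names chosen here, not part of any tool or product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Technique:
    """One localization technique and which half of the film it changes."""
    name: str
    changes_audio: bool
    changes_picture: bool


# Flags follow the definitions above (illustrative encoding, not a standard).
TRADITIONAL_DUBBING = Technique("Traditional dubbing", True, False)
AI_LIPSYNC = Technique("AI lipsync", False, True)
REVOICING = Technique("Revoicing", True, False)
VOICE_CLONING = Technique("Voice cloning", True, False)


def combine(name: str, *parts: Technique) -> Technique:
    """A combined workflow changes whatever any of its parts changes."""
    return Technique(
        name,
        changes_audio=any(p.changes_audio for p in parts),
        changes_picture=any(p.changes_picture for p in parts),
    )


# Visual dubbing is dubbing plus AI lipsync, so both flags end up True.
VISUAL_DUBBING = combine("Visual dubbing", TRADITIONAL_DUBBING, AI_LIPSYNC)
```

Read this way, visual dubbing is the only entry in which both sound and picture change; every other term touches one side only.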

Traditional dubbing and where it falls short

Dubbing has carried international film and television for a century, and audiences in markets like Germany, Italy and Brazil are thoroughly used to watching it. Its limitation is visible in every close-up: the mouth is speaking one language while the soundtrack plays another. Long-form audiences forgive this. Advertising has a harder time, because a 30-second film lives or dies on believability, and a mismatched mouth in a tight close-up reads instantly as a foreign ad that was adapted cheaply.

That gap between what the audience hears and what it sees is the specific problem visual dubbing exists to close.

AI lipsync: changing the picture

AI lipsync tracks the speaker's facial landmarks, generates mouth movement aligned to the phonemes of the new audio, and composites the result back into the original footage frame by frame. Done well, it preserves the performer's identity, expressions and head motion while replacing only the articulation. Production-grade systems handle the conditions real shoots create: faces at an angle, hands or products crossing the mouth, several speakers in frame, fast dialogue and camera movement.
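
One small part of that process, lining up the phonemes of the new audio with video frames, can be illustrated without any machine learning. The sketch below is a simplified assumption, not how any particular lipsync system works: it takes phoneme timings as given, and the `Phoneme` class, the `phoneme_per_frame` function and the choice to sample each frame at its midpoint are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Phoneme:
    symbol: str
    start: float  # seconds, inclusive
    end: float    # seconds, exclusive


def phoneme_per_frame(
    phonemes: List[Phoneme], fps: float, duration: float
) -> List[Optional[str]]:
    """Return the phoneme active at the midpoint of each video frame.

    Frames that fall in silence (no phoneme covers them) map to None.
    """
    frame_count = int(round(duration * fps))
    frames: List[Optional[str]] = [None] * frame_count
    for index in range(frame_count):
        midpoint = (index + 0.5) / fps  # time at the centre of this frame
        for phoneme in phonemes:
            if phoneme.start <= midpoint < phoneme.end:
                frames[index] = phoneme.symbol
                break
    return frames


# Hypothetical timings for the syllable "ma" at 25 frames per second.
example = phoneme_per_frame(
    [Phoneme("m", 0.0, 0.08), Phoneme("a", 0.08, 0.2)], fps=25, duration=0.2
)
```

A real system would then generate a mouth shape for each frame and composite it into the footage; this sketch only shows which sound each frame has to articulate.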

Quality varies widely across tools, and the difference shows most on exactly the shots advertising relies on: close-ups of a face the audience knows. This is why broadcast work adds native-speaker review and manual correction on difficult shots rather than shipping raw model output.

Revoicing and voice cloning: changing the sound

The audio side has its own choices. A revoice can use a new voice actor, a synthetic voice, or a cloned voice of the original talent. For brand work the cloned route is the interesting one: a native voice actor in the target language performs the dialogue, driving the timing, dialect and emotion, and that performance is re-rendered in the original performer's voice. The audience hears the familiar star speaking fluent Tamil or Spanish or German.

Cloning a real person's voice requires their informed consent and contract coverage, and serious studios treat that as a precondition rather than a formality. We used this approach for the NutriChoice campaign with Aamir Khan, where the star's own voice carries every language version.

How a full visual dubbing pipeline fits together

A complete localization of an ad film runs through five stages, and the vocabulary above maps onto them cleanly.

Adaptation: a native writer rewrites the dialogue to the picture, matching pacing and idiom rather than translating word for word.
Performance: a native voice actor records the adapted dialogue, giving the film its timing and emotional read.
Revoicing: the performance is carried into the chosen voice, often a clone of the original talent.
Lipsync: the on-screen mouth is re-animated to the new audio, completing the visual dub.
Review and finishing: native speakers check language and culture, and the film passes broadcast QC before delivery.

Each stage exists because the one before it is not enough on its own. Skipping adaptation gives you accurate but lifeless dialogue; skipping review gives you fluent dialogue nobody checked.
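
The five stages and the two skipping risks named above can be captured in a small planning check. This is a sketch under stated assumptions: the stage identifiers, the `SKIP_CONSEQUENCES` table and the `audit_plan` function are hypothetical names, and only the consequences stated in this article are encoded.

```python
from typing import List, Tuple

# The five stages, in the order described above.
PIPELINE = [
    "adaptation",
    "performance",
    "revoicing",
    "lipsync",
    "review_and_finishing",
]

# Only the consequences this article states; other stages are not annotated.
SKIP_CONSEQUENCES = {
    "adaptation": "accurate but lifeless dialogue",
    "review_and_finishing": "fluent dialogue nobody checked",
}


def audit_plan(planned: List[str]) -> List[Tuple[str, str]]:
    """List the pipeline stages a plan skips, with the stated consequence."""
    unknown = [stage for stage in planned if stage not in PIPELINE]
    if unknown:
        raise ValueError(f"unknown stages: {unknown}")
    return [
        (stage, SKIP_CONSEQUENCES.get(stage, "consequence not specified here"))
        for stage in PIPELINE
        if stage not in planned
    ]
```

Calling `audit_plan(PIPELINE)` returns an empty list, while a plan containing only performance, revoicing and lipsync is reported as missing both adaptation and review.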

Which approach fits your project

Subtitles suit content where budget is minimal and the audience expects them. Traditional dubbing suits long-form content for dubbing-fluent markets. Visual dubbing earns its cost where believability drives results: advertising, celebrity campaigns, and any film where the speaker's face is the message. For a sense of what drives the cost of that choice, see our AI dubbing cost guide, or explore the language-specific pages for the markets you are targeting. Our delivered campaigns show the finished standard, and we are happy to advise on which approach fits a specific brief.

Insanely Elegant Studio, Production & Localization
