Yissum - Research Development Company of the Hebrew University

Dynamic Temporal Alignment of Speech to Lips in Post-Production

Posted by Yissum - Research Development Company of the Hebrew University · Responsive · Innovative Products and Technologies · Israel

Summary of the technology

A software method for automated dialogue replacement (ADR): the post-production process in which newly recorded dialogue is added to a film.
Project ID : 10-2018-4669

Description of the technology


Audio-visual speech, Video-audio automatic alignment, shared representation

Current development stage

General list: TRL4 Technology validated in lab


  • In movie filming, poor sound quality is very common. Many speech segments are re-recorded in a studio during post-production to compensate for the poor quality of the sound recorded on location.
  • Current approaches to this re-recording are tedious and require much time and effort from the actor, director, recording engineer, and sound editor.
  • The most challenging part is aligning the newly recorded audio to the actor’s original lip movements, because viewers are very sensitive to audio-lip discrepancies. This alignment is especially difficult when the original on-set speech is unclear.

Our Innovation

A novel audio to video alignment method that automates speech to lips alignment by stretching and compressing the audio signal to match the lip movements.

  • Accurate audio-to-video alignment, even when the original voice is unclear.
  • Compensates for cases where a constant shift of the sound cannot give a perfect alignment.
  • Dynamic temporal alignment method.
  • Improved performance over existing methods.


  • Temporally aligns the audio and video of a speaking person by using deep audio-visual features to map the lip video and the speech signal to a shared representation.
  • Based on this shared representation, the lip-sync error between every short speech segment and every video frame is computed. The optimal corresponding frame for each short speech segment is then determined over the entire video clip.
  • Successful alignment was demonstrated both quantitatively, using a human perception-inspired metric, and qualitatively.
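
The matching step described above can be illustrated with a small sketch. It assumes the audio segments and video frames have already been mapped to vectors in a shared representation (the deep feature extraction is not shown), and it uses Euclidean distance as the lip-sync error and classic dynamic time warping to pick the frame for each segment. Both choices are assumptions made for illustration; the listing does not specify the error measure or the optimization used, and all function names here are hypothetical.

```python
import math

def cost_matrix(audio_feats, video_feats):
    """Assumed lip-sync error: Euclidean distance between every
    audio-segment embedding and every video-frame embedding."""
    return [[math.dist(a, v) for v in video_feats] for a in audio_feats]

def dtw_path(cost):
    """Classic dynamic time warping: lowest-cost monotonic path from the
    first (segment, frame) pair to the last one."""
    n, m = len(cost), len(cost[0])
    inf = float("inf")
    # acc[i][j] = minimal accumulated cost of aligning the first i
    # segments with the first j frames (1-based, with a padded border).
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best_prev = min(acc[i - 1][j - 1], acc[i - 1][j], acc[i][j - 1])
            acc[i][j] = cost[i - 1][j - 1] + best_prev
    # Backtrack from the end, always moving to the cheapest predecessor.
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min(
            (acc[i - 1][j - 1], i - 1, j - 1),
            (acc[i - 1][j], i - 1, j),
            (acc[i][j - 1], i, j - 1),
        )
    path.reverse()
    return path

def align(audio_feats, video_feats):
    """Return, for each audio segment, the index of its matched video frame
    (the first frame the warping path assigns to that segment)."""
    mapping = [None] * len(audio_feats)
    for seg, frame in dtw_path(cost_matrix(audio_feats, video_feats)):
        if mapping[seg] is None:
            mapping[seg] = frame
    return mapping

# Toy data: one-dimensional "embeddings". The audio is a locally
# stretched copy of the video track, so no constant shift can align it.
video = [[0.0], [1.0], [2.0], [3.0], [4.0]]
audio = [[0.0], [0.0], [1.0], [2.0], [2.0], [3.0], [4.0]]
print(align(audio, video))  # [0, 0, 1, 2, 2, 3, 4]
```

In this toy case two pairs of audio segments map to the same video frame, which is where the audio would need to be compressed to match the lips. Turning such a mapping into an actual time-stretched waveform is a separate step that is not sketched here.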

Fig. 1: Given a speech video and a segment of corresponding, but unaligned, audio, the audio is aligned to match the lip movements.


  • Movie production industry
  • TV industry
  • Video clips in other platforms

Project manager

Aviv Shoher

Project researchers

Shmuel Peleg
HUJI, School of Computer Science and Engineering
Computer Science

Related keywords

  • Information Processing, Information System, Workflow Management
  • IT and Telematics Applications
  • Multimedia
  • Computers
  • Computer Graphics Related
  • Specialised Turnkey Systems
  • Scanning Related
  • Peripherals
  • Computer Services
  • Computer Software Market
  • Other Computer Related
  • Computer Science & Engineering
  • algorithms

About Yissum - Research Development Company of the Hebrew University

Technology Transfer Office from Israel

Yissum Research Development Company of the Hebrew University of Jerusalem Ltd. was founded in 1964 to protect and commercialize the Hebrew University’s intellectual property. Ranked among the top technology transfer companies, Yissum has registered over 8,900 patents covering 2,500 inventions, has licensed out 800 technologies, and has spun off 90 companies. Products that are based on Hebrew University technologies and were commercialized by Yissum today generate over $2 billion in annual sales.

