AI can remap speakers’ faces to put words in their mouths

There’s just no way this ends badly. No possible way.

From the researchers' abstract:

Given audio of President Barack Obama, we synthesize a high-quality video of him speaking with accurate lip sync, composited into a target video clip. Trained on many hours of his weekly address footage, a recurrent neural network learns the mapping from raw audio features to mouth shapes. Given the mouth shape at each time instant, we synthesize high-quality mouth texture and composite it with proper 3D pose matching to change what he appears to be saying in a target video to match the input audio track.
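For the curious, here is a minimal sketch (in PyTorch, and emphatically not the authors' code) of what that audio-to-mouth-shape stage could look like: a recurrent network that reads a sequence of per-frame audio features and emits a compact mouth-shape vector for each frame. The class name, the feature dimensions, and the choice of lip-landmark coefficients as the shape representation are all illustrative assumptions, not details taken from the paper.

```python
import torch
import torch.nn as nn

class AudioToMouthShape(nn.Module):
    """Illustrative audio-to-mouth-shape mapper; not the paper's exact model."""

    def __init__(self, n_audio_features=28, hidden_size=60, n_shape_coeffs=18):
        super().__init__()
        # Recurrent layer consumes the audio feature sequence frame by frame.
        self.lstm = nn.LSTM(n_audio_features, hidden_size, batch_first=True)
        # Linear readout predicts one mouth-shape vector per time step.
        self.readout = nn.Linear(hidden_size, n_shape_coeffs)

    def forward(self, audio_features):
        # audio_features: (batch, time, n_audio_features), e.g. MFCC frames
        hidden_states, _ = self.lstm(audio_features)
        return self.readout(hidden_states)  # (batch, time, n_shape_coeffs)

# Usage: predict mouth shapes for a 100-frame clip of stand-in audio features.
model = AudioToMouthShape()
audio = torch.randn(1, 100, 28)   # placeholder for real per-frame audio features
mouth_shapes = model(audio)       # one shape vector per audio frame
print(mouth_shapes.shape)         # torch.Size([1, 100, 18])
```

The later stages described in the abstract, synthesizing photorealistic mouth texture and compositing it into the target video with 3D pose matching, are graphics-heavy and well beyond a few-line sketch.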

[Embedded YouTube video]
