“Talking face generation aims to synthesize a sequence of face images that correspond to given speech semantics. However, when people talk, the subtle movements of their face region are usually a complex combination of the intrinsic face appearance of the subject and also the extrinsic speech to be delivered. Existing works either focus on the former, which constructs the specific face appearance model on a single subject; or the latter, which models the identity-agnostic transformation between lip motion and speech. In this work, we integrate both aspects and enable arbitrary-subject talking face generation by learning disentangled audio-visual representation.”
TLDR; It takes the words you say and turns them into pictures of your face saying them, and can also make the pictures look like different people, so you can pretend to be someone else!