With modern technology improving so fast, it seems we have no time left to stop and think about whether this progress is useful or simply toxic for our society. Recently, many people have expressed concern about the “deepfakes” surfacing across the Internet. Simply put, deepfakes make it possible to place one person’s face onto another person in a video, opening the door to fake content created without that person’s consent. Now the technology has improved even further, and it’s amazing yet scary at the same time.
More info: Egor Zakharov
Recently, developers at the Samsung AI lab in Moscow created a new tool that takes deepfakes to a whole new level
They have created what can be described as “living portraits”: all they need is a single picture, which they can turn into a fake video.
The technique relies on few-shot and one-shot learning, meaning a single image is enough to produce an animated portrait. In fact, if a few more shots are provided, the result is even more striking.
But what surprised everyone the most was the fact that this technology can easily animate the most iconic images
In their recent paper, researchers explained their goals and reasons why they are creating this technology: “Several recent works have shown how highly realistic human head images can be obtained by training convolutional neural networks to generate them. In order to create a personalized talking head model, these works require training on a large dataset of images of a single person. However, in many practical scenarios, such personalized talking head models need to be learned from a few image views of a person, potentially even a single image. Here, we present a system with such few-shot capability.”
“It performs lengthy meta-learning on a large dataset of videos, and after that is able to frame few- and one-shot learning of neural talking head models of previously unseen people as adversarial training problems with high capacity generators and discriminators.”
“Crucially, the system is able to initialize the parameters of both the generator and the discriminator in a person-specific way, so that training can be based on just a few images and done quickly, despite the need to tune tens of millions of parameters. We show that such an approach is able to learn highly realistic and personalized talking head models of new people and even portrait paintings.”
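The quoted passages describe the core recipe: meta-learn an initialization across a large dataset of identities, then fine-tune it quickly on just a few frames of an unseen person. Here is a minimal numpy sketch of that general idea only — the “generator” is a toy linear map from landmark vectors to frames, trained with plain MSE rather than the adversarial and perceptual losses the researchers use, and every name and dimension below is hypothetical, not taken from the actual Samsung system.

```python
import numpy as np

# Toy stand-in for meta-learning + few-shot fine-tuning (hypothetical).
rng = np.random.default_rng(0)

D_LMK, D_IMG = 4, 8                       # landmark / "image" dimensions
shared = rng.normal(size=(D_IMG, D_LMK))  # structure common to all faces

def person_map():
    # Each person's true generator = shared structure + personal detail.
    return shared + 0.3 * rng.normal(size=(D_IMG, D_LMK))

def make_frames(W, n):
    X = rng.normal(size=(n, D_LMK))       # driving landmark vectors
    return X, X @ W.T                     # (landmarks, target frames)

# "Meta-learning", crudely: average the generators of many identities.
meta_W = np.mean([person_map() for _ in range(50)], axis=0)

def finetune(W_init, X, Y, steps=200, lr=0.05):
    # Plain gradient descent on 0.5 * MSE reconstruction loss.
    W = W_init.copy()
    for _ in range(steps):
        W -= lr * (X @ W.T - Y).T @ X / len(X)
    return W

new_person = person_map()                       # identity never seen before
X_few, Y_few = make_frames(new_person, n=3)     # few-shot: 3 frames only

W_meta = finetune(meta_W, X_few, Y_few)                    # meta-learned init
W_scratch = finetune(np.zeros_like(meta_W), X_few, Y_few)  # cold start

X_test, Y_test = make_frames(new_person, n=100)
err_meta = np.mean((X_test @ W_meta.T - Y_test) ** 2)
err_scratch = np.mean((X_test @ W_scratch.T - Y_test) ** 2)
print(err_meta < err_scratch)   # the meta-learned start generalizes better
```

Three frames are far too few to pin down the new identity on their own; the meta-learned initialization supplies everything the few shots cannot, which is the same reason the real system can tune tens of millions of parameters from a handful of images.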
In the demonstration video, the researchers show that the more photos from different angles are provided, the more realistic the result becomes.
“Currently, the key limitations of our method are the mimics representation (in particular, the current set of landmarks does not represent the gaze in any way) and the lack of landmark adaptation. Using landmarks from a different person leads to a noticeable personality mismatch. So, if one wants to create “fake” puppeteering videos without such mismatch, some landmark adaptation is needed,” the developers added in their paper while describing this technology.
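The paper notes that landmark adaptation is needed to avoid this “personality mismatch” but does not spell out a method. As a rough illustration of what adaptation could mean, here is one naive, entirely hypothetical scheme in numpy: transfer only the driver’s landmark *motion* (offsets from a neutral frame) onto the target’s own face shape, rescaled to the target’s proportions, so the target keeps their identity instead of inheriting the driver’s.

```python
import numpy as np

def adapt_landmarks(driver, driver_ref, target_ref):
    """Map driver landmark motion onto the target's geometry (toy scheme).

    driver     : (N, 2) driver landmarks in the current frame
    driver_ref : (N, 2) driver landmarks in a neutral reference frame
    target_ref : (N, 2) target landmarks in a neutral reference frame
    """
    offsets = driver - driver_ref                 # the driver's motion
    # Crude global adaptation: rescale motion by the ratio of face sizes.
    scale = target_ref.std(axis=0) / driver_ref.std(axis=0)
    return target_ref + offsets * scale

# Sanity check: a driver at rest leaves the target's own landmarks intact.
driver_ref = np.array([[0.0, 0.0], [2.0, 0.0], [1.0, 1.5]])
target_ref = np.array([[0.0, 0.0], [3.0, 0.0], [1.5, 2.0]])
at_rest = adapt_landmarks(driver_ref, driver_ref, target_ref)
print(np.allclose(at_rest, target_ref))   # True
```

Feeding the raw driver landmarks straight through, by contrast, would reshape the portrait into the driver’s face geometry, which is exactly the mismatch the researchers describe.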