Google, together with the Weizmann Institute of Science and Tel Aviv University, has developed a new artificial intelligence model for creating videos from photos and text instructions.
It is called Lumiere, a clear homage to the Lumière brothers, the French siblings who invented the movie camera and the cinema projector.
From a technological standpoint, the novelty of Lumiere lies in the quality with which the software recreates the movement of subjects within a video.
Programs like Stable Diffusion often show imperfections because it is difficult to maintain consistency when animating static images.
To address this, Google developed an architecture called "Space-Time U-Net", a space-time network that generates the entire video in a single pass, without producing intermediate sequences that could be inconsistent with the actions that come before and after them.
The bulk of the work is done by the generative AI, which analyzes several possible movements and chooses the best one, drawing on the vast body of data available to Google's models, in order to return a plausible video.
For example, we can give Lumiere a photo of a stuffed animal and ask the program to make it walk from point A to point B. By creating a single space-time sequence, the AI generates a clip in which each movement flows smoothly into the next.
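The details of Google's implementation are not public, but the general idea of a "space-time" network can be sketched in a few lines of code: instead of producing frames one at a time, a single model compresses and then reconstructs the clip along its spatial and temporal axes together, emitting every frame in one pass. The snippet below is a toy illustration in Python using PyTorch; all layer names and sizes are assumptions, not Lumiere's actual architecture.

```python
# Toy sketch (not Google's code) of a "space-time" U-Net-style network:
# 3D convolutions downsample a video along height, width AND time,
# then transposed 3D convolutions restore the full clip, so all frames
# are produced jointly in a single forward pass.
import torch
import torch.nn as nn

class ToySpaceTimeNet(nn.Module):
    def __init__(self, channels=3, hidden=32):
        super().__init__()
        # Encoder: shrinks the spatial and temporal dimensions together.
        self.down = nn.Sequential(
            nn.Conv3d(channels, hidden, kernel_size=3, stride=2, padding=1),
            nn.SiLU(),
            nn.Conv3d(hidden, hidden * 2, kernel_size=3, stride=2, padding=1),
            nn.SiLU(),
        )
        # Decoder: restores the clip to its original frame count and size.
        self.up = nn.Sequential(
            nn.ConvTranspose3d(hidden * 2, hidden, kernel_size=4, stride=2, padding=1),
            nn.SiLU(),
            nn.ConvTranspose3d(hidden, channels, kernel_size=4, stride=2, padding=1),
        )

    def forward(self, video):
        # video: (batch, channels, frames, height, width)
        latent = self.down(video)   # compressed space-time representation
        return self.up(latent)      # entire clip emitted in one pass

# A 16-frame, 64x64 clip goes in; a clip of the same shape comes out.
clip = torch.randn(1, 3, 16, 64, 64)
out = ToySpaceTimeNet()(clip)
print(out.shape)  # torch.Size([1, 3, 16, 64, 64])
```

Because the whole clip passes through the network as one tensor, the model never has to stitch together separately generated segments, which is where frame-to-frame inconsistencies typically arise.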
Lumiere's video model was trained on a dataset of 30 million videos, along with their text captions.
For now, the software is not open to the public; it remains an experimental research project.