This is a major innovation in the field of artificial intelligence (AI).
OpenAI, the maker of ChatGPT and the DALL-E image generator, has unveiled a new tool called “Sora”, capable of creating realistic videos up to a minute long from a simple text prompt.
Building on earlier research behind the DALL-E and GPT models, the new platform is still being tested, the AI giant says on its site, where it has nonetheless published several sample videos and explained how they were made.
The program can generate videos “while maintaining visual quality and respecting user requests”, the Microsoft-backed Californian start-up said on its website.
Introducing Sora, our text-to-video model.
Sora can create videos of up to 60 seconds featuring highly detailed scenes, complex camera motion, and multiple characters with vibrant emotions.
https://t.co/7j2JN27M3W
Prompt: “Beautiful, snowy… pic.twitter.com/ruTEWn87vf
— OpenAI (@OpenAI) February 15, 2024
Sora can “generate complex scenes with several characters, specific types of movements and precise details”, the start-up says on its site.
According to the AI giant, it can also create a video from a still image or extend existing videos.
On the social network X, Sam Altman, the boss of OpenAI, invited users to suggest prompts for generating videos.
here is a better one: https://t.co/WJQCMEH9QG pic.twitter.com/oymtmHVmZN
— Sam Altman (@sama) February 15, 2024
He later posted the most relevant results on the platform: one showing two dogs frolicking in the snow in the mountains, another showing an imaginary animal, half duck, half dragon, flying in front of a magnificent sunset with a hamster in a sporty outfit on its back.
The safety issue
Sora serves as a foundation for “programs capable of understanding and simulating the real world”, explains the Californian start-up, which hopes that it “will constitute an important step in the realization of AGI”, or artificial general intelligence: a highly autonomous system that would outperform humans at most economically valuable tasks.
OpenAI warned that the platform's “current model” still had “flaws”, such as confusing left and right or failing to maintain visual continuity throughout a video.
“For example, a person may take a bite of a cookie, but afterwards, the cookie may not have any bite marks,” the company explains.
In unveiling the new tool, OpenAI said that safety was an essential issue. It plans to run tests in which users are challenged to make the tool malfunction or produce inappropriate content, in order to better define the platform's limits.
“We will engage policymakers, educators and artists around the world to understand their concerns and identify positive use cases for this new technology,” the company said.
Meta, Google and Runway AI, which are working on similar “text-to-video” applications, have already presented samples of their own.