OpenAI unveils Sora: a text-to-video model that generates videos from text prompts

Venkatesan
  • Sora can generate videos up to a minute long while maintaining visual quality: OpenAI
  • The model has a deep understanding of language: OpenAI
  • Sora can also create multiple shots within a single generated video: OpenAI

OpenAI, the artificial intelligence research organisation behind the large-language-model chatbot ChatGPT, has now entered the domain of visual media. “We’re teaching AI to understand and simulate the physical world in motion, with the goal of training models that help people solve problems that require real-world interaction. Introducing Sora, our text-to-video model. Sora can generate videos up to a minute long while maintaining visual quality and adherence to the user’s prompt. Sora is a diffusion model, which generates a video by starting off with one that looks like static noise and gradually transforms it by removing the noise over many steps,” OpenAI stated in a blog.
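The denoising process the blog describes can be sketched in miniature. The toy sampler below is purely illustrative and is not OpenAI's method: it stands in a hypothetical all-zero signal for the clean video a trained neural denoiser would actually predict, and simply blends the noise toward it over many steps.

```python
import random

def sample_clip(num_values=16, steps=50, seed=0):
    """Toy diffusion-style sampler: begin with static noise and
    gradually remove it over many steps, as the blog describes.

    A hypothetical all-zero signal stands in for the clean video a
    trained denoiser would predict; real models learn this prediction
    with a large neural network.
    """
    rng = random.Random(seed)
    # Start from pure "static noise".
    x = [rng.gauss(0.0, 1.0) for _ in range(num_values)]
    for step in range(steps):
        # Denoise more aggressively as sampling nears the end.
        blend = 1.0 / (steps - step)
        # Move each value a fraction of the way toward the clean signal.
        x = [v + blend * (0.0 - v) for v in x]
    return x

clip = sample_clip()
print(max(abs(v) for v in clip))  # residual noise is driven to ~0
```

Real video diffusion models run this loop over a huge grid of spatiotemporal values and condition the denoiser on the text prompt, but the shape of the computation is the same: noise in, many small refinement steps, video out.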

Although still at a developmental stage, Sora is being developed in collaboration with policymakers, artists and educators, with the aim of applying it to problems faced by the wider public. “Sora is able to generate complex scenes with multiple characters, specific types of motion, and accurate details of the subject and background. The model understands not only what the user has asked for in the prompt, but also how those things exist in the physical world. The model has a deep understanding of language, enabling it to accurately interpret prompts and generate compelling characters that express vibrant emotions. Sora can also create multiple shots within a single generated video that accurately persist characters and visual style,” the blog mentioned.
