OpenAI Says Its New AI Can Simulate Worlds

OpenAI made a huge splash this week with its text-to-photorealistic video AI called Sora. The company showed off some seriously impressive sample clips, from a couple walking through a snowy landscape to an airborne camera smoothly following a white vintage SUV as it makes its way up a dirt road.

It certainly appears to be a considerable leap for generative AI technology, and perhaps for domains far beyond video. In fact, OpenAI is already referring to Sora as a “world simulator,” capable of understanding important aspects of the three-dimensional world around us, whether it’s outputting a CGI-like scene of a digital landscape or a video of a woman walking down a neon-lit street at night.

“Our results suggest that scaling video generation models is a promising path towards building general purpose simulators of the physical world,” the company wrote.

“It learns about 3D geometry and consistency,” Sora research scientist Tim Brooks told Wired. “We didn’t bake that in; it just entirely emerged from seeing a lot of data.”

Broadly speaking, Sora is the natural evolution of a diffusion transformer model, a kind of model that so far has mostly been used to generate high-resolution images. In simple terms, diffusion models are trained by gradually adding noise to existing images and learning how to remove that noise step by step. To create a new image, the model starts from pure noise and progressively denoises it.

To train Sora, OpenAI fed it huge amounts of captioned videos to establish a connection between video footage and text input. Apart from generating entirely new footage from prompts, Sora can also extend existing…
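
For readers who want to see the noising idea in miniature, here is a toy Python sketch. It is not OpenAI's code, and Sora's actual implementation has not been published in this form. The function names (`make_alpha_bar`, `add_noise`, `remove_noise`) and the linear noise schedule values are assumptions chosen for illustration, and the "image" is just a short list of pixel values. In a real diffusion model, a trained neural network would predict the noise; here the true noise is handed back in to show that subtracting a correct prediction recovers the original.

```python
import math
import random


def make_alpha_bar(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Return, for each step, the fraction of the original signal variance kept.

    Assumes a linear noise schedule; the default values are illustrative.
    """
    alpha_bar, running = [], 1.0
    for i in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * i / (num_steps - 1)
        running *= 1.0 - beta
        alpha_bar.append(running)
    return alpha_bar


def add_noise(pixels, t, alpha_bar, rng):
    """Forward process: blend the pixels with Gaussian noise at step t."""
    noise = [rng.gauss(0.0, 1.0) for _ in pixels]
    keep = math.sqrt(alpha_bar[t])
    spread = math.sqrt(1.0 - alpha_bar[t])
    noisy = [keep * p + spread * n for p, n in zip(pixels, noise)]
    return noisy, noise


def remove_noise(noisy, predicted_noise, t, alpha_bar):
    """Invert the blend, given a prediction of the noise that was added."""
    keep = math.sqrt(alpha_bar[t])
    spread = math.sqrt(1.0 - alpha_bar[t])
    return [(x - spread * n) / keep for x, n in zip(noisy, predicted_noise)]


if __name__ == "__main__":
    rng = random.Random(0)
    alpha_bar = make_alpha_bar()
    pixels = [0.1, 0.5, 0.9]  # a stand-in "image"
    noisy, noise = add_noise(pixels, 500, alpha_bar, rng)
    # A trained model would estimate `noise`; we reuse the true noise here.
    print(remove_noise(noisy, noise, 500, alpha_bar))
```

By the final step of this schedule almost none of the original signal remains, which is why generation can begin from pure noise and work backward.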
