Shy Kids Explains Sora-powered AI-Generated Short’s Features and Constraints

OpenAI recently introduced Sora, a video generation tool that produces high-quality video from text descriptions and could change how films are made. A select group of filmmakers received early access, and their feedback so far has been promising.

Shy Kids, a digital production team based in Toronto, was one of a few teams picked by OpenAI to produce short films for promotional purposes; its contribution was the short “Air Head.” Despite the promotional context, Shy Kids had significant creative autonomy during production. In a conversation with the visual effects news outlet fxguide, post-production artist Patrick Cederberg described his experience of incorporating Sora into the team’s workflow.

Shy Kids shared their experience, emphasizing the tool’s transformative impact on their creative process. They highlighted several key benefits in their statements:

  • Efficiency: The ability to quickly generate video content from text descriptions significantly reduces production times.
  • Cost-Effectiveness: Sora minimizes the need for extensive production crews and equipment, thereby lowering the overall cost of filmmaking.
  • Innovative Storytelling: The tool enables the exploration of new storytelling techniques, as creators can easily produce and iterate on complex visual ideas.

Impact on the Filmmaking Process:

  • Accessibility: Sora makes the filmmaking process more accessible to a wider range of creators, including those with limited budgets or technical expertise.
  • Collaboration: It facilitates collaboration among different creative roles, as the initial video drafts can be easily shared and refined.
  • Future of Filmmaking: The tool represents a significant step towards democratizing film production, encouraging more diverse and innovative content.

Limitations of Sora:

With the features covered, here are the limitations of Sora and of AI video generation as a whole.

The key realization for many is that while OpenAI’s promotional materials might suggest that the short films emerged seamlessly from Sora, the truth is far more complex. These productions were crafted by professionals and involved comprehensive storyboarding, careful editing, color correction, and post-production work such as rotoscoping and visual effects. Like Apple’s “shot on iPhone” campaigns, which showcase the end result without revealing the behind-the-scenes effort, the Sora promotion focuses on the tool’s capabilities rather than the intricate process behind the final output.

In traditional filmmaking, simple tasks like selecting a character’s clothing color are straightforward. In a generative system like Sora, however, each shot is generated independently, so nothing guarantees that a detail stays the same from one shot to the next, and keeping it consistent requires elaborate workarounds and checks. While this dynamic may evolve over time, the current process remains labor-intensive.

Moreover, Sora’s outputs need vigilant checking for unwanted elements. Cederberg recounted instances where the model added unexpected details, such as a face on the character’s balloon head or a string hanging down the front. These anomalies had to be manually removed in post-production, adding to the time and effort invested in the project.

Sora’s initial shot versus final short film    
Image Credits: Shy Kids

Currently, precise timing and movements of characters or the camera aren’t really possible, Cederberg said: “There’s a little bit of temporal control about where these different actions happen in the actual generation, but it’s not precise … it’s kind of a shot in the dark.”

Timing a gesture, such as a wave, is an approximate, suggestion-driven process in Sora, in contrast to the precision of manual animation. Likewise, a request for a specific camera movement, like an upward pan on a character’s body, may produce results that diverge from the filmmaker’s intentions. Consequently, the team opted to render shots in portrait orientation and apply a crop pan during post-production. The generated clips also often featured inexplicable slow-motion effects.
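The crop-pan workaround can be illustrated with a small sketch. The Python below is not Shy Kids’ actual tooling; it only shows the general idea of sliding a landscape crop window along a taller portrait frame to imitate a vertical camera move. The function name `crop_pan_windows`, the 1600x2400 frame size, and the 16:9 output shape are all assumptions made for illustration.

```python
def crop_pan_windows(src_w, src_h, aspect_w, aspect_h, n_frames,
                     start=1.0, end=0.0):
    """Return one (x, y, w, h) crop rectangle per output frame.

    The crop window spans the full source width and has the requested
    aspect ratio. `start` and `end` are fractions of the available
    vertical travel: 0.0 places the window at the top of the source
    frame, 1.0 at the bottom. Moving from 1.0 to 0.0 reads as an
    upward camera move.
    """
    if n_frames < 1:
        raise ValueError("n_frames must be at least 1")

    crop_w = src_w
    crop_h = src_w * aspect_h // aspect_w  # integer height of the window
    if crop_h > src_h:
        raise ValueError("source frame is too short for this aspect ratio")

    travel = src_h - crop_h  # how far the window can slide vertically
    windows = []
    for i in range(n_frames):
        # t runs from 0.0 on the first frame to 1.0 on the last frame.
        t = i / (n_frames - 1) if n_frames > 1 else 0.0
        frac = start + (end - start) * t
        y = round(travel * frac)
        windows.append((0, y, crop_w, crop_h))
    return windows


# Hypothetical portrait frame, 16:9 output, four-frame move from bottom to top.
windows = crop_pan_windows(1600, 2400, 16, 9, n_frames=4)
for rect in windows:
    print(rect)
```

Because the move is computed in post-production rather than requested from the model, its speed and framing are fully under the editor’s control, which is the precision the team said the prompts alone could not provide.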

Common filmmaking terms like “panning right” or “tracking shot” produced inconsistent results in Sora, which caught the team off guard.

“The researchers, before they approached artists to play with the tool, hadn’t really been thinking like filmmakers,” he said.

Consequently, the team went through numerous generations, each lasting 10 to 20 seconds, and ultimately used only a fraction of them in the final production. Cederberg estimated the ratio at 300 generations to 1 final clip, a stark contrast to the typical ratio in conventional filmmaking.
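To give a sense of scale, the 300:1 estimate can be turned into rough footage totals. This is a back-of-the-envelope sketch, not a figure from Cederberg: it assumes that “10 to 20 seconds” refers to the length of each generated clip and that the 300:1 ratio applies to every clip that made the cut. The function name is hypothetical.

```python
def generated_seconds_per_final_clip(generations_per_clip, clip_seconds):
    """Total seconds of generated footage behind one clip that was used."""
    return generations_per_clip * clip_seconds


RATIO = 300  # Cederberg's estimate: 300 generations per final clip

# Assumed clip lengths at the low and high end of the quoted range.
low_seconds = generated_seconds_per_final_clip(RATIO, 10)
high_seconds = generated_seconds_per_final_clip(RATIO, 20)

print(f"{low_seconds / 60:.0f} to {high_seconds / 60:.0f} minutes "
      "of generated footage per final clip")
# prints "50 to 100 minutes of generated footage per final clip"
```

Under those assumptions, each clip in the finished short stands on roughly 50 to 100 minutes of generated material that had to be reviewed and mostly discarded.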

The team also created a behind-the-scenes video detailing the challenges they encountered during production. As is common with AI-related content, the comments were largely critical of the project, although less vehemently so than those on recent AI-assisted advertisements that faced significant backlash.

An intriguing aspect concerns copyright: Sora refuses to generate content resembling copyrighted material. It will not produce a “Star Wars” clip, even when the request is rephrased as a “robed man with a laser sword on a retro-futuristic spaceship.” It also refused to do an “Aronofsky type shot” or a “Hitchcock zoom.” These refusals suggest that Sora has some mechanism for recognizing potentially infringing content.

While this approach aligns with copyright laws, it raises questions about the model’s training data and the extent to which it was exposed to copyrighted material. OpenAI’s guarded approach to sharing training data, exemplified by CTO Mira Murati’s interview with Joanna Stern, adds to the mystery surrounding Sora’s capabilities and training process.

As for Sora’s role in filmmaking, it is undeniably a potent and valuable tool within its niche. However, it falls short of “creating films out of whole cloth,” at least for now. As a famous villain once remarked, “that comes later”: Sora holds immense potential, but its current capabilities are not yet on par with fully autonomous film production.
