OpenAI to Harness Reddit Data for AI Growth

OpenAI has entered into an agreement with Reddit to utilize the social news site’s data to train its AI models.

According to a blog post on OpenAI’s press relations site, this partnership will grant OpenAI access to Reddit’s “real-time, structured, and unique content,” such as posts and replies. This will enable OpenAI’s tools and models to “better understand and showcase” Reddit content. Reddit content will be integrated into ChatGPT, OpenAI’s widely used conversational AI, and the two companies will collaborate to introduce new “AI-powered features” for Reddit users and moderators.

OpenAI will also become a key advertising partner with Reddit, further solidifying their collaboration and opening up new avenues for innovative, AI-driven advertising solutions on the platform.

OpenAI wrote in the post, “Reddit will be building on OpenAI’s platform of AI models to bring its powerful vision to life. Using LLMs, ML, and AI allow Reddit to improve the user experience for everyone.”

OpenAI has several similar licensing deals with content providers ranging from stock media libraries to news publishers. However, this particular partnership stands out because Sam Altman, OpenAI’s CEO, holds an 8.7% stake in Reddit, making him its third-largest shareholder, and is a former member of the company’s board of directors.

Seemingly anticipating scrutiny, OpenAI clarified in its press release that while Sam Altman remains a Reddit shareholder, the partnership “was led by OpenAI’s COO Brad Lightcap” and “approved by OpenAI’s independent board of directors.” Notably, Altman is a member of OpenAI’s board but recused himself from this decision, according to an OpenAI spokesperson.

Reddit has placed data licensing agreements at the heart of its growth strategy as it navigates the market as a public company.

In its IPO prospectus, Reddit disclosed that it has secured data licensing contracts with major players like Google, totaling over $200 million. This approach has significantly boosted its financial performance: its first earnings report as a public company showed a 450% year-over-year surge in non-ad revenue, largely driven by these agreements.

The news of the partnership with OpenAI sent Reddit’s stock soaring, with an impressive 11% increase in extended trading.

During the company’s earnings call in March, Reddit CEO Steve Huffman said, “The paradox I see is that, as more content on the internet is written by machines, there’s an increasing premium on content that comes from real people. And we have nearly two decades of authentic conversation.”

Reddit’s platform, boasting over 1 billion posts and more than 16 billion comments, is a treasure trove for generative AI companies. These models thrive on examples of content, such as text and images, to create new, similar material. With hundreds of millions of active users continually contributing, the data pool grows daily.

However, Reddit might encounter resistance from users who are worried about how their data will be monetized.

It is instructive to consider the case of Stack Overflow, the Q&A forum for software developers, which recently signed an agreement with OpenAI to provide data for model training. In response, some users protested by deleting their top-rated answers. Stack Overflow reacted by restoring the deleted posts and banning those users, citing non-compliance with its terms of service.

Reddit has similarly disapproved of efforts to give its users more control over their data.

Vana, a blockchain-based startup, is pioneering a novel approach by establishing a data “DAO” (Decentralized Autonomous Organization) to enable Reddit users to manage their data and collectively determine its usage or sale. However, Reddit took a firm stance against Vana’s initiative, banning the startup’s subreddit dedicated to discussing the DAO. In a statement, Reddit accused Vana of “exploiting” its data export controls.

As OpenAI and Reddit partner to harness the power of data, the WorkBot offers a solution for organizations to connect, automate, and unlock the potential of their data while maintaining privacy and security at the highest level—a crucial consideration as AI innovation advances.

