Anthropic Looks to Fund Next-Gen AI Benchmarks

Anthropic introduces a funding program to support the creation of innovative benchmarks for assessing AI models’ performance and impact, including generative models like Claude.

Launched on Monday, the program will provide financial support to external organizations that can develop benchmarks to “effectively measure advanced capabilities in AI models,” as stated in Anthropic’s blog post. Applications will be accepted and reviewed on an ongoing basis.

Anthropic wrote on its official blog, “Our investment in these evaluations is intended to elevate the entire field of AI safety, providing valuable tools that benefit the whole ecosystem. Developing high-quality, safety-relevant evaluations remains challenging, and the demand is outpacing the supply.”

AI faces a significant benchmarking limitation. Current benchmarks fail to accurately reflect how people typically interact with AI systems. Moreover, some benchmarks, especially those developed before the advent of modern generative AI, may no longer measure what they were intended to measure.

Anthropic proposes a high-level solution: creating rigorous benchmarks focused on AI security and societal implications through innovative tools, infrastructure, and methods.

The company specifically calls for benchmarks that evaluate a model’s ability to perform tasks like simulating cyberattacks, “enhancing” weapons of mass destruction (such as nuclear weapons), and manipulating or deceiving individuals (through deepfakes or misinformation). For AI risks related to national security and defense, Anthropic commits to developing an “early warning system” to identify and assess potential risks, though the blog post does not reveal the specifics of such a system.

Anthropic also intends to utilize its new program to support research into benchmarks and “end-to-end” tasks that explore AI’s potential for aiding scientific discovery, conversing in multiple languages, mitigating deeply ingrained biases, and self-censoring harmful content.

To achieve this, Anthropic envisions developing new platforms that enable subject-matter experts to design their own evaluations and large-scale trials of AI models involving “thousands” of users. The company has hired a dedicated program coordinator and may acquire or expand projects showing scalability potential.

In the post, Anthropic writes, “We offer a range of funding options tailored to the needs and stage of each project. Teams will have the opportunity to interact directly with Anthropic’s domain experts from the frontier red team, fine-tuning, trust and safety, and other relevant teams.” An Anthropic spokesperson declined to provide any further details about those funding options.

Anthropic’s initiative to support new AI benchmarks is commendable – assuming it is backed by sufficient resources and manpower. However, given the company’s commercial interests in the AI industry, its motives may be viewed with skepticism.

In the blog post, Anthropic is transparent about its intention to align funded evaluations with its AI safety classifications, which were developed with input from third parties like METR. While this is within the company’s rights, it may require applicants to adopt definitions of “safe” or “risky” AI that they may not entirely agree with.

Some members of the AI community may also object to Anthropic’s references to “catastrophic” and “deceptive” AI risks, such as nuclear weapons risks. Many experts argue that there is little evidence to suggest AI will become capable of surpassing human intelligence or posing existential risks anytime soon. They believe that claims of imminent “superintelligence” distract from pressing AI regulatory issues, like AI’s tendency to hallucinate.

Anthropic hopes its program will “catalyze progress towards comprehensive AI evaluation becoming an industry standard.” This mission aligns with open, corporate-unaffiliated efforts to create better AI benchmarks. However, whether these efforts will collaborate with an AI vendor that is ultimately accountable to its shareholders remains to be seen.