Cyber, EigenLayer, Sentient, RootData and others jointly launched the Crypto AI Benchmark Alliance, setting a new benchmark for Crypto AI

2025-06-04 10:47:27


By testing models on real tasks, CAIBA establishes a unified, reproducible measurement standard for crypto AI, helping the industry build more trustworthy intelligent applications.

Cyber, EigenLayer, Sentient, and 11 other blockchain and artificial intelligence projects today jointly announced the establishment of the Crypto AI Benchmark Alliance (CAIBA). This open-source, community-driven alliance will focus on creating transparent and trustworthy evaluation standards for AI models and agents in the crypto industry.

The initial founding members (Alchemy, Cyber, Dune, EigenLayer, Goldsky, IOSG, LazAI, Magic Newton, Metis, MyShell, OpenGradient, RootData, Sentient, and Thirdweb) will contribute datasets, tools, and expertise to build the evaluation framework. Each benchmark will include tasks, reference answers, and scoring scripts, and will be published on platforms such as GitHub and Hugging Face under open licenses (where permitted).

As the application of AI in the crypto field continues to expand, covering everything from trading strategies to research assistants, traditional AI benchmarks have struggled to reflect the unique needs of the industry. CAIBA aims to fill this gap by launching specialized evaluations for crypto scenarios.

"Transparent and rigorous testing is crucial," said Ryan Li, co-founder of Cyber. "Models must not only answer questions correctly but also execute reliably, giving users more confidence in their decision-making."

The alliance's first release, the Benchmark for Crypto AI Agents (CAIA), is now live and measures AI capabilities across three dimensions:

Knowledge: Accurately answering questions about protocols, tokens, etc.
Planning: Developing multi-step task plans.
Action: Performing operations using blockchain explorers and APIs.
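
To make the structure concrete, here is a minimal sketch of how a benchmark built from tasks, reference answers, and a scoring script might report one score per dimension. It is not CAIA's actual format or scoring method: the `Task` class, the `exact_match` scorer, the `score_by_dimension` helper, and the sample tasks are all hypothetical, and real planning or action tasks would likely need richer scoring than exact string matching.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Task:
    """One hypothetical benchmark item (not CAIA's real schema)."""
    dimension: str  # "knowledge", "planning", or "action"
    prompt: str
    reference: str  # reference answer used by the scoring script


def exact_match(prediction: str, reference: str) -> float:
    """Return 1.0 if the answers match after trimming and lowercasing, else 0.0."""
    return float(prediction.strip().lower() == reference.strip().lower())


def score_by_dimension(tasks, predictions):
    """Average the per-task scores within each dimension."""
    scores = defaultdict(list)
    for task, prediction in zip(tasks, predictions):
        scores[task.dimension].append(exact_match(prediction, task.reference))
    return {dim: sum(vals) / len(vals) for dim, vals in scores.items()}


# Illustrative tasks and model outputs, invented for this sketch.
tasks = [
    Task("knowledge", "Which consensus mechanism does Ethereum use today?", "Proof of stake"),
    Task("knowledge", "What is the native token of the Ethereum network?", "ETH"),
    Task("planning", "List the steps to check a wallet balance.", "fetch balance; compare; report"),
    Task("action", "Return the status field of a stubbed explorer response.", "success"),
]
predictions = ["proof of stake ", "BTC", "fetch balance; compare; report", "failed"]

print(score_by_dimension(tasks, predictions))
# → {'knowledge': 0.5, 'planning': 1.0, 'action': 0.0}
```

Reporting a separate score for each dimension, rather than a single average, shows whether a model that answers questions well also plans and executes reliably.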

CAIA covers scenarios such as token economics, on-chain analysis, project research, and trading workflows. The models evaluated include general-purpose large models such as GPT-4o, Claude 4, Gemini 2.5, and DeepSeek-R1, as well as several crypto-native models.

By testing models on real tasks, CAIBA establishes a unified and reproducible measurement standard for crypto AI, helping the industry build more trustworthy intelligent applications. The alliance is developing more benchmarks and welcomes new members to join. Developers, researchers, and protocol teams can submit models for evaluation or propose new tasks.

About Crypto AI Benchmark Alliance (CAIBA)

The Crypto AI Benchmark Alliance is a community-governed open alliance focused on establishing AI evaluation standards for crypto scenarios. Through open datasets, reproducible tasks, and public leaderboards, CAIBA provides tools for developers, researchers, and protocols to measure and improve AI systems in blockchain applications. For more details, please visit caiba.ai.
