IOSG Weekly Brief|Imagining the Robotics Industry: The Evolution of Automation, Artificial Intelligence, and Web3 Integration

2025-11-18 16:15:45
The Dawn of Embodied Intelligence: Web3 Seeks Entry Points for the Reconstruction of Production Relations

Author|Jacob Zhao @IOSG

Robot Panorama: From Industrial Automation to Humanoid Intelligence

The traditional robotics industry chain has formed a complete bottom-up hierarchy covering four links: core components → control systems → complete-machine manufacturing → application integration. Core components (controllers, servo motors, reducers, sensors, batteries, etc.) carry the highest technical barriers and determine the performance and cost ceiling of the finished machine; the control system is the robot's "brain and cerebellum," responsible for decision-making and planning as well as motion control; complete-machine manufacturing reflects supply-chain integration capability; and system integration and application determine the depth of commercialization, which is becoming the new center of value.

According to application scenarios and forms, global robotics is evolving along the path of "industrial automation → scene intelligence → general intelligence," forming five major types: industrial robots, mobile robots, service robots, special robots, and humanoid robots.

Industrial Robots: Currently the only fully mature track, widely used in manufacturing processes such as welding, assembly, painting, and handling. The industry has formed a standardized supply-chain system with stable gross margins and clear ROI. Within it, the collaborative robot (cobot) subclass, which emphasizes human-machine collaboration and is lightweight and easy to deploy, is growing the fastest.

Representative companies: ABB, Fanuc, Yaskawa, KUKA, Universal Robots, Jaka, Aubo.

Mobile Robots: Including AGVs (Automated Guided Vehicles) and AMRs (Autonomous Mobile Robots), widely deployed in logistics warehousing, e-commerce fulfillment, and in-factory transport, making this the most mature category for B-end applications.

Representative companies: Amazon Robotics, Geek+, Quicktron, Locus Robotics.

Service Robots: Targeting industries such as cleaning, catering, hospitality, and education, this is the fastest-growing area on the consumer side. Cleaning products have entered the realm of consumer electronics, and medical and commercial delivery are accelerating toward commercialization. In addition, a number of more general-purpose manipulation robots are emerging (such as Dyna's dual-arm system): more flexible than task-specific products, but not yet matching the generality of humanoid robots.

Representative companies: Ecovacs, Roborock, PuduTech, Qianlong Intelligent, iRobot, Dyna, etc.

Special Robots: Mainly serving medical, military, construction, marine, and aerospace scenarios. The market size is limited, but margins are high and barriers are strong; the segment relies largely on government and enterprise orders and is in a vertically segmented growth stage. Representative players include Intuitive Surgical, Boston Dynamics, ANYbotics, and NASA's Valkyrie.

Humanoid Robots: Regarded as the future "general labor platform."

Representative companies: Tesla (Optimus), Figure AI (Figure 01), Sanctuary AI (Phoenix), Agility Robotics (Digit), Apptronik (Apollo), 1X Robotics, Neura Robotics, Unitree, UBTECH, Zhiyuan Robotics, etc.

Humanoid robots are currently the most focused frontier direction, with their core value lying in adapting to existing social spaces with a humanoid structure, seen as a key form toward the "general labor platform." Unlike industrial robots that pursue extreme efficiency, humanoid robots emphasize general adaptability and task transferability, capable of entering factories, homes, and public spaces without modifying the environment.

Currently, most humanoid robots remain in the technical demonstration stage, mainly validating dynamic balance, walking, and operational capabilities. Although some projects have begun small-scale deployment in highly controlled factory scenarios (such as Figure × BMW, Agility Digit), and more manufacturers (such as 1X) are expected to enter early distribution starting in 2026, these are still limited applications of "narrow scenarios and single tasks," rather than true general labor deployment. Overall, it will take several more years to achieve scaled commercialization. Core bottlenecks include: control challenges such as multi-degree-of-freedom coordination and real-time dynamic balance; energy consumption and endurance issues constrained by battery energy density and drive efficiency; perception-decision links that are prone to instability and difficult to generalize in open environments; significant data gaps (difficult to support general strategy training); cross-body transfer yet to be conquered; and hardware supply chain and cost curves (especially outside of China) still pose real barriers, making the realization of large-scale, low-cost deployment even more challenging.

The future commercialization path is expected to go through three stages: short-term dominated by Demo-as-a-Service, relying on pilot projects and subsidies; mid-term evolving into Robotics-as-a-Service (RaaS), building a task and skill ecosystem; and long-term focusing on labor clouds and intelligent subscription services, shifting the value center from hardware manufacturing to software and service networks. Overall, humanoid robots are in a critical transition period from demonstration to self-learning, and whether they can overcome the triple barriers of control, cost, and algorithms will determine if they can truly achieve embodied intelligence.

AI × Robotics: The Dawn of the Era of Embodied Intelligence

Traditional automation mainly relies on pre-programming and assembly line control (such as the perception-planning-control DSOP architecture), which can only operate reliably in structured environments. However, the real world is much more complex and variable. The new generation of embodied intelligence (Embodied AI) follows a different paradigm: through large models and unified representation learning, enabling robots to possess cross-scenario "understanding-prediction-action" capabilities. Embodied intelligence emphasizes the dynamic coupling of body (hardware) + brain (model) + environment (interaction), where the robot is the carrier, and intelligence is the core.

Generative AI is the intelligence of the language world, excelling at understanding symbols and semantics; embodied intelligence is the intelligence of the real world, mastering perception and action. The two correspond to "brain" and "body," representing two parallel main lines of AI evolution. In the intelligence hierarchy, embodied intelligence sits above generative AI, but its maturity still lags significantly. LLMs rely on vast amounts of internet data and have formed a clear "data → computing power → deployment" closed loop, whereas robotic intelligence requires first-person, multi-modal, action-bound data (remote-control trajectories, first-person video, spatial maps, operation sequences) that does not naturally exist and must be generated through real interaction or high-fidelity simulation, making it scarcer and more expensive. Simulation and synthetic data help, but they cannot replace real sensorimotor experience, which is why Tesla, Figure, and others must build their own teleoperation data factories, and why third-party data-annotation operations have emerged in Southeast Asia. In short: LLMs learn from ready-made data, while robots must "create" data through interaction with the physical world. Over the next 5-10 years, the two will deeply integrate in Vision-Language-Action models and Embodied Agent architectures: LLMs will handle high-level cognition and planning while robots execute in the real world, forming a bidirectional closed loop of data and action and jointly pushing AI from "language intelligence" toward true general intelligence (AGI).

The core technology system of embodied intelligence can be seen as a bottom-up intelligence stack: VLA (perception fusion), RL/IL/SSL (intelligent learning), Sim2Real (reality transfer), World Model (cognitive modeling), and multi-agent collaboration and memory reasoning (Swarm & Reasoning). Among them, VLA and RL/IL/SSL are the "engines" of embodied intelligence, determining its landing and commercialization; Sim2Real and World Model are key technologies connecting virtual training and real execution; multi-agent collaboration and memory reasoning represent higher-level collective and metacognitive evolution.

Perception Understanding: Vision-Language-Action Model

The VLA model integrates three channels: Vision --- Language --- Action, enabling robots to understand intentions from human language and translate them into specific operational behaviors. Its execution process includes semantic parsing, target recognition (locating target objects from visual input), and path planning and action execution, thus achieving a closed loop of "understanding semantics --- perceiving the world --- completing tasks," which is one of the key breakthroughs in embodied intelligence. Current representative projects include Google RT-X, Meta Ego-Exo, and Figure Helix, showcasing cutting-edge directions such as cross-modal understanding, immersive perception, and language-driven control.
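The "semantic parsing → target recognition → path planning → execution" loop described above can be sketched in a few lines of toy Python. Everything here (the function names, the symbolic scene, the grid world) is our own illustrative assumption, not any project's actual API; a real VLA model replaces each hand-coded step with a learned component.

```python
# Toy sketch of a VLA-style loop: semantic parsing -> target
# recognition -> path planning -> action execution.

def parse_instruction(text):
    """Extract a target object from a natural-language command (toy)."""
    for obj in ("cup", "box", "ball"):
        if obj in text.lower():
            return obj
    return None

def locate(obj, scene):
    """'Perception': look up the object's cell in a symbolic scene."""
    return scene.get(obj)

def plan_path(start, goal):
    """Greedy Manhattan path on a grid, one axis at a time."""
    path, (x, y) = [start], start
    while (x, y) != goal:
        if x != goal[0]:
            x += 1 if goal[0] > x else -1
        else:
            y += 1 if goal[1] > y else -1
        path.append((x, y))
    return path

def run(instruction, scene, start=(0, 0)):
    target = parse_instruction(instruction)
    goal = locate(target, scene)
    if goal is None:
        return None  # semantic ambiguity: no grounded target
    return plan_path(start, goal)

path = run("pick up the cup", {"cup": (2, 1)})
```

Note how the failure mode in the bottleneck list below ("semantic ambiguity") shows up even in this toy: an instruction that grounds to no object yields no plan at all.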

Currently, VLA is still in its early stages, facing four core bottlenecks:

  1. Semantic ambiguity and weak task generalization: models struggle to understand vague, open-ended instructions;

  2. Unstable alignment between vision and action: perceptual errors are amplified in path planning and execution;

  3. Scarcity and lack of unified standards for multi-modal data: high costs of collection and annotation make it difficult to form a scalable data flywheel;

  4. Challenges of long-duration tasks in both time and space: overly long task spans lead to insufficient planning and memory capabilities, while overly large spatial ranges require models to reason about things "beyond the field of view," and current VLA lacks stable world models and cross-space reasoning capabilities.

These issues collectively limit the cross-scenario generalization ability and scalable landing process of VLA.

Intelligent Learning: Self-Supervised Learning (SSL), Imitation Learning (IL), and Reinforcement Learning (RL)

  • Self-Supervised Learning (SSL): Automatically extracts semantic features from perceptual data, allowing robots to "understand the world." It is akin to teaching machines to observe and represent.

  • Imitation Learning (IL): Quickly masters basic skills by mimicking human demonstrations or expert examples. It is akin to teaching machines to act like humans.

  • Reinforcement Learning (RL): Through a "reward-punishment" mechanism, robots optimize action strategies through continuous trial and error. It is akin to teaching machines to grow through trial and error.

In embodied intelligence, self-supervised learning (SSL) aims to enable robots to predict state changes and physical laws through perceptual data, thereby understanding the causal structure of the world; reinforcement learning (RL) is the core engine of intelligence formation, driving robots to master complex behaviors such as walking, grasping, and obstacle avoidance through interaction with the environment and trial-and-error optimization based on reward signals; imitation learning (IL) accelerates this process through human demonstrations, allowing robots to quickly acquire action priors. The current mainstream direction is to combine all three to build a hierarchical learning framework: SSL provides the representation foundation, IL imparts human priors, and RL drives strategy optimization, balancing efficiency and stability, together forming the core mechanism of embodied intelligence from understanding to action.
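As a concrete (and deliberately toy) illustration of this hierarchical recipe, the sketch below pretrains a normalizing representation from unlabeled states ("SSL"), clones a policy from demonstrations ("IL"), and refines it with Q-learning ("RL") on a five-state walk-to-goal task. The environment, names, and hyperparameters are our own assumptions.

```python
import random

GOAL = 4  # states 0..4 on a line; actions: -1 (left) or +1 (right)

def ssl_encode(states):
    """'SSL' stage (toy): learn a normalizing representation from
    unlabeled state visits -- here just min/max bounds."""
    lo, hi = min(states), max(states)
    return lambda s: (s - lo) / (hi - lo)

def il_init(demos):
    """'IL' stage: behavior cloning -- majority action per state."""
    by_state = {}
    for s, a in demos:
        by_state.setdefault(s, []).append(a)
    return {s: max(set(acts), key=acts.count) for s, acts in by_state.items()}

def rl_refine(prior, episodes=200, seed=0):
    """'RL' stage: epsilon-greedy Q-learning; the imitation prior
    biases the initial value estimates."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(5) for a in (-1, 1)}
    for s, a in prior.items():
        q[(s, a)] = 1.0
    for _ in range(episodes):
        s = 0
        for _ in range(10):
            if rng.random() < 0.2:
                a = rng.choice((-1, 1))            # explore
            else:
                a = max((-1, 1), key=lambda x: q[(s, x)])  # exploit
            s2 = min(4, max(0, s + a))
            r = 1.0 if s2 == GOAL else 0.0
            q[(s, a)] += 0.5 * (r + 0.9 * max(q[(s2, -1)], q[(s2, 1)]) - q[(s, a)])
            s = s2
            if s == GOAL:
                break
    return {s: max((-1, 1), key=lambda a: q[(s, a)]) for s in range(5)}

encode = ssl_encode([0, 1, 2, 3, 4])
policy = rl_refine(il_init([(0, 1), (1, 1), (2, 1)]))
```

The division of labor mirrors the text: SSL supplies the representation, IL supplies a warm start, and RL's trial-and-error corrects and extends it to states the demonstrations never covered.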

Reality Transfer: Sim2Real — Bridging Simulation to Reality

Sim2Real (Simulation to Reality) allows robots to complete training in virtual environments and then transfer to the real world. It generates large-scale interactive data through high-fidelity simulation environments (such as NVIDIA Isaac Sim & Omniverse, DeepMind MuJoCo), significantly reducing training costs and hardware wear. The core lies in narrowing the "simulation-reality gap," with main methods including:

  • Domain Randomization: Randomly adjusting parameters such as lighting, friction, and noise in simulations to enhance model generalization;

  • Physical Consistency Calibration: Using real sensor data to calibrate the simulation engine, enhancing physical realism;

  • Adaptive Fine-tuning: Rapid retraining in real environments for stable transfer.
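The first of these methods, domain randomization, is easy to make concrete: resample simulator parameters at the start of every training episode so a policy never overfits one physics configuration. The parameter names and ranges below are illustrative assumptions, not any simulator's actual settings.

```python
import random

def randomized_sim_params(rng):
    """Sample a fresh simulator configuration for one episode (toy ranges)."""
    return {
        "friction": rng.uniform(0.4, 1.0),     # ground contact friction
        "light": rng.uniform(0.2, 1.0),        # lighting intensity
        "sensor_noise": rng.uniform(0.0, 0.05) # additive observation noise
    }

def training_draws(n_episodes, seed=0):
    """The sequence of randomized environments a policy would train under."""
    rng = random.Random(seed)
    return [randomized_sim_params(rng) for _ in range(n_episodes)]

episodes = training_draws(100)
```

Because no two episodes share the exact same physics, the real world becomes "just another draw" from the training distribution, which is the intuition behind the generalization gain.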

Sim2Real is the central link for the landing of embodied intelligence, enabling AI models to learn the closed loop of "perception-decision-control" in a safe, low-cost virtual world. Sim2Real has matured in simulation training (such as NVIDIA Isaac Sim, MuJoCo), but reality transfer is still limited by the Reality Gap, high computing power and annotation costs, and insufficient generalization and safety in open environments. Nevertheless, Simulation-as-a-Service (SimaaS) is becoming the lightest yet most strategically valuable infrastructure in the era of embodied intelligence, with business models including platform subscriptions (PaaS), data generation (DaaS), and security verification (VaaS).

Cognitive Modeling: World Model — The "Inner World" of Robots

World Model is the "inner brain" of embodied intelligence, allowing robots to internally simulate environments and action consequences, achieving prediction and reasoning. By learning the dynamic laws of the environment, it constructs predictable internal representations, enabling agents to "rehearse" outcomes before execution, evolving from passive executors to active reasoners. Representative projects include DeepMind Dreamer, Google Gemini + RT-2, Tesla FSD V12, NVIDIA WorldSim, etc. Typical technical paths include:

  • Latent Dynamics Modeling: Compressing high-dimensional perception into latent state space;

  • Imagination-based Planning: Virtually trial-and-error and path prediction within the model;

  • Model-based RL: Using world models to replace real environments, reducing training costs.

World Model is at the theoretical frontier of embodied intelligence, representing the core path for robots to evolve from "reactive" to "predictive" intelligence, but still faces challenges such as modeling complexity, unstable long-term predictions, and lack of unified standards.
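A minimal sketch of imagination-based planning, under an assumed (hand-coded rather than learned) dynamics model: roll out every candidate action sequence inside the model, score the imagined return, and execute only the first action of the best trajectory. Names and the 1-D latent space are illustrative.

```python
from itertools import product

def imagined_step(state, action):
    """Toy 'learned' dynamics: next latent state plus predicted reward.
    A real world model learns this mapping from data."""
    nxt = state + action
    reward = -abs(3 - nxt)   # imagined reward: closeness to goal state 3
    return nxt, reward

def plan_by_imagination(state, horizon=3, actions=(-1, 0, 1)):
    """Exhaustive 'imagination' rollout: try every action sequence in the
    model and return the first action of the best imagined trajectory."""
    best_first, best_return = None, float("-inf")
    for seq in product(actions, repeat=horizon):
        s, ret = state, 0.0
        for a in seq:
            s, r = imagined_step(s, a)
            ret += r
        if ret > best_return:
            best_first, best_return = seq[0], ret
    return best_first

first_action = plan_by_imagination(0)
```

This is exactly the "rehearse before execute" idea in the paragraph above; model-based RL replaces the exhaustive search with learned policies and the hand-coded dynamics with a latent model trained from experience.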

Collective Intelligence and Memory Reasoning: From Individual Action to Collaborative Cognition

Multi-Agent Systems and Memory & Reasoning represent two important directions in the evolution of embodied intelligence from "individual intelligence" to "collective intelligence" and "cognitive intelligence." Together, they support the collaborative learning and long-term adaptability of intelligent systems.

Multi-Agent Collaboration (Swarm / Cooperative RL): Refers to multiple agents achieving collaborative decision-making and task allocation through distributed or cooperative reinforcement learning in a shared environment. This direction has a solid research foundation, for example, the OpenAI Hide-and-Seek experiment demonstrated spontaneous cooperation and strategy emergence among multiple agents, while DeepMind's QMIX and MADDPG algorithms provide a cooperative framework for centralized training and decentralized execution. Such methods have been validated in scenarios such as warehouse robot scheduling, inspection, and swarm control.

Memory and Reasoning: Focuses on enabling agents to possess long-term memory, contextual understanding, and causal reasoning capabilities, which are key directions for achieving cross-task transfer and self-planning. Typical research includes DeepMind Gato (a multi-task agent unifying perception-language-control) and the DeepMind Dreamer series (imagination-based planning based on world models), as well as open-ended embodied agents like Voyager, which achieve continuous learning through external memory and self-evolution. These systems lay the foundation for robots to have the ability to "remember the past and predict the future."

Global Landscape of the Embodied Intelligence Industry: Cooperation and Competition Coexist

The global robotics industry is currently in a period of "cooperation dominance and deepening competition." China's supply chain efficiency, the U.S.'s AI capabilities, Japan's precision in components, and Europe's industrial standards collectively shape the long-term landscape of the global robotics industry.

  • The U.S. maintains a lead in cutting-edge AI models and software (DeepMind, OpenAI, NVIDIA), but this advantage has not extended to robotic hardware. Chinese manufacturers have advantages in iteration speed and real-world performance. The U.S. is promoting industrial return through the CHIPS Act and the Inflation Reduction Act.

  • China has formed a leading advantage in components, automated factories, and humanoid robots through large-scale manufacturing, vertical integration, and policy-driven initiatives, with outstanding hardware and supply chain capabilities. Companies like Unitree and UBTECH have achieved mass production and are extending toward intelligent decision-making layers. However, there remains a significant gap with the U.S. in algorithm and simulation training layers.

  • Japan has long monopolized high-precision components and motion control technologies, with a robust industrial system, but the integration of AI models is still in the early stages, and the pace of innovation is relatively stable.

  • South Korea stands out in the popularization of consumer-grade robots, led by companies like LG and NAVER Labs, and has a mature and strong service robot ecosystem.

  • Europe has a well-established engineering system and safety standards, with companies like 1X Robotics remaining active in R&D, but some manufacturing processes have migrated, and the focus of innovation has shifted toward collaboration and standardization.

Robotics × AI × Web3: Narrative Vision and Realistic Pathways

By 2025, a new narrative of integration between Web3, robotics, and AI is emerging. Although Web3 is seen as the underlying protocol for a decentralized machine economy, its combined value and feasibility at different levels still show significant differentiation:

  • In hardware manufacturing and service layers, capital-intensive and weak data closed loops mean that Web3 can currently only play a supportive role in marginal areas such as supply chain finance or equipment leasing;

  • The fit at the simulation and software ecosystem layer is higher, as simulation data and training tasks can be on-chain for rights verification, and agents and skill modules can also be tokenized through NFTs or Agent Tokens;

  • At the platform layer, decentralized labor and collaboration networks are showing the greatest potential — Web3 can gradually build a trustworthy "machine labor market" through integrated mechanisms of identity, incentives, and governance, laying the institutional groundwork for the future machine economy.

From a long-term vision perspective, the collaboration and platform layers are the most valuable directions for the integration of Web3 with robotics and AI. As robots gradually acquire perception, language, and learning capabilities, they are evolving into intelligent entities capable of autonomous decision-making, collaboration, and creating economic value. For these "intelligent laborers" to truly participate in the economic system, they still need to overcome four core thresholds of identity, trust, incentives, and governance.

  • At the identity layer, machines need to have verifiable and traceable digital identities. Through Machine DID, each robot, sensor, or drone can generate a unique verifiable "ID card" on-chain, binding its ownership, behavior records, and permission scope, enabling secure interactions and responsibility delineation.

  • At the trust layer, the key is to make "machine labor" verifiable, measurable, and priceable. By leveraging smart contracts, oracles, and auditing mechanisms, combined with physical work proofs (PoPW), trusted execution environments (TEE), and zero-knowledge proofs (ZKP), the authenticity and traceability of the task execution process can be ensured, giving economic accounting value to machine behavior.

  • At the incentive layer, Web3 achieves automatic settlement and value transfer between machines through token incentive systems, account abstraction, and state channels. Robots can complete computing power leasing and data sharing through micropayments, with staking and penalty mechanisms ensuring task fulfillment; with the help of smart contracts and oracles, a decentralized "machine collaboration market" can also be formed without human scheduling.

  • At the governance layer, when machines possess long-term autonomy, Web3 provides a transparent and programmable governance framework: DAO governance for jointly deciding system parameters, and multi-signature and reputation mechanisms to maintain safety and order. In the long run, this will push the machine society toward the "algorithmic governance" stage — humans set goals and boundaries, while machines maintain incentives and balance through contracts.
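The identity, trust, and incentive layers above can be made concrete with a toy sketch: a machine DID record, and an escrow-style settlement that pays out when a work proof verifies or slashes the bonded stake when it does not. All structures, field names, and numbers are our own illustration, not any chain's actual schema.

```python
from dataclasses import dataclass

@dataclass
class MachineDID:
    did: str            # e.g. "did:machine:robot-001" (hypothetical format)
    owner: str          # bound ownership record
    permissions: tuple  # permitted task types
    reputation: float = 1.0

@dataclass
class Task:
    task_id: str
    reward: float
    stake: float        # worker's bonded stake, slashable on failure

def settle(task, worker, proof_valid):
    """Escrow settlement: pay reward plus returned stake if the work
    proof verifies; otherwise slash the stake and dent reputation."""
    if proof_valid:
        worker.reputation = min(2.0, worker.reputation + 0.1)
        return task.reward + task.stake
    worker.reputation = max(0.0, worker.reputation - 0.5)
    return 0.0

bot = MachineDID("did:machine:robot-001", "0xAlice", ("navigate", "grasp"))
payout = settle(Task("t1", reward=10.0, stake=2.0), bot, proof_valid=True)
```

In a real deployment the `proof_valid` flag would come from the verification stack named above (PoPW, TEE attestations, or a ZKP), and settlement would execute in a smart contract rather than a local function.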

The ultimate vision of the integration of Web3 and robotics: a real-world evaluation network — a "real-world reasoning engine" composed of distributed robots, continuously testing and benchmarking model capabilities in diverse and complex physical scenarios; and a robot labor market — robots executing verifiable real-world tasks globally, earning income through on-chain settlements, and reinvesting value into computing power or hardware upgrades.

From a realistic pathway perspective, the combination of embodied intelligence and Web3 is still in the early exploration phase, with decentralized machine intelligence economies remaining more at the narrative and community-driven level. The directions with feasible potential in reality mainly reflect in the following three aspects:

(1) Data crowdsourcing and rights verification — Web3 encourages contributors to upload real-world data through on-chain incentives and traceability mechanisms;
(2) Global long-tail participation — cross-border micropayments and micro-incentive mechanisms effectively reduce data collection and distribution costs;
(3) Financialization and collaborative innovation — DAO models can promote the tokenization of robot assets, yield certificates, and settlement mechanisms between machines.

Overall, the short term is mainly focused on data collection and incentive layers; the mid-term is expected to achieve breakthroughs in "stablecoin payments + long-tail data aggregation" and RaaS assetization and settlement layers; in the long term, if humanoid robots achieve large-scale popularization, Web3 may become the institutional foundation for machine ownership, income distribution, and governance, promoting the formation of a truly decentralized machine economy.

Web3 Robotics Ecosystem Map and Selected Cases

Based on three criteria (verifiable progress, technological openness, and industry relevance), we have surveyed the current representative Web3 × Robotics projects and grouped them into five layers: Model Intelligence Layer, Machine Economy Layer, Data Collection Layer, Perception and Simulation Infrastructure Layer, and Robot Asset Yield Layer. To maintain objectivity, we have excluded projects that are clearly bandwagon plays or lack sufficient public information; if there are omissions, please feel free to correct us.

Model Intelligence Layer

OpenMind - Building Android for Robots (https://openmind.org/)

OpenMind is an open-source operating system (Robot OS) for embodied intelligence (Embodied AI) and robot control, aiming to build the world's first decentralized robot operating environment and development platform. The core of the project includes two major components:

  • OM1: A modular open-source AI runtime (AI Runtime Layer) built on ROS2, used to orchestrate perception, planning, and action pipelines, serving both digital and physical robots;

  • FABRIC: A distributed coordination layer (Fabric Coordination Layer) that connects cloud computing power, models, and real robots, allowing developers to control and train robots in a unified environment.

The core of OpenMind is to act as an intelligent intermediary layer between LLMs (large language models) and the world of robots, allowing language intelligence to truly transform into embodied intelligence, constructing an intelligent framework from understanding (Language → Action) to alignment (Blockchain → Rules).

The multi-layer system of OpenMind achieves a complete collaborative closed loop: humans provide feedback and annotations through the OpenMind App (RLHF data), the Fabric Network is responsible for identity verification, task allocation, and settlement coordination, and OM1 Robots execute tasks while adhering to the "robot constitution" on the blockchain for behavior auditing and payment, thus realizing a decentralized machine collaboration network of human feedback → task collaboration → on-chain settlement.
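The "robot constitution" step in this loop can be sketched as a rule filter that screens every proposed action before execution and appends the verdict to an audit log. The rule set and action format below are our own assumptions for illustration, not OpenMind's actual schema.

```python
# Hedged sketch of a constitution check: actions are screened against
# a fixed rule set before execution, and every decision is logged so it
# can later be audited (on-chain, in the architecture described above).

CONSTITUTION = (
    lambda a: a.get("force_n", 0) <= 50,        # bounded actuation force
    lambda a: a.get("zone") != "humans_only",   # respect keep-out zones
)

AUDIT_LOG = []

def execute(action):
    """Run an action only if every constitutional rule passes; always log."""
    allowed = all(rule(action) for rule in CONSTITUTION)
    AUDIT_LOG.append({"action": action, "allowed": allowed})
    return allowed

ok = execute({"kind": "grasp", "force_n": 20, "zone": "warehouse"})
blocked = execute({"kind": "push", "force_n": 80, "zone": "warehouse"})
```

The design point is that the filter sits between planning and actuation: even a well-trained policy cannot emit an action that skips the audit trail.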

Project Progress and Reality Assessment

OpenMind is in the early stage of "technology operational, commercialization not yet landed." The core system OM1 Runtime has been open-sourced on GitHub, can run on multiple platforms, and supports multi-modal input, achieving task understanding from language to action through a natural language data bus (NLDB), with high originality but still experimental; the Fabric network and on-chain settlement have only completed interface layer design.

On the ecosystem side, the project has collaborated with open hardware vendors such as Unitree, UBTECH, and TurtleBot, and with research institutions such as Stanford, Oxford, and Seoul Robotics, mainly for educational and research validation; there is no industrial deployment yet. The app has launched a test version, but its incentive and task functions are still early.

In terms of business model, OpenMind has built a three-layer ecosystem of OM1 (open-source system) + Fabric (settlement protocol) + Skill Marketplace (incentive layer), currently with no revenue, relying on approximately $20 million in early financing (Pantera, Coinbase Ventures, DCG). Overall, the technology is leading, but commercialization and ecology are still in the early stages. If Fabric successfully lands, it is expected to become "the Android of the embodied intelligence era," but the cycle is long, risks are high, and it is heavily dependent on hardware.

CodecFlow - The Execution Engine for Robotics (https://codecflow.ai)

CodecFlow is a decentralized execution layer protocol (Fabric) based on the Solana network, aimed at providing on-demand operating environments for AI agents and robotic systems, allowing each agent to have an "Instant Machine." The core of the project consists of three major modules:

  • Fabric: A cross-cloud computing aggregation layer (Weaver + Shuttle + Gauge) that can generate secure virtual machines, GPU containers, or robot control nodes for AI tasks within seconds;

  • optr SDK: An agent execution framework (Python interface) for creating operable desktops, simulations, or real robots as "Operators";

  • Token incentives: An on-chain incentive and payment layer connecting computing providers, agent developers, and automated task users, forming a decentralized computing and task market.

CodecFlow's core goal is to create a "decentralized execution base for AI and robot operators," allowing any agent to run safely in any environment (Windows / Linux / ROS / MuJoCo / robot controllers), achieving a universal execution architecture from computing power scheduling (Fabric) → system environment (System Layer) → perception and action (VLA Operator).

Project Progress and Reality Assessment

Early versions of the Fabric framework (Go) and optr SDK (Python) have been released, allowing isolated computing instances to be launched in web or command-line environments. The Operator market is expected to launch by the end of 2025, positioned as the decentralized execution layer for AI computing power, primarily serving AI developers, robotics research teams, and automated operation companies.

Machine Economy Layer

BitRobot - The World's Open Robotics Lab (https://bitrobot.ai)

BitRobot is a decentralized research and collaboration network (Open Robotics Lab) focused on embodied intelligence (Embodied AI) and robotics research, jointly initiated by FrodoBots Labs and Protocol Labs. Its core vision is to define and verify the true contributions of each robotic task through an open architecture of "subnets + incentive mechanisms + verifiable work (VRW)," with core functions including:

  • Defining and verifying each robotic task's true contribution through the VRW (Verifiable Robotic Work) standard;

  • Endowing robots with on-chain identities and economic responsibilities through ENT (Embodied Node Token);

  • Organizing cross-regional collaboration in research, computing power, equipment, and operators through Subnets;

  • Achieving "human-machine co-governance" incentive decision-making and research governance through Senate + Gandalf AI.

Since the release of its white paper in 2025, BitRobot has operated multiple subnets (such as SN/01 ET Fugi, SN/05 SeeSaw by Virtuals Protocol), achieving decentralized remote control and real-world data collection, and launched the $5M Grand Challenges fund to promote global model development research competitions.

peaq -- The Economy of Things (https://www.peaq.network)

peaq is a Layer-1 blockchain designed for the machine economy, providing machine identities, on-chain wallets, access control, and nanosecond-level time synchronization (Universal Machine Time) for millions of robots and devices. Its Robotics SDK enables developers to make robots "machine economy ready" with minimal code, achieving interoperability and interaction across vendors and systems.

Currently, peaq has launched the world's first tokenized robot farm and supports over 60 real-world machine applications. Its tokenization framework helps robotics companies raise funds for capital-intensive hardware and expands participation from traditional B2B/B2C to a broader community level. With a protocol-level incentive pool injected by network fees, peaq can subsidize new device access and support developers, forming an economic flywheel that accelerates the expansion of robotics and physical AI projects.

Data Collection Layer

This layer aims to solve the scarcity and high cost of the high-quality real-world data needed to train embodied intelligence. It collects and generates human-machine interaction data through several paths, including teleoperation (PrismaX, BitRobot Network), first-person video and motion capture (Mecka, BitRobot Network, Sapien, Vader, NRN), and simulation and synthetic data (BitRobot Network), providing a scalable, generalizable training foundation for robot models. It is important to be clear that Web3 is not good at "producing data": in hardware, algorithms, and collection efficiency, Web2 giants far exceed any DePIN project. Its real value lies in reshaping the distribution and incentive mechanisms of data. Built on a "stablecoin payment network + crowdsourcing model," it achieves low-cost micropayments, contribution traceability, and automatic profit-sharing through permissionless incentive systems and on-chain rights verification. Open crowdsourcing, however, still faces challenges in quality control and closing the demand loop: data quality is uneven, and effective verification and stable buyers are lacking.

PrismaX (https://gateway.prismax.ai)

PrismaX is a decentralized remote control and data economy network aimed at embodied intelligence (Embodied AI), designed to build a "global robot labor market," allowing human operators, robotic devices, and AI models to co-evolve through an on-chain incentive system. The core of the project includes two major components:

  • Teleoperation Stack — A remote control system (browser/VR interface + SDK) that connects global robotic arms and service robots, enabling real-time human control and data collection;

  • Eval Engine — A data evaluation and verification engine (CLIP + DINOv2 + optical flow semantic scoring) that generates quality scores for each operation trajectory and settles on-chain.

PrismaX transforms human operational behavior into machine learning data through a decentralized incentive mechanism, constructing a complete closed loop from remote control → data collection → model training → on-chain settlement, achieving a circular economy in which "human labor is a data asset."

Project Progress and Reality Assessment
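To illustrate the evaluation step in the closed loop above, here is a toy trajectory-scoring sketch loosely inspired by the Eval Engine's description (embedding-based semantic alignment plus motion analysis). The specific metrics and weights are assumptions; the real engine's use of CLIP, DINOv2, and optical flow is far more involved.

```python
import math

# Hypothetical trajectory quality scoring: blend semantic alignment between a
# task embedding and a video embedding with a motion-smoothness penalty.
# These metrics and weights are illustrative, not PrismaX's actual engine.

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def smoothness(trajectory):
    """Penalize jerky motion: 1 / (1 + mean second-difference magnitude)."""
    if len(trajectory) < 3:
        return 1.0
    accels = []
    for i in range(2, len(trajectory)):
        a = [trajectory[i][d] - 2 * trajectory[i - 1][d] + trajectory[i - 2][d]
             for d in range(len(trajectory[0]))]
        accels.append(math.sqrt(sum(x * x for x in a)))
    return 1.0 / (1.0 + sum(accels) / len(accels))

def quality_score(task_embedding, video_embedding, trajectory,
                  w_semantic=0.6, w_motion=0.4):
    """One scalar per trajectory, suitable for pro-rata on-chain payouts."""
    semantic = cosine_similarity(task_embedding, video_embedding)
    return w_semantic * semantic + w_motion * smoothness(trajectory)

# A smooth, well-aligned trajectory should outscore a jerky one on the same task.
task = [0.2, 0.9, 0.4]
video = [0.25, 0.85, 0.38]
smooth_traj = [(0.0, 0.0), (0.1, 0.1), (0.2, 0.2), (0.3, 0.3)]
jerky_traj = [(0.0, 0.0), (0.5, -0.4), (0.1, 0.6), (0.9, -0.2)]
```

Collapsing each trajectory to a single score is what makes settlement simple: the payout contract only needs one number per contribution, while the heavy perception models stay off-chain.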

PrismaX launched its test version (gateway.prismax.ai) in August 2025, allowing users to remotely control robotic arms to perform grasping experiments and generate training data; the Eval Engine is already running internally. Overall, PrismaX is technically well-realized and clearly positioned as a key intermediary layer connecting human operation, AI models, and blockchain settlement. Its long-term potential is to become a decentralized labor and data protocol for the era of embodied intelligence, but in the short term it still faces scaling challenges.

BitRobot Network (https://bitrobot.ai/)

BitRobot Network collects multi-source data such as video, remote control, and simulation through its subnets. SN/01 ET Fugi allows users to remotely control robots to complete tasks, collecting navigation and perception data in a "real-life Pokémon Go-style" interaction. This approach led to the creation of the FrodoBots-2K dataset, one of the largest open-source human-robot navigation datasets currently used by institutions like UC Berkeley RAIL and Google DeepMind. SN/05 SeeSaw (Virtual Protocol) crowdsources first-person video data on a large scale in real environments through iPhones. Other announced subnets, such as RoboCap and Rayvo, focus on using low-cost physical devices to collect first-person video data.

Mecka (https://www.mecka.ai)

Mecka is a robotics data company that crowdsources first-person video, human motion data, and task demonstrations through gamified mobile collection and customized hardware devices to build large-scale multimodal datasets that support the training of embodied intelligence models.

Sapien (https://www.sapien.io/)

Sapien is a crowdsourcing platform centered on "human motion data driving robot intelligence," collecting human action, posture, and interaction data through wearable devices and mobile applications for training embodied intelligence models. The project aims to build the world's largest human motion data network, making human natural behavior the foundational data source for robot learning and generalization.

Vader (https://www.vaderai.ai)

Vader crowdsources first-person video and task demonstrations through its real-world MMO application EgoPlay: users record daily activities from a first-person perspective and earn $VADER rewards. Its ORN data pipeline can convert raw POV footage into privacy-processed structured datasets, including action labels and semantic descriptions, which can be directly used for humanoid robot strategy training.

NRN Agents (https://www.nrnagents.ai/)

A gamified embodied RL data platform that crowdsources human demonstration data through browser-based robot control and simulation competitions. NRN generates long-tail behavior trajectories through "competitive" tasks for imitation learning and continuous reinforcement learning, serving as scalable data primitives to support sim-to-real strategy training.

Comparison of Projects in the Embodied Intelligence Data Collection Layer


Perception and Simulation (Middleware & Simulation)

The perception and simulation layer provides the core infrastructure connecting the physical world with intelligent decision-making for robots, including capabilities for localization, communication, spatial modeling, and simulation training, serving as the "intermediate layer skeleton" for building large-scale embodied intelligence systems. Currently, this field is still in the early exploration stage, with various projects forming differentiated layouts in high-precision localization, shared spatial computing, protocol standardization, and distributed simulation, and no unified standards or interoperable ecosystems have emerged.

Middleware and Spatial Infrastructure

The core capabilities of robots — navigation, localization, connectivity, and spatial modeling — form the key bridge connecting the physical world with intelligent decision-making. Although broader DePIN projects (Silencio, WeatherXM, DIMO) have begun to mention "robots," the following projects are most directly related to embodied intelligence.

RoboStack — Cloud-Native Robot Operating Stack (https://robostack.io)

RoboStack is a cloud-native robot middleware that achieves real-time scheduling, remote control, and cross-platform interoperability of robot tasks through RCP (Robot Context Protocol), providing cloud simulation, workflow orchestration, and agent access capabilities.

GEODNET — Decentralized GNSS Network (https://geodnet.com)

GEODNET is a global decentralized GNSS network providing centimeter-level RTK high-precision positioning. Through distributed base stations and on-chain incentives, it offers real-time "geographic reference layers" for drones, autonomous driving, and robots.

Auki — Posemesh for Spatial Computing (https://www.auki.com)

Auki builds a decentralized Posemesh spatial computing network, generating real-time 3D environmental maps through crowdsourced sensors and computing nodes, providing shared spatial benchmarks for AR, robot navigation, and multi-device collaboration. It is a key infrastructure connecting virtual spaces with real-world scenarios, promoting the integration of AR × Robotics.

Tashi Network — Real-time Collaborative Network for Robots (https://tashi.network)

A decentralized real-time mesh network that achieves sub-30ms consensus, low-latency sensor exchange, and multi-robot state synchronization. Its MeshNet SDK supports shared SLAM, collective collaboration, and robust map updates, providing a high-performance real-time collaboration layer for embodied AI.

Staex — Decentralized Connectivity and Telemetry Network (https://www.staex.io)

A decentralized connectivity layer that grew out of a German telecommunications R&D unit, providing secure communication, trusted telemetry, and device-to-cloud routing, enabling reliable data exchange and collaboration across operators for robot fleets.

Simulation and Learning Systems (Distributed Simulation & Learning)

Gradient - Towards Open Intelligence (https://gradient.network/)

Gradient is an AI laboratory building "Open Intelligence," dedicated to achieving distributed training, inference, validation, and simulation based on decentralized infrastructure; its current tech stack includes Parallax (distributed inference), Echo (distributed reinforcement learning and multi-agent training), and Gradient Cloud (AI solutions for enterprises). In the robotics direction, the Mirage platform provides distributed simulation, dynamic interactive environments, and large-scale parallel learning capabilities for training world models and general strategies. Mirage is exploring potential collaboration with NVIDIA regarding its Newton engine.

Robot Asset Yield Layer (RobotFi / RWAiFi)

This layer focuses on the key link of transforming robots from "productive tools" to "financializable assets," building the financial infrastructure of the machine economy through asset tokenization, yield distribution, and decentralized governance. Representative projects include:

XmaquinaDAO — Physical AI DAO (https://www.xmaquina.io)

XMAQUINA is a decentralized ecosystem that gives global users a liquid way to participate in top humanoid robotics and embodied intelligence companies, bringing on-chain opportunities that were once exclusive to venture capital firms. Its token DEUS serves as both a liquidity index asset and a governance vehicle for coordinating treasury allocation and ecosystem development. Through the DAO Portal and Machine Economy Launchpad, the community can participate in the tokenization and structuring of machine assets on-chain, jointly owning and supporting emerging Physical AI projects.

GAIB — The Economic Layer for AI Infrastructure (https://gaib.ai/)

GAIB aims to provide a unified economic layer for AI infrastructure such as GPUs and robots, connecting decentralized capital with real AI infrastructure assets to build a verifiable, composable, and yield-generating intelligent economic system.

In the robotics direction, GAIB does not "sell robot tokens," but instead achieves the transformation of "real cash flow → on-chain composable yield assets" by financializing robot devices and operational contracts (RaaS, data collection, remote operation, etc.) on-chain. This system covers hardware financing (leasing/staking), operational cash flow (RaaS/data services), and data flow revenue (licensing/contracts), making robot assets and their cash flows measurable, priceable, and tradable.

GAIB uses AID/sAID as settlement and yield vehicles, ensuring stable returns through structured risk control mechanisms (over-collateralization, reserves, and insurance), and long-term access to DeFi derivatives and liquidity markets, forming a financial closed loop from "robot assets" to "composable yield assets." The goal is to become the economic backbone of intelligence.
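The core arithmetic of turning robot operating cash flow into a yield-bearing token can be sketched in a few lines. All numbers, names, and the reserve mechanics below are assumptions for illustration, not GAIB's actual design.

```python
# Illustrative sketch: one month of RaaS (robots-as-a-service) revenue is
# split into an over-collateralization reserve and a per-token distribution.
# Parameters and mechanics are hypothetical, not GAIB's documented terms.

def distribute_cash_flow(monthly_revenue: float, opex: float,
                         reserve_ratio: float, said_supply: float):
    """Return (reserve_held_back, yield_per_staked_token) for one period."""
    net = monthly_revenue - opex          # operating cash flow
    reserve = net * reserve_ratio         # held back as insurance buffer
    distributable = net - reserve
    return reserve, distributable / said_supply

# Example: a fleet of 20 robots leased at $1,500/month, $12,000 total opex,
# a 20% reserve ratio, against 100,000 staked yield tokens.
revenue = 20 * 1500.0
reserve, per_token_yield = distribute_cash_flow(
    revenue, opex=12_000.0, reserve_ratio=0.2, said_supply=100_000.0)
```

The sketch makes the risk-pricing problem visible: the per-token yield is only as reliable as the revenue and opex assumptions feeding it, which is exactly the "yield realization" gap the conclusion below points to.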

▲ Web3 Robotics Ecosystem Map: https://fairy-build-97286531.figma.site/

Conclusion and Outlook: Real Challenges and Long-term Opportunities

From a long-term vision perspective, the integration of robotics × AI × Web3 aims to build a decentralized machine economy system (DeRobot Economy), promoting embodied intelligence from "standalone automation" to "verifiable, settleable, governable" networked collaboration. Its core logic is to form a self-circulating mechanism through "Token → Deployment → Data → Value Redistribution," enabling robots, sensors, and computing nodes to achieve rights verification, trading, and profit-sharing.

However, from a realistic stage perspective, this model is still in the early exploration phase, far from forming stable cash flow and scalable commercial closed loops. Most projects remain at the narrative level, with limited actual deployment. Robotics manufacturing and operation are capital-intensive industries, and relying solely on token incentives is insufficient to support infrastructure expansion; while on-chain financial designs offer composability, they have yet to solve the risk pricing and yield realization issues of real assets. Therefore, the so-called "self-circulation of machine networks" remains idealized, and its business model requires real-world validation.

  • Model Intelligence Layer is currently the most valuable long-term direction. OpenMind, as a representative open-source robot operating system, attempts to break closed ecosystems and unify multi-robot collaboration and language-to-action interfaces. Its technical vision is clear and the system design is complete, but the engineering workload is enormous, the verification cycle is long, and no industry-level positive feedback loop has yet formed.

  • Machine Economy Layer is still in the preparatory stage: few robots are deployed in the real world, and DID-based identity and incentive networks have yet to form a self-sustaining cycle. We remain far from a "machine labor economy." Only after embodied intelligence achieves large-scale deployment will the economic effects of on-chain identity, settlement, and collaboration networks truly manifest.

  • Data Collection Layer has the lowest barriers but is currently the most commercially viable direction. Embodied-intelligence data collection demands high spatiotemporal continuity and accurate action semantics, which determine data quality and reusability. Balancing crowdsourcing scale against data reliability is a core industry challenge. PrismaX's approach of first locking in enterprise-side demand and then distributing collection and verification tasks provides a somewhat replicable template, but ecosystem scale and data trading still need time to accumulate.

  • Perception and Simulation Layer is still in the technical verification phase, lacking unified standards and interfaces, and no interoperable ecosystem has formed. Transferring simulation results to real environments also remains limited by Sim2Real efficiency.

  • Asset Yield Layer sees Web3 mainly playing a supportive role in supply chain finance, equipment leasing, and investment governance, enhancing transparency and settlement efficiency, rather than reshaping industrial logic.

Of course, we believe that the intersection of robotics × AI × Web3 still represents the starting point of the next generation of intelligent economic systems. It is not only a fusion of technological paradigms but also an opportunity for reconstructing production relations: when machines possess identity, incentives, and governance mechanisms, human-machine collaboration will shift from localized automation to networked autonomy. In the short term, this direction remains primarily narrative and experimental, but the institutional and incentive frameworks it lays down are paving the way for the economic order of future machine societies. From a long-term perspective, the combination of embodied intelligence and Web3 will reshape the boundaries of value creation — allowing intelligent agents to become truly verifiable, collaborative, and yield-generating economic entities.

This independent research report is supported by IOSG Ventures. Special thanks to Hans (RoboCup Asia-Pacific), Nichanan Kesonpat (1kx), Robert Koschig (1kx), Amanda Young (Collab+Currency), Jonathan Victor (Ansa Research), Lex Sokolin (Generative Ventures), Jay Yu (Pantera Capital), and Jeffrey Hu (Hashkey Capital) for their valuable suggestions on this article. During the writing process, feedback was also solicited from project teams such as OpenMind, BitRobot, peaq, Auki Labs, XMAQUINA, GAIB, Vader, Gradient, Tashi Network, and CodecFlow. This article strives for objective and accurate content, but some viewpoints involve subjective judgments and may inevitably contain biases; readers are kindly asked to understand.

This article was assisted by AI tools from ChatGPT-5 and Deepseek during the creation process. The author has made efforts to proofread and ensure the information is true and accurate, but some omissions may still exist; your understanding is appreciated. It should be particularly noted that the cryptocurrency market generally exhibits a divergence between project fundamentals and secondary market price performance. The content of this article is intended for information integration and academic/research communication, does not constitute any investment advice, and should not be regarded as a recommendation for buying or selling any tokens.
