Part IV · THE GREAT CONVERGENCES

Robotics and embodied AI

Chapter 13 · 11 min read · Updated: June 2026

13.1 A brief history: from the industrial arm to the robot that learns

13.2 Embodied AI and Vision-Language-Action (VLA) models

Diagram 13.1. The principle of a VLA model. The robot perceives its environment (vision), receives an instruction in natural language, and the model translates both into motions. It does for the body what LLMs did for language.
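The perceive, instruct, act loop in the diagram can be sketched in a few lines of Python. This is only an illustrative toy under stated assumptions: the names Observation, Action, ToyPolicy and control_loop are hypothetical and do not come from any real robotics library, and the hand-written rule inside ToyPolicy merely stands in for a trained VLA model.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Observation:
    """What the robot receives at each step (hypothetical structure)."""
    image: Sequence[Sequence[float]]  # camera frame as a grid of pixel intensities
    instruction: str                  # natural-language command


@dataclass
class Action:
    """What the robot executes: one small motion per joint (hypothetical)."""
    joint_deltas: List[float]


class ToyPolicy:
    """Stand-in for a trained VLA model; a real one would be a large neural network."""

    def __init__(self, num_joints: int = 3) -> None:
        self.num_joints = num_joints

    def act(self, obs: Observation) -> Action:
        # "Vision": reduce the frame to a single number (mean brightness).
        pixels = [p for row in obs.image for p in row]
        brightness = sum(pixels) / len(pixels)
        # "Language": read a direction from the instruction.
        direction = -1.0 if "lower" in obs.instruction.lower() else 1.0
        # "Action": combine both signals into a motion for every joint.
        step = direction * 0.1 * brightness
        return Action(joint_deltas=[step] * self.num_joints)


def control_loop(policy: ToyPolicy,
                 frames: Sequence[Sequence[Sequence[float]]],
                 instruction: str) -> List[Action]:
    """Run the policy once per camera frame and collect the resulting actions."""
    return [policy.act(Observation(image=f, instruction=instruction)) for f in frames]


if __name__ == "__main__":
    frame = [[1.0, 1.0], [1.0, 1.0]]
    for action in control_loop(ToyPolicy(num_joints=2), [frame], "Lower the arm"):
        print(action.joint_deltas)
```

The point of the sketch is the interface rather than the rule inside it: a single function takes an image and a sentence and returns motor commands, which is the shape the diagram describes.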

13.3 The humanoid race (overview, June 2026)

13.4 Beyond humanoids

13.5 Stakes: safety, employment, acceptability


Key takeaways (Chapter 13)

  • Robotics long relied on rigid, pre-programmed machines. The turning point of 2024-2026 comes from the arrival of large models inside the robots themselves.
  • Vision-Language-Action (VLA) models (Figure's Helix, Physical Intelligence's "pi," Google's Gemini Robotics, NVIDIA's GR00T, Unitree's UnifoLM) translate perception and instruction into motions, often after training in simulation.
  • The humanoid race pits American players (Figure, Tesla, Boston Dynamics, 1X, Apptronik) against Chinese ones (Unitree, AgiBot, UBTECH, XPeng), each pursuing a distinct strategy (capability, model, integration or price).
  • The stake: crossing the price tipping point (around $20-25k), where the data flywheel and the supply chain (heavily dependent on Chinese components) come into play.
  • "Physical AI" goes beyond humanoids: cobots, warehouse robots, quadrupeds, drones, autonomous vehicles.
  • The stakes around safety (certifying a robot that learns), employment, acceptability and regulation are major (Chapters 24, 21 and 25).

Thus ends Part IV. We have explored the convergences of AI with blockchain, quantum and robotics. It is time to leave the technologies behind and observe their concrete effects on the world: science and health, work and the economy, law and society. That is the subject of Part V.