Building physical AI with virtual simulation data

Advancements in Physical AI Driven by Virtual Simulation Data

The development of physical artificial intelligence (AI) in corporate environments is being propelled by virtual simulation data. Initiatives like Ai2’s MolmoBot are leading the way in this innovative field.

Traditionally, instructing hardware to interact with the real world has been costly and labor-intensive, relying on manually collected demonstrations. Technology providers working on generalist manipulation agents typically emphasize extensive real-world training as the foundation for these systems.

For example, projects such as DROID and Google DeepMind’s RT-1 have required tens of thousands of episodes collected over months by human operators. This reliance on proprietary, manual data collection not only inflates research budgets but also limits capabilities to a select few well-funded industrial laboratories.

Ali Farhadi, the CEO of Ai2, stated, “Our goal is to develop AI that pushes the boundaries of science and enables humanity to make new discoveries. Robotics can serve as a fundamental scientific tool, facilitating faster research progress and exploration of new frontiers. To achieve this, we need AI systems that can generalize in real-world settings and tools that the global research community can collaborate on. Demonstrating the transition from simulation to reality is a significant step in this direction.”

Researchers at the Allen Institute for AI (Ai2) are introducing a novel economic model with MolmoBot, a suite of open robotic manipulation models trained entirely on synthetic data. By generating trajectories procedurally within the MolmoSpaces system, the team eliminates the need for human teleoperation.

The MolmoBot-Data dataset consists of 1.8 million expert manipulation trajectories created by combining the MuJoCo physics engine with domain randomization techniques to vary objects, viewpoints, lighting, and dynamics.
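Domain randomization of this kind can be illustrated with a minimal sketch. The parameter names and ranges below are illustrative assumptions, not Ai2's actual configuration, and the sketch only samples scene parameters rather than driving a physics engine:

```python
import random

def sample_randomized_scene(rng: random.Random) -> dict:
    """Sample one randomized scene configuration for a simulated episode.

    Each draw varies object properties, camera viewpoint, lighting, and
    physics dynamics, so a policy trained across many such episodes cannot
    overfit to any single simulated appearance. All ranges are hypothetical.
    """
    return {
        # Object variation: which asset to spawn and its physical properties.
        "object_id": rng.randrange(1000),
        "object_scale": rng.uniform(0.8, 1.2),
        "object_mass_kg": rng.uniform(0.05, 2.0),
        # Viewpoint variation: perturb the camera pose around a nominal view.
        "camera_yaw_deg": rng.uniform(-30.0, 30.0),
        "camera_height_m": rng.uniform(0.6, 1.4),
        # Lighting variation: intensity and colour temperature.
        "light_intensity": rng.uniform(0.3, 1.0),
        "light_temperature_k": rng.uniform(3000, 7000),
        # Dynamics variation: values that would be passed to the physics engine.
        "friction": rng.uniform(0.4, 1.2),
        "joint_damping": rng.uniform(0.5, 2.0),
    }

# Generate a small batch of randomized scene configurations with a fixed seed.
rng = random.Random(0)
scenes = [sample_randomized_scene(rng) for _ in range(1000)]
```

In a full pipeline each sampled configuration would be instantiated in the simulator (here, MuJoCo) before an expert trajectory is rolled out and recorded.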

Ranjay Krishna, the Director of the PRIOR team at Ai2, highlighted, “Most approaches focus on bridging the gap between simulation and reality by incorporating more real-world data. Our approach challenges this by expanding the diversity of simulated environments, objects, and camera conditions. Our latest breakthrough shifts the robotics paradigm from manual demonstrations to the design of more realistic virtual worlds, presenting a problem that we can solve.”

Generating Virtual Simulation Data for Physical AI

Utilizing 100 Nvidia A100 GPUs, the pipeline generated approximately 1,024 episodes per GPU-hour, translating to over 130 hours of robot experience for every hour of real-time operation.

Compared to traditional real-world data collection methods, this approach offers nearly four times the data throughput, significantly enhancing project return on investment by accelerating deployment timelines.
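The throughput figures above support a simple back-of-the-envelope check. Reading the 130× figure as fleet-wide is an interpretation, and the implied average episode length is inferred from the stated numbers rather than reported by Ai2:

```python
GPUS = 100
EPISODES_PER_GPU_HOUR = 1024
ROBOT_HOURS_PER_WALL_HOUR = 130  # "130 hours of robot experience per hour"
DATASET_EPISODES = 1_800_000     # size of MolmoBot-Data

# Total episodes generated per wall-clock hour across the whole fleet.
episodes_per_hour = GPUS * EPISODES_PER_GPU_HOUR

# Wall-clock time needed to produce the full 1.8M-trajectory dataset.
hours_for_dataset = DATASET_EPISODES / episodes_per_hour

# Implied average simulated-episode length, if the 130 robot-hours per
# wall-clock hour figure is fleet-wide (an inference, not a published value).
avg_episode_seconds = ROBOT_HOURS_PER_WALL_HOUR * 3600 / episodes_per_hour

print(f"{episodes_per_hour=} {hours_for_dataset=:.1f} {avg_episode_seconds=:.1f}")
```

At roughly 102,400 episodes per hour, the entire dataset would take under a day of wall-clock generation, which is the economic contrast with the months of teleoperation cited for DROID and RT-1.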

The MolmoBot suite comprises three distinct policy classes evaluated on the Rainbow Robotics RB-Y1 mobile manipulator and the Franka FR3 tabletop arm. The primary model, based on a Molmo2 vision-language framework, processes multiple RGB observations and language instructions to determine actions.
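The input/output contract described above, several RGB views plus a language instruction mapping to a robot action, can be sketched as a minimal interface. The class names, the 7-dimensional action, and the 224×224 views are assumptions, and the stub returns a zero action rather than running a real vision-language backbone:

```python
from dataclasses import dataclass
from typing import Sequence

import numpy as np

@dataclass
class Observation:
    """One timestep of policy input: RGB camera views plus a task string."""
    rgb_views: Sequence[np.ndarray]  # e.g. wrist and head cameras, HxWx3 uint8
    instruction: str                 # natural-language task description

class VisionLanguagePolicy:
    """Skeleton of a VLM-style manipulation policy (illustrative only).

    A real model would encode the images and instruction with a pretrained
    vision-language backbone and decode an action; this stub only exercises
    the interface.
    """

    def __init__(self, action_dim: int = 7):  # e.g. 6-DoF end-effector + gripper
        self.action_dim = action_dim

    def act(self, obs: Observation) -> np.ndarray:
        # Placeholder inference: a trained backbone would run here.
        assert obs.rgb_views and obs.instruction
        return np.zeros(self.action_dim, dtype=np.float32)

policy = VisionLanguagePolicy()
obs = Observation(
    rgb_views=[np.zeros((224, 224, 3), dtype=np.uint8)],
    instruction="pick up the red mug and place it on the tray",
)
action = policy.act(obs)
```

The same interface shape covers all three policy classes in the suite; what differs is the backbone behind `act`, which is how the lighter SPOC and Pi0 variants slot in.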

Hardware Flexibility with Ai2’s MolmoBot

For edge computing environments with resource constraints, the researchers present MolmoBot-SPOC, a lightweight transformer policy with fewer parameters. MolmoBot-Pi0 utilizes a PaliGemma backbone to align with the architecture of Physical Intelligence’s π0 model, enabling direct performance comparisons.

During physical testing, these policies demonstrated the ability to seamlessly transfer to real-world tasks involving unseen objects and environments without any fine-tuning.

In tabletop pick-and-place assessments, the primary MolmoBot model achieved a success rate of 79.2%. This outperformed the π0.5 model, trained on extensive real-world data, which achieved a success rate of 39.2%. For mobile manipulation tasks, the policies successfully executed actions such as approaching, grasping, and manipulating doors through their full range of motion.

Offering diverse architectures allows organizations to integrate efficient physical AI systems without being tied to a single vendor ecosystem or requiring extensive data collection infrastructure.

The comprehensive release of the entire MolmoBot stack, including training data, generation pipelines, and model architectures, enables internal review and customization. Individuals exploring physical AI can leverage these open tools for simulation and the development of capable systems while managing costs effectively.

Ali Farhadi reiterated, “For AI to drive scientific progress, it must not rely on closed data or isolated systems. It necessitates shared infrastructure that researchers worldwide can collaborate on, test, and enhance collectively. This is the path we believe physical AI will advance along.”

Learn more about AI and big data from industry leaders at the AI & Big Data Expo in Amsterdam, California, and London. This event, part of TechEx, offers a comprehensive platform for exploring the latest in technology, including cybersecurity and cloud solutions.

AI News is a platform powered by TechForge Media. Discover upcoming enterprise technology events and webinars to stay updated on the latest trends and innovations in the field.
