Researchers at Physical Intelligence, an AI robotics company, have developed a system called the Hierarchical Interactive Robot (Hi Robot). This system enables robots to process complex instructions and feedback using vision-language models (VLMs) in a hierarchical structure.
Vision-language models can control robots, but what if the prompt is too complex for the robot to follow directly? We developed a way to get robots to “think through” complex instructions, feedback, and interjections. We call it the Hierarchical Interactive Robot (Hi Robot). pic.twitter.com/KdL5myyybT
— Physical Intelligence (@physical_int) February 26, 2025
The system allows robots to break down intricate tasks into simpler steps, much as humans reason through complex problems by pairing fast, intuitive responses with slower deliberation, the ‘System 1’ and ‘System 2’ modes of thinking described by Daniel Kahneman.
In this setup, Hi Robot uses a high-level VLM to reason through complex prompts and a low-level vision-language-action (VLA) policy to execute the resulting atomic commands.
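This division of labor can be pictured as a simple control loop. The sketch below is a minimal illustration only, assuming hypothetical `HighLevelVLM` and `LowLevelPolicy` wrappers; Physical Intelligence has not published this interface.

```python
# Minimal sketch of a two-level "System 2 / System 1" control loop.
# All class and method names are assumptions for illustration.

from dataclasses import dataclass

@dataclass
class Observation:
    image: bytes          # latest camera frame from the robot
    user_utterance: str   # latest prompt or interjection ("" if none)

class HighLevelVLM:
    """Slow, deliberate 'System 2': turns a complex prompt into one atomic step."""
    def next_subtask(self, obs: Observation, task: str) -> str:
        raise NotImplementedError  # e.g. returns "pick up the mustard bottle"

class LowLevelPolicy:
    """Fast, reactive 'System 1': maps an atomic command to motor actions."""
    def act(self, obs: Observation, subtask: str) -> None:
        raise NotImplementedError  # e.g. emits a short chunk of joint commands

def run(task: str, high: HighLevelVLM, low: LowLevelPolicy, get_obs, steps: int = 1000):
    subtask = None
    for t in range(steps):
        obs = get_obs()
        # Re-plan on a slow cadence, or immediately when the user interjects;
        # the low level acts at every step.
        if subtask is None or t % 50 == 0 or obs.user_utterance:
            subtask = high.next_subtask(obs, task)
        low.act(obs, subtask)
```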
Testing and Training Using Synthetic Data
Researchers used synthetic data to train robots to follow complex instructions. Relying solely on real-life examples and atomic commands wasn’t enough to teach robots to handle multi-step tasks.
To address this, they created synthetic datasets by pairing robot observations with hypothetical scenarios and human feedback. This approach helps the model learn how to interpret and respond to complex commands.
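The recipe can be illustrated with a short sketch. Everything here is assumed for illustration, including the stub labeler, the field names, and the prompt wording; the paper’s actual annotation pipeline may differ.

```python
# Hedged sketch: wrap real (observation, atomic command) pairs in imagined
# dialogues so the model learns to connect feedback to the right step.

import random

HYPOTHETICAL_INTERJECTIONS = [
    "that's not trash",
    "skip the pickles",
    "leave the red plate where it is",
]

def stub_labeler(request: str) -> dict:
    # Stand-in for a real VLM annotator call, so the sketch runs end to end.
    return {"prompt": "clean up the table",
            "reply": "Got it, I'll leave that where it is."}

def make_synthetic_example(observation, atomic_command, labeler=stub_labeler):
    """Pair a real robot step with a hypothetical scenario and feedback."""
    interjection = random.choice(HYPOTHETICAL_INTERJECTIONS)
    context = labeler(
        f"Given this scene and the robot step '{atomic_command}', write a "
        f"user instruction plus the feedback '{interjection}' that this "
        f"step would be a correct response to."
    )
    return {
        "observation": observation,          # real camera frames
        "user_prompt": context["prompt"],    # synthetic instruction
        "user_feedback": interjection,       # synthetic interjection
        "robot_utterance": context["reply"], # synthetic verbal confirmation
        "atomic_command": atomic_command,    # real label for the low level
    }

print(make_synthetic_example(b"<frame>", "put the fork in the bin"))
```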
In the researchers’ evaluations, Hi Robot outperformed other methods, including GPT-4o and a flat vision-language-action (VLA) policy, following instructions more faithfully and adapting better to real-time corrections. It achieved 40% higher instruction-following accuracy than GPT-4o, demonstrating closer alignment with user prompts and real-time observations.

In real-world tests, Hi Robot performed tasks like clearing tables, making sandwiches, and grocery shopping. It effectively handled multi-stage instructions, adapted to real-time corrections, and respected constraints.
These results highlight the potential of synthetic data in robotics: it can efficiently cover diverse scenarios, reducing the need for extensive real-world data collection.
Hi Robot ‘Talks to Itself’
In one example, a robot trained to clean a table by disposing of trash and placing dishes in a bin can be directed to follow more intricate commands through Hi Robot.
The system lets the robot reason through modified commands given in natural language, effectively “talking to itself” as it works. It also interprets contextual comments from the user, so when someone says “that’s not trash”, the robot incorporates the feedback and adjusts its actions accordingly.
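One plausible way to structure this, sketched below under assumed names (`HighLevelOutput`, `respond`), is for each high-level step to emit both an utterance and an atomic command, with the high level re-queried the moment the user speaks. It reuses the `Observation` type from the loop above.

```python
# Hedged sketch of pairing speech with action at the high level.
# The `respond` method and field names are assumptions, not a published API.

from dataclasses import dataclass

@dataclass
class HighLevelOutput:
    utterance: str        # e.g. "Oh, sorry, I'll leave that on the table."
    atomic_command: str   # e.g. "place the fork back on the table"

def on_feedback(high, obs, task: str, current: str) -> str:
    """Re-query the high level as soon as the user interjects."""
    if not obs.user_utterance:        # no interjection: keep the current plan
        return current
    out = high.respond(obs, task)     # robot "talks to itself", then re-plans
    print(out.utterance)              # verbal acknowledgement to the user
    return out.atomic_command         # corrected step for the low-level policy
```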
The system has been tested on various robotic platforms, including single-arm, dual-arm, and mobile robots, performing tasks like cleaning tables and making sandwiches.
“Can we get our robots to ‘think’ the same way, with a little ‘voice’ that tells them what to do when presented with a complex task?” the researchers said in the company’s official blog. This advancement could lead to more intuitive and flexible robot capabilities in real-world applications.
Researchers plan to refine the system in the future by combining the high-level and low-level models, allowing for more adaptive processing of complex tasks.