Known for its powerful GPUs, NVIDIA is now turning its attention to robotics. The company recently launched Cosmos, a platform to accelerate the development of physical AI systems such as autonomous vehicles (AVs) and robots.
“NVIDIA will be a robotics company at the end of the day, not just semiconductor. Few understand the moves they are making at the lowest level from manufacturing up to software,” Dylan Patel, founder of Semianalysis, said.
In a recent interview, NVIDIA chief Jensen Huang said that the world needs an AI that understands the physical world. “It has to understand the dynamics of the physical world, like gravity, inertia, or friction, and it has to understand spatial and geometric relationships,” Huang said.
According to him, the world needs more robots as today there aren’t enough workers. “There’s an ageing population, a changing preference in the type of work people want to do, and the birth rate is declining. The world needs more workers, so the timing is really quite imperative that we have robotic systems.”
NVIDIA is working to integrate robotics primarily into manufacturing processes, autonomous vehicles, and the healthcare sector. For example, humanoid robots in manufacturing can perform repetitive tasks, handle materials, and collaborate with human workers. Huang predicts that the $50 trillion manufacturing industry will become software-defined.
The future Huang envisions is already taking shape. For example, BMW is using the Figure 02 humanoid robot on its production line. The company claims that Figure 02 can now operate as an ‘autonomous fleet’, with a 400% increase in speed and a sevenfold improvement in success rate.
The Foundation of NVIDIA Robotics
Huang referred to Cosmos as the “ChatGPT or Llama of world foundation models”. The platform has been trained on 20 million hours of video, focusing on dynamic interactions like walking humans and hand movements.
He further said that the real magic happens when Cosmos is integrated with Omniverse. The combination of the two provides “ground truth” for AI, which helps it understand the physical world. Huang compared the connection of Cosmos and Omniverse to the concept of LLMs connected to retrieval-augmented generation (RAG) systems.
Moreover, Huang introduced the concept of three fundamental computers essential for building robotic systems. The first is Deep GPU Xceleration (DGX), which is used to train AI. Once the AI is trained, the next computer, AGX, is employed to deploy AI into real-world applications such as cars or robots.
The third component is the Digital Twin, a simulated environment where the AI practices, refines its abilities and undergoes further training before deployment.
Backed by Strong Research
This is not the first time NVIDIA has discussed humanoids and autonomous vehicles. The company has been actively researching the field for the past year.
“It gives me a lot of comfort knowing that we are the last generation without advanced robots everywhere. Our children will grow up as ‘robot natives’. They will have humanoids cook Michelin dinners, robot teddy bears tell bedtime stories, and FSD (full self-driving) drive them to school,” Jim Fan, senior research manager and lead of Embodied AI (GEAR Lab) at NVIDIA, said.
In another interview, Fan said that the company chose the humanoid form because the world is built around it. “All our restaurants, factories, hospitals, and all equipment and tools are designed for the human form.”
Notably, the company recently unveiled Project Eureka, demonstrating a five-fingered robot hand trained to spin a pen.
In addition, NVIDIA recently developed HOVER (Humanoid Versatile Controller), a 1.5 million-parameter neural network designed to coordinate the motors of humanoid robots for locomotion and manipulation.
“Not every foundation model needs to be gigantic. We trained a 1.5 million-parameter neural network to control the body of a humanoid robot,” Fan revealed.
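To put that figure in perspective, a network of that size is tiny by foundation-model standards. The sketch below is a back-of-envelope illustration, not HOVER's actual architecture (which the article does not detail): the layer dimensions are hypothetical, chosen only to show that a plain fully connected policy with a few ~1,000-unit hidden layers already lands near 1.5 million parameters.

```python
def count_mlp_params(layer_sizes):
    """Count weights + biases in a fully connected network:
    each layer contributes (n_in * n_out) weights + n_out biases."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

# Hypothetical dimensions for illustration only:
# ~128 proprioceptive/command inputs, two 1024-unit hidden layers,
# a 256-unit layer, and ~25 joint targets out.
sizes = [128, 1024, 1024, 256, 25]
print(count_mlp_params(sizes))  # 1450521 — roughly 1.45 million parameters
```

A billion-parameter LLM is nearly three orders of magnitude larger, which is the point Fan is making: a whole-body motor controller can be small enough to run at high control rates on embedded hardware.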
NVIDIA launched Project GR00T and the Isaac platform last year. GR00T is a framework that allows developers to generate extensive synthetic datasets from a limited number of human actions.
The company has also developed Jetson Thor, a new generation of compact computers for humanoid robots, which is slated for release in the first half of 2025.
NVIDIA is working with companies such as 1X Technologies, Agility Robotics, Apptronik, Boston Dynamics, Figure AI, Fourier Intelligence, Sanctuary AI, Unitree Robotics and XPENG Robotics to build humanoid robots.
World Models x Humanoids
It seems that NVIDIA is not alone in the robotics race. According to OpenAI’s career page, the startup is hiring for roles in mechanical engineering, robotics systems integration, and program management. The goal is to “integrate cutting-edge hardware and software to explore a broad range of robotic form factors”.
Last year, the company hired Caitlin Kalinowski to lead its robotics and consumer hardware divisions. Previously at Meta, she oversaw the development of Orion augmented reality (AR) glasses. OpenAI has also invested in Figure AI and robotics AI startup Physical Intelligence.
Similarly, Apptronik, one of the leaders in AI-powered humanoid robotics, recently announced a partnership with Google DeepMind’s robotics team to build intelligent, autonomous robots.
Tesla is also upping its game. At the ‘We, Robot’ event at Warner Bros. Studios in California last year, CEO Elon Musk showcased the company’s humanoid robot, Optimus. It can walk dogs, mow the lawn, do household chores, babysit, and even speak like Gen Z while making smooth hand gestures.
Meanwhile, Fan’s mentor and ‘AI godmother’, Fei-Fei Li, recently founded her own company, World Labs, to build large world models (LWMs) that can perceive, generate, and interact with the 3D world.