NVIDIA has announced significant advancements in generative physical AI, introducing new NIM microservices and the NVIDIA Metropolis reference workflow at SIGGRAPH. These innovations are designed to improve the training of physical machines and enhance their ability to handle complex tasks, according to NVIDIA Blog.
Generative AI in Physical Environments
Generative AI technology, already widely used for writing and learning, is now poised to assist in navigating the physical world. NVIDIA’s new offerings include three fVDB NIM microservices that support deep learning frameworks for 3D worlds and several USD NIM microservices for working with Universal Scene Description (USD), also known as OpenUSD.
The newly developed OpenUSD NIM microservices work in tandem with generative AI models to enable developers to integrate generative AI copilots and agents into USD workflows, thereby expanding the capabilities of 3D environments.
NVIDIA NIM Microservices Transform Physical AI Landscapes
Physical AI employs advanced simulations and learning methods to help robots and other automated systems perceive, reason, and navigate their surroundings more effectively. This technology is revolutionizing industries such as manufacturing and healthcare by advancing smart spaces and enhancing the functionality of robots, factory technologies, surgical AI agents, and autonomous vehicles.
NVIDIA provides a suite of NIM microservices tailored for specific models and industry applications, supporting capabilities in speech and translation, vision and intelligence, and realistic animation and behavior.
Turning Visual AI Agents Into Visionaries
Visual AI agents, which leverage computer vision capabilities, are designed to perceive and interact with the physical world. These agents are powered by vision language models (VLMs), a new class of generative AI models that bridge digital perception and real-world interaction. VLMs enhance decision-making, accuracy, interactivity, and performance, enabling visual AI agents to handle complex tasks more effectively.
Generative AI-powered visual AI agents are being rapidly deployed across various sectors, including hospitals, factories, warehouses, retail stores, airports, and traffic intersections. NVIDIA’s NIM microservices and reference workflows for physical AI provide developers with the tools needed to build and deploy high-performing visual AI agents.
Case Study: K2K Enhances Palermo’s Traffic Management
In Palermo, Italy, city traffic managers have deployed visual AI agents using NVIDIA NIM to gain physical insights and better manage roadways. K2K, an NVIDIA Metropolis partner, integrates NIM microservices and VLMs into AI agents that analyze live traffic camera feeds in real time. This allows city officials to ask questions in natural language and receive accurate insights and suggestions for improving city operations, such as adjusting traffic light timings.
Bridging the Simulation-to-Reality Gap
Many AI-driven businesses are adopting a “simulation-first” approach for generative physical AI projects. NVIDIA’s physical AI software, tools, and platforms, including NIM microservices and reference workflows, help streamline the creation of digital representations that accurately mimic real-world conditions. This approach is particularly beneficial for manufacturing, factory logistics, and robotics companies.
Vision language models (VLMs) are widely adopted across industries due to their ability to generate realistic imagery. However, they require immense volumes of data for training. Synthetic data generated from digital twins offers a powerful alternative, providing robust datasets for training physical AI models without the high costs and limitations of real-world data acquisition.
NVIDIA’s tools, such as NIM microservices and Omniverse Replicator, enable developers to build synthetic data pipelines for creating diverse datasets, enhancing the adaptability and performance of models like VLMs.
Availability
Developers can access NVIDIA’s state-of-the-art AI models and NIM microservices at ai.nvidia.com. The Metropolis NIM reference workflow is available in the GitHub repository, and Metropolis VIA microservices are available for download in developer preview. OpenUSD NIM microservices are also available in preview through the NVIDIA API catalog.
Image source: Shutterstock