AWS Expands NVIDIA NIM Microservices for Enhanced AI Inference

Jessie A Ellis
Dec 04, 2024 20:28

AWS and NVIDIA enhance AI inference capabilities by expanding NIM microservices across AWS platforms, boosting efficiency and reducing latency for generative AI applications.

Amazon Web Services (AWS) has announced an expansion of its collaboration with NVIDIA, integrating NVIDIA NIM microservices into its key AI services. This move, revealed at the AWS re:Invent conference, aims to accelerate AI inference and reduce latency for generative AI applications, according to NVIDIA.

Enhanced AI Inference with NVIDIA NIM

NVIDIA NIM microservices are now readily accessible via the AWS Marketplace, Amazon Bedrock Marketplace, and Amazon SageMaker JumpStart. This availability simplifies the deployment of NVIDIA-optimized inference for popular models at scale. Part of the NVIDIA AI Enterprise software platform, NIM microservices offer secure, high-performance deployment of AI model inference across diverse environments.

These prebuilt containers leverage advanced inference engines, such as NVIDIA Triton Inference Server and NVIDIA TensorRT, supporting a wide range of AI models. Developers can utilize these services across various AWS platforms, including Amazon EC2 and Amazon EKS, enhancing model deployment flexibility and performance.

Broad Range of Supported Models

Developers can explore over 100 NIM microservices, featuring models from NVIDIA, Meta’s Llama 3, and Mistral AI, among others. These services are optimized for deployment on NVIDIA accelerated computing instances via AWS, providing robust solutions for AI model inference.

Notably, NVIDIA Nemotron-4 and Llama 3.1 models are now available directly from AWS, offering advanced capabilities for data synthesis and multilingual dialogue, respectively. These models are designed to enhance AI application performance and reliability across various domains.

Industry Adoption and Use Cases

Industries are increasingly adopting NIM on AWS to expedite market entry, ensure security, and reduce costs for generative AI applications. For example, IT consulting firm SoftServe has developed several AI solutions using NVIDIA NIM, now available on AWS Marketplace. These include applications for drug discovery, industrial assistance, and content creation, all leveraging NVIDIA AI Blueprints for accelerated development and deployment.

Getting Started with NIM on AWS

Developers interested in deploying NVIDIA NIM microservices can start by exploring the NVIDIA API catalog, which offers numerous NIM-optimized models. They can request a developer license or a trial license for NVIDIA AI Enterprise to begin deploying these microservices across AWS platforms. This initiative underscores AWS and NVIDIA’s commitment to advancing AI technology and facilitating seamless integration for developers.

Image source: Shutterstock

Share it on social networks