New Amazon Web Services x NVIDIA partnership revealed to serve generative AI at supercomputing scale; new microservice announced as well

NVIDIA has announced another deal with Amazon Web Services (AWS) to expand the two companies' offerings for generative AI applications.

Under the new partnership, AWS will be the first cloud service provider to offer Team Green’s GH200 Grace Hopper Superchips with multi-node NVLink. The platform takes advantage of the AWS 3rd-Gen Elastic Fabric Adapter (EFA) interconnect, which delivers up to 400Gbps per Superchip of low-latency, high-bandwidth networking throughput, and it scales out in EC2 UltraClusters.

These specific EC2 instances will have 4.5 TB of HBM3e memory, which is 7.2x more than H100-powered EC2 P5d instances, plus a CPU-to-GPU memory interconnect providing up to 7x higher bandwidth than PCIe.
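For readers who want to sanity-check the memory figure, here is a quick back-of-the-envelope calculation. The per-chip numbers are assumptions that are not stated in the announcement summary above: 32 GH200 Superchips per instance with 144GB of HBM3e each, and 8 H100 GPUs with 80GB each in the H100-powered instance.

```python
# Back-of-the-envelope check of the "4.5 TB, 7.2x more" claim.
# All four constants below are assumptions for illustration only.
GH200_CHIPS_PER_INSTANCE = 32   # assumed Superchips per GH200 instance
GH200_HBM_GB = 144              # assumed HBM3e per Superchip, in GB
H100_GPUS_PER_INSTANCE = 8      # assumed GPUs per H100-powered instance
H100_HBM_GB = 80                # assumed HBM per H100 GPU, in GB

gh200_total_gb = GH200_CHIPS_PER_INSTANCE * GH200_HBM_GB
h100_total_gb = H100_GPUS_PER_INSTANCE * H100_HBM_GB

gh200_total_tb = gh200_total_gb / 1024   # convert GB to TB (binary units)
ratio = gh200_total_gb / h100_total_gb   # how many times more memory

print(f"GH200 instance: {gh200_total_tb} TB")
print(f"H100 instance:  {h100_total_gb} GB")
print(f"Ratio:          {ratio:.1f}x")
```

Under those assumptions the totals line up with the quoted 4.5 TB and 7.2x figures.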

All of these will be deployed with liquid cooling for maximum efficiency and rack space utilization. Additionally, the AWS Nitro System supports the EC2 instances by offloading I/O functions to dedicated hardware, for consistent performance and a protected execution environment.

In addition, NVIDIA DGX Cloud, with NVIDIA AI Enterprise integrated, will be coming to AWS as well for easy and fast access to LLM and generative AI model training.

Here is a summary of the other AWS instance improvements coming through the latest collaboration:

AWS EC2 P5e
- H200 GPU with 141GB HBM3e memory (1.8x more capacity and 1.4x faster than H100, up to 3,200Gbps of EFA networking)
AWS EC2 G6e
- L40S GPU (video and graphics-related workloads, cost-effective, energy-efficient)
AWS EC2 G6
- L4 GPU (video and graphics-related workloads, cost-effective, energy-efficient)

Meanwhile, NVIDIA’s new NeMo Retriever microservice offers tools for building highly accurate chatbots and summarization tools using accelerated semantic retrieval. BioNeMo, a specialized platform for drug discovery used by big pharma, will also be coming to AWS on NVIDIA DGX Cloud, on top of its existing availability on Amazon SageMaker.

Back to NeMo Retriever: the microservice uses NVIDIA-optimized algorithms to help generative AI apps produce more accurate responses based on business data residing in the cloud or in data centers.

NVIDIA is currently working on this retrieval-augmented generation (RAG) capability with Cadence, Dropbox, SAP, and ServiceNow to build production-ready models that businesses can use as a reference to craft custom generative AI applications and services more quickly.

While there are certainly open-source RAG toolkits, NVIDIA’s NeMo Retriever “one-ups” them with commercially viable models, API stability, security patches, and enterprise support.

Some examples come in the form of optimized embedding models that capture relationships between words for the most accurate results possible and, if needed, can even support different data types like images, videos, and PDFs.
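To give a rough feel for what retrieval-augmented generation does, here is a minimal Python sketch. It is not NeMo Retriever's API: the `embed`, `cosine`, `retrieve`, and `build_prompt` functions and the sample `DOCS` are all hypothetical, and a simple word-count vector stands in for a real embedding model. The idea is only to show the two steps: find the business document closest to the question, then hand it to the model as context.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, documents: list[str], k: int = 1) -> list[str]:
    """Return the k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]


def build_prompt(query: str, documents: list[str]) -> str:
    """Prepend the retrieved context to the question before calling an LLM."""
    context = "\n".join(retrieve(query, documents))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


# Hypothetical business data for illustration.
DOCS = [
    "refunds are processed within 14 days of a return request",
    "the office is closed on public holidays",
    "support tickets are answered within one business day",
]

if __name__ == "__main__":
    print(build_prompt("how long do refunds take", DOCS))
```

A production system would swap the word-count vectors for a trained embedding model and a vector database, which is the part that services like NeMo Retriever aim to optimize.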


Calvin Liew

Ex-competitive rhythm gamer who is always the "Good but not the best". You'd know me as Vindy if you know where to look. Currently on a quest to own enough keyboards with different plates and just slapping MX Black on them.