Founder and CEO of Nvidia Jensen Huang speaks during The New York Times annual DealBook Summit in New York City on Nov. 29, 2023.
Michael M. Santiago | Getty Images
Nvidia found itself at the center of the artificial intelligence boom last year as its expensive server graphics processors, including the H100, became essential for training and deploying generative AI such as OpenAI’s ChatGPT. Now, Nvidia is playing up its strength in consumer GPUs for so-called “local” AI that can run on a PC or laptop from home or an office.
Nvidia announced three new graphics cards on Monday: the RTX 4070 Super, RTX 4070 Ti Super and RTX 4080 Super, priced from $599 to $999. These cards have additional “tensor cores” that are designed to run generative AI applications. Nvidia will also provide graphics cards in laptops from companies such as Acer, Dell and Lenovo.
Demand for Nvidia’s enterprise GPUs, which cost tens of thousands of dollars each and often come in a system with eight GPUs working together, led to a surge in overall Nvidia sales and a market value of more than $1 trillion.
GPUs for PCs have long been Nvidia’s bread and butter, aimed at running video games, but the company says this year’s graphics cards have been improved with an eye toward running AI models without sending information back to the cloud.
The new consumer-level graphics chips will be primarily used for gaming, but can still rip through AI applications, the company says. For example, Nvidia says the RTX 4080 Super can generate AI video 150% faster than the last-generation model. Other software improvements the company recently announced will make large language model processing five times faster, Nvidia said.
“With 100 million RTX GPUs shipped, they provide a massive installed base for powerful PCs for AI applications,” Justin Walker, Nvidia’s senior director of product management, told reporters at a press conference.
Nvidia expects new AI applications to emerge over the next year to take advantage of the increased horsepower. Microsoft is expected to release a new version of Windows later this year, Windows 12, which can take further advantage of AI chips.
The new chips can be used to generate images with Adobe Photoshop’s Firefly generator or to remove backgrounds in video calls, Walker said. Nvidia is also creating tools that would allow game developers to integrate generative AI into their titles, for example, to generate dialogue for a nonplayer character.
Edge vs. Server
Nvidia’s 4070 Ti Super graphics cards.
Nvidia
Nvidia’s chip announcements this week show that while it has been the company most associated with big server GPUs, it’s going to compete with Intel, AMD and Qualcomm in local AI as well. All three have announced new chips that will power so-called “AI PCs” with specialized parts for machine learning.
Nvidia’s move comes as the technology industry is working out the best way to deploy generative AI, which requires a huge amount of computing power and can be very expensive to run on cloud services.
One technical solution, being promoted by Microsoft and Nvidia rivals, is the “AI PC,” sometimes called “edge compute.” Instead of relying on powerful supercomputers over the internet, devices will have more powerful AI chips inside them, so they can run so-called large language models or image generators locally, albeit with some trade-offs and shortcomings.
Nvidia proposes applications that can use a cloud model for tricky questions, and a local AI model for tasks that need to be done quickly.
“Nvidia GPUs in the cloud can be running really big large language models and using all that processing power to power very large AI models, while at the same time RTX tensor cores in your PC are going to be running more latency-sensitive AI applications,” said Nvidia’s Walker.
The new graphics cards will be compliant with export controls and can be shipped to China, the company said, offering an alternative for Chinese researchers and companies that can’t get Nvidia’s most powerful server GPUs.