Nvidia used to be just a graphics chip vendor, but CEO Jensen Huang wants you to know that the company is now a full-service IT provider, and that he himself may be an artificial construct.
With such ambitions, Nvidia is turning to the cloud, providing both hardware and software as a service. At the company’s GTC Fall conference last week, Huang showed off a few new toys for gamers, but he spent most of his keynote describing the tools Nvidia offers CIOs to speed up computing in the enterprise.
There was hardware for industrial designers in the new Ada Lovelace RTX GPU; a chip to drive autonomous vehicles while entertaining passengers; and the state-of-the-art IGX computing platform for stand-alone systems.
But it wasn’t just hardware. Software (for drug discovery, biology research, language processing, and building metaverses for industry) and services such as consulting, cybersecurity, and software and infrastructure as a service in the cloud were also present.
Huang punctuated his keynote with demos of a single processor performing real-time photorealistic rendering of scenes with natural lighting effects, an AI that can seamlessly fill in missing frames to smooth and speed up animation, and a way to train large language models so that they respond to prompts in context. The quality of those demos made it at least somewhat plausible when, in a videoconference with reporters after the keynote, the on-screen Huang joked, “Don’t be surprised if I’m an AI.”
Kidding aside, CIOs will want to pay close attention to Nvidia’s new set of cloud services, which could allow them to deliver new functionality across their organizations without increasing equipment budgets. That matters at a time when hardware costs are likely to escalate and the industry’s ability to fit more transistors into a given area of silicon has stalled.
“Moore’s Law is dead,” Huang said, referring to Gordon Moore’s 1965 observation that the number of transistors on a chip would double approximately every two years. “And the idea that the cost of a chip will go down over time, unfortunately, is a thing of the past.”
Many factors contribute to the problems of chipmakers like Nvidia, including the difficulty of obtaining vital tools and the rising cost of raw materials such as neon gas (the supply of which has been affected by the war in Ukraine) and the silicon wafers that chips are made from.
“A 12-inch wafer is much more expensive today than it was yesterday,” Huang said. “And it’s not a bit more expensive, it’s a ton more expensive.”
Nvidia’s response to these rising costs is to develop optimized software that helps customers get the most out of its processors, restoring the price-performance balance. “The future is all about full-stack acceleration,” he said. “Computing is not a chip problem. Computing is a software and chip problem, a full-stack challenge.”
NeMo Fine Tuning
To underscore this point, Nvidia announced that it is already busy optimizing its NeMo large language model training software for its new H100 chip, which has just entered full production. The H100 is the first chip based on the Hopper architecture that Nvidia unveiled at its Spring GTC conference in March. Other deep learning frameworks optimized for the H100 include Microsoft DeepSpeed, Google JAX, PyTorch, TensorFlow and XLA, Nvidia said.
NeMo also has the distinction of being one of the first two Nvidia products to be sold as a cloud-based service, the other being Omniverse.
The NeMo Large Language Model Service lets developers train or adapt large language models built by Nvidia to process or predict responses in human languages and computer code. An associated service, BioNeMo LLM, does something similar for protein structures, predicting their biomolecular properties.
Nvidia’s latest innovation in this area allows companies to take a model built from billions of parameters and refine it using a few hundred data points, so that a chatbot can provide answers more appropriate to a particular context. Asked “What are the rental options?”, a chatbot tuned for an ISP might reply, “You can rent a modem for $5 a month”; one tuned for a car rental company, “We can offer economy, compact, and full-size cars”; and one tuned for a property management agency, “We have units ranging from studios to three bedrooms.”
Such tuning, Nvidia said, can be done in hours, whereas training a model from scratch can take months. Tuned models, once created, can be invoked using a “prompt token” combined with the original model. Companies can run the models on premises or in the cloud or, starting in October, access them in Nvidia’s cloud via an API.
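Nvidia hasn’t detailed the mechanics, but the general technique this describes is usually called prompt tuning: the base model’s weights stay frozen, and only a small prompt vector is trained against a handful of context-specific examples. The sketch below illustrates that idea in plain Python with a toy frozen model; every name, dimension, and hyperparameter here is an illustrative assumption, not NeMo’s actual API.

```python
import math
import random

random.seed(0)

D = 6  # toy embedding dimension (illustrative only)

# Frozen "model": a fixed random linear layer followed by tanh.
# In prompt tuning these weights are never updated.
W = [[random.gauss(0, 1) for _ in range(D)] for _ in range(D)]

def model(prompt, x):
    """Frozen model applied to the (trainable prompt + input) vector."""
    combined = [p + xi for p, xi in zip(prompt, x)]
    return [math.tanh(sum(W[i][j] * combined[j] for j in range(D)))
            for i in range(D)]

# A "few hundred data points" in Nvidia's description; a handful here.
X = [[random.gauss(0, 1) for _ in range(D)] for _ in range(8)]
Y = [[random.gauss(0, 0.5) for _ in range(D)] for _ in range(8)]  # target responses

prompt = [0.0] * D  # the ONLY trainable parameters

def loss(p):
    total = 0.0
    for x, y in zip(X, Y):
        out = model(p, x)
        total += sum((o - t) ** 2 for o, t in zip(out, y))
    return total / len(X)

start = loss(prompt)
lr, eps = 0.02, 1e-4
for _ in range(300):
    base = loss(prompt)
    # Numerical gradient w.r.t. the prompt only; model weights stay frozen.
    grad = []
    for i in range(D):
        bumped = list(prompt)
        bumped[i] += eps
        grad.append((loss(bumped) - base) / eps)
    prompt = [p - lr * g for p, g in zip(prompt, grad)]

end = loss(prompt)
print(f"loss before tuning: {start:.3f}, after: {end:.3f}")
```

A production service would presumably optimize continuous prompt embeddings inside a transformer with real backpropagation; the finite-difference gradient above just keeps the sketch dependency-free while preserving the key property: the original model is untouched, and only the tiny prompt is adapted per customer.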
Nvidia’s Omniverse platform is the foundation for the company’s other suite of cloud services.
Huang described the platform as having three key features. One is the ability to ingest and store three-dimensional information about worlds: “It’s a modern cloud database,” Huang said. Another is its ability to connect devices, people or software agents to this information and to each other. “And the third one gives you a window into this new world, another way of saying it’s a simulation engine,” Huang said.
These simulations can be of the real world, as when companies create digital twins of manufacturing facilities or products, or of fictional worlds used to train sensor networks (with Omniverse Replicator), robots (with Isaac Sim), and autonomous vehicles (with Drive Sim) by providing them with simulated sensor data.
There’s also Omniverse Nucleus Cloud, which provides a shared Universal Scene Description store for 3D scenes and data that can be used for online collaboration, and Omniverse Farm, a scalable tool for scene rendering and synthetic data generation using Omniverse.
Industrial giant Siemens is already using the Omniverse platform to develop digital twins for manufacturing, and Nvidia said the company is now working to provide these services to its customers using Omniverse Cloud.
Omniverse Farm, Replicator, and Isaac Sim are already available in containers that enterprises can deploy on Amazon Web Services compute cloud instances equipped with Nvidia GPUs, but enterprises will have to wait for the general availability of other Omniverse Cloud applications as services managed by Nvidia. The company is now accepting applications for early access.
Nvidia is also opening up new channels to help businesses consume its new products and services. Management consulting provider Booz Allen Hamilton is offering businesses a new cybersecurity service it calls Cyber Precog, based on Nvidia Morpheus, an AI cybersecurity processing framework, while Deloitte will offer services to companies around Nvidia’s Omniverse software suite, the companies announced at the GTC.
While Nvidia is working with consultants and systems integrators to roll out its SaaS and hardware rental offerings, that doesn’t mean it’s going to stop selling hardware. Huang noted that some organizations, usually start-ups or those that only use their infrastructure sporadically, prefer to rent, while larger, established companies prefer to own their infrastructure.
He likened the process of training AI models to running a factory. “Nvidia is now in the factory business, the most important factory of the future,” he said. Where today’s factories receive raw materials and manufacture products, he said, “In the future, factories will receive data, and what will come out of it will be intelligence or models.”
But Nvidia needs to bundle its hardware and software factories for CIOs in different ways, Huang said: “Just like factories today, some people prefer to outsource their factory, and others prefer to own it. It just depends on the business model you are in.”