What’s Next in AI: NVIDIA’s Jensen Huang Talks With WIRED’s Lauren Goode
Hello everyone. Welcome, Siggraph. It is my first Siggraph. I'm so excited to be here.
I'm so excited to speak to all of you, and I'm so excited to speak to NVIDIA founder and CEO Jensen Huang. Thank you. Great to see you again.
Thank you. Great to see you. Welcome to Siggraph, Lauren. Welcome to my hood. You're a regular here, huh? Hey, everybody. Great to see you guys. Jensen, it is like 99 degrees outside.
I know it's freezing in here, isn't it? I mean... I dunno I'm shaking. Leather jacket. Yeah. Feels great.
Looks good too. All right so you have- I’m wearing a brand new one. Oh, a brand new one. Yeah.
How many of those do you have? I don't know, but Lori got me a new one for Siggraph. She said you're excited about Siggraph? Here's a new jacket. Go, go do a good job.
It looks sharp. Thank you. All right. You have a long history with Siggraph. I mean, when you think about the history of this conference, which has been going on since 1974, and you think about the history of NVIDIA from the 1990s onward, where your DNA was really in, you know, computer graphics, helping to make beautiful graphics.
What is the significance of NVIDIA being here today, right now at a conference like Siggraph? Well, you know, Siggraph used to be about computer graphics. Now it's about computer graphics and generative AI. It's about simulation.
It's about generative AI. And we all know that the journey of NVIDIA, which started out in computer graphics, as you said, really brought us here. And so I made a cartoon for you.
I made a cartoon for you of our journey. Did you make it or did generative AI make it? Oh, hang on a second. I had it made. I had it made. That's what CEOs do. We don't do anything. We just have it be done. It kind of starts something like this.
Hey, guys, wouldn't it be great if we had a cartoon that illustrated some of the most important milestones in the computer industry, how it led to NVIDIA, and where we are today? And also do it in three hours. In three hours. And do it in three hours, right. And so this cartoon here is really terrific.
So these are some of the most important moments in the computer industry: the IBM System/360, of course, the invention of modern computing; the teapot in 1975, the Utah teapot.
1979, ray tracing, Turner Whitted, one of the great researchers and an NVIDIA researcher for a long time. 1986, programmable shading.
Of course, most of the animated movies that we see today wouldn't be possible if not for programmable shading, originally done on the Cray supercomputer. And then in 1993, NVIDIA was founded; Chris, Curtis, and I founded the company. In 1995, the Windows PC revolutionized the personal computer industry, put a personal computer in every home and on every desk, and the multimedia PC was invented.
In 2001 we invented the first programmable shading GPU, and that really drove the vast majority of NVIDIA's journey up to that point. But in the background of everything we were doing was accelerated computing. We believed that you could create a type of computing model that could augment general-purpose computing so that you can solve problems that normal computers can't.
And the application we chose first was computer graphics, and it was probably one of the best decisions we ever made, because computer graphics was insanely computationally intensive and remains so for the entire 31 years that NVIDIA has been here. Since the beginning of computer graphics, in fact, it required a Cray supercomputer to render some of the original scenes, so that kind of tells you how computationally intensive it was. And it was also incredibly high volume, because we applied computer graphics to an application that at the time wasn't mainstream: 3D video games. The combination of very large volume and a very complicated computing problem led to a very large R&D budget for us, which drove the flywheel of our company.
That observation we made in 1993 was spot on, and it led us to be able to pioneer the work that we're doing in accelerated computing. We tried it many times. CUDA was, of course, the revolutionary version, but prior to that we had a computing model we called Cg, C for graphics, C on top of GPUs.
And so we've been working on accelerated computing for a long time, promoting and evangelizing CUDA, getting CUDA everywhere, and putting it on every single one of our GPUs, so that this computing model was compatible with any application written for it, irrespective of which generation of our processors. That was a great decision. And one day in 2012, we made our first contact, you know, Star Trek first contact, with artificial intelligence.
That first contact was AlexNet in 2012, a very big moment. We made the observation that AlexNet was an incredible breakthrough in computer vision, but at the core of it, deep learning was deeply profound: it was a new way of writing software. Instead of engineers, given an input, imagining what the output was going to be and writing algorithms, we now have a computer that, given inputs and example outputs, figures out what the program in the middle is. That observation, and the realization that we could use this technique to solve a whole bunch of problems that were previously unsolvable, was a great observation, and we changed everything in our company to pursue it, from the processor to the systems to the software stack to all the algorithms. NVIDIA basic research pivoted toward working on deep learning.
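To make that idea concrete, here is a minimal Python sketch of what Huang describes: instead of hand-writing the rule, you give the computer inputs and example outputs and let optimization find the mapping. The data and the tiny model are toy stand-ins for illustration, not anything NVIDIA ships.

```python
import numpy as np

# Toy "examples": inputs x and the outputs we want, produced by a hidden rule y = 3x - 2.
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=200)
y = 3.0 * x - 2.0

# Two parameters; gradient descent "figures out the program in the middle" from examples.
w, b = 0.0, 0.0
lr = 0.1
for _ in range(500):
    pred = w * x + b
    err = pred - y
    w -= lr * np.mean(err * x)   # gradient of mean squared error w.r.t. w
    b -= lr * np.mean(err)       # gradient w.r.t. b

print(f"learned program: y ≈ {w:.2f} * x + {b:.2f}")  # approaches y = 3.00x - 2.00
```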
By the way, this is a great place for research. As you know, NVIDIA is passionate about Siggraph, and this year we have 20 papers at the intersection of generative AI and simulation. And so in 2016, we introduced the first computer we built for deep learning, and we called it DGX-1. And I delivered the first DGX-1 outside of our company. I built it for NVIDIA to build models for self-driving cars and robotics and such, and generative AI for graphics.
But somebody saw an example of DGX-1. Elon reached out to me and said, hey, I would love to have one of those for a startup company we're starting. And so I delivered the first one to a company that, at the time, nobody knew about, called OpenAI. And so that was 2016.
2017 was the transformer, which revolutionized modern machine learning, modern deep learning. In 2018, right here at Siggraph, we announced RTX, the world's first real-time interactive ray tracing platform. It was such a big deal that we changed the name of GTX, which everybody referred to our graphics cards as, to RTX. Another shout-out to a great researcher, Steven Parker. Many of you know he had been coming to Siggraph for a long time.
He passed away this year, and he was one of the core pioneering researchers behind real-time ray tracing. We miss him dearly. And so, anyway. And you mentioned last year during your Siggraph keynote that RTX, ray tracing extreme, was one of the big, important moments when computer graphics met AI. That's right.
But that had been happening for a while, actually. So what was so important about RTX in 2018? Well, RTX in 2018: you know, we accelerated ray traversal and bounding box detection, and we made it possible to use a parallel processor to accelerate ray tracing. But even then we were ray tracing at about, call it, ten frames per second, let's say.
Maybe five frames per second, depending on how many rays we're talking about tracing. And we were doing it at 1080p resolution. Obviously video games need a lot more than that. Obviously real-time graphics need more than that. And this crowd definitely knows what that means, but for the folks who are watching online, who don't work in this field, this is basically a way of really manipulating light in computer graphics. Simulating how light interacts with- True to life, happening in real time.
That's right. The rendering processes used to take a really long time when you were making something- It used to take a Cray supercomputer to render just a few pixels, and now we have RTX to accelerate ray tracing. It was interactive, it was real time, but it wasn't fast enough to be a video game. And so we realized that we needed a big boost, probably something along the lines of a 20x, maybe 50x boost.
And so the team invented DLSS, which basically renders one pixel while it uses AI to infer a whole bunch of other pixels. We basically trained an AI that is conditioned on what it has seen, and then it fills in the dots for everything else.
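Here is a minimal PyTorch sketch of the general pattern Huang describes: render at low resolution and let a learned network infer the missing pixels. This is an illustrative toy upscaler, not DLSS itself or any NVIDIA API; the network shape, sizes, and random input are assumptions for the sketch.

```python
import torch
import torch.nn as nn

class ToyUpscaler(nn.Module):
    """Toy stand-in for a learned super-resolution pass: 2x upscale of an RGB frame."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv2d(32, 3 * 4, kernel_size=3, padding=1),  # 4 = 2x2 upscale factor
            nn.PixelShuffle(2),                               # rearrange channels into pixels
        )

    def forward(self, low_res_frame):
        return self.net(low_res_frame)

# "Render" (here: fake) a frame at quarter resolution, then infer the rest of the pixels.
low_res = torch.rand(1, 3, 540, 960)   # rendered pixels
model = ToyUpscaler()
high_res = model(low_res)              # inferred pixels fill in the detail
print(high_res.shape)                  # torch.Size([1, 3, 1080, 1920])
```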
And now we're able to render fully ray traced, fully path traced simulations at 4K resolution at 300 frames per second, made possible by AI. And so 2018 came along, and then in 2022, as we all know, ChatGPT came out. What's that again? ChatGPT, you know. Okay. ChatGPT, you know that. OpenAI's ChatGPT, a revolutionary new AI capability and the fastest-growing service in history. But the two things that I wanted to highlight: since ChatGPT, industry researchers, many of them in this room, have figured out how to use AI to learn everything, not just words, but the meaning of images and videos and 3D, chemicals, proteins, physics, thermodynamics, fluid dynamics, particle physics.
It's figured out the meaning of all these different modalities. And since then, not only have we learned it, we can now generate it. And so that's the reason why you can go from text to images, text to 3D, images to text, 3D to text, text to video, and so on and so forth. Text to proteins, text to chemicals. And so now generative AI has been made possible.
And this is really the revolutionary time that we're in. Just about every industry is going to be affected by this, just based on some of the examples I've already given you, whether it's scientific computing trying to do a better job predicting the weather with a lot less energy, augmenting and collaborating with creators to generate images, or generating virtual scenes for industrial digitalization. And very importantly, robotics and self-driving cars are all going to be transformed by generative AI. And so here we are in this brand new way of doing things. And so let me just very quickly, Lauren: if you look at where we started in the upper left in 1964, the way that software was programmed was human engineers programming software. Now we have machines that are learning how to program, writing software that no human can, solving problems that we could barely imagine before.
And now, because we have generative AI, there's a new way of developing software. I don't know if you know Andrej Karpathy? He's a really, really terrific researcher. I met him when he was at Stanford, and he coined the terms: the original way of doing software is Software 1.0, and machine learning is Software 2.0. And now we're really moving toward Software 3.0, because with these generative AIs, in the future, instead of using machine learning to learn a new AI each time, you'll probably start with pre-trained models, foundation models that are already pre-trained.
And the way that we develop software could very much be like assembling teams of experts with various AI capabilities: some that use tools, some that are able to generate special things, and then a general-purpose AI that's really good at reasoning, connecting this network of AIs together, solving problems like teams solve problems. And so Software 3.0 is here. I've gotten the sense from talking to you recently that you are optimistic that these generative AI tools will become more controllable, more accurate.
We all know that there are issues with hallucinations and low-quality outputs, that people are using these tools and maybe not getting exactly the output that they're hoping for, right? Meanwhile, they're using a lot of energy, which we're going to talk about. Why are you so optimistic about this? What do you think is pointing us in the direction of this generative AI actually becoming that much more useful and controllable? Well, the big breakthrough of ChatGPT was reinforcement learning from human feedback, which was a way of using humans to produce the right answers, or the best answers, to align the AI to our core values, or to align the AI to the skills that we would like it to perform. That's probably the extraordinary breakthrough that made it possible for them to open ChatGPT for everyone to use. Other breakthroughs have arrived since then. Guardrailing, which causes the AI to focus its energy, or focus its response, on a particular domain so that it doesn't wander off and pontificate about all kinds of stuff that you ask it about.
It would only focus on the things that it's been trained to do, aligned to perform, and that it has deep knowledge of. The third breakthrough is called retrieval-augmented generation, which basically uses data that has been vectorized, or embedded, so that we understand the meaning of that data. It's a more authoritative data set. It goes beyond just the training data set. And it actually pulls from other sources.
That's right. It's not just the pre-trained data source. For example, it might be all of the articles that you've ever written, all of the papers that you've ever written. And so now it becomes an AI that's authoritative on you, and it could essentially be a chatbot of you. So everything that I've ever written or ever said could be vectorized and then turned into a semantic database.
And then, before an AI responds, it would look at your prompt, search for the appropriate content from that vector database, and augment it in its generative process. And you think that is one of the most important factors? These three in combination really made it possible for us to do that with text. Now, the thing that's really cool is that we are now starting to figure out how to do that with images.
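As a rough illustration of the retrieval step Huang describes, here is a minimal Python sketch: documents are embedded into vectors, the prompt is embedded the same way, the closest documents are retrieved, and they are prepended to the prompt before generation. The hashed bag-of-words embed() and the stubbed generate() are toy placeholders, not NVIDIA's or OpenAI's APIs.

```python
import numpy as np
from collections import Counter

def embed(text: str, dim: int = 256) -> np.ndarray:
    """Toy embedding: hashed bag-of-words. A real system would use a learned embedding model."""
    vec = np.zeros(dim)
    for word, count in Counter(text.lower().split()).items():
        vec[hash(word) % dim] += count
    norm = np.linalg.norm(vec)
    return vec / norm if norm else vec

# "Everything you've ever written," vectorized into a small semantic database.
documents = [
    "Accelerated computing augments general-purpose computing.",
    "DLSS renders some pixels and uses AI to infer the rest.",
    "Omniverse composes 3D scenes that condition image generation.",
]
index = np.stack([embed(d) for d in documents])

def retrieve(prompt: str, k: int = 2) -> list[str]:
    scores = index @ embed(prompt)                 # cosine similarity (vectors are normalized)
    return [documents[i] for i in np.argsort(scores)[::-1][:k]]

def generate(augmented_prompt: str) -> str:
    return f"[an LLM would answer here, grounded in]\n{augmented_prompt}"  # stub generator

prompt = "How does DLSS work?"
context = "\n".join(retrieve(prompt))
print(generate(f"Context:\n{context}\n\nQuestion: {prompt}"))
```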
Right. And, you know, Siggraph is really a lot about images and generation. And so if you look at today's generative AI, you can give it a prompt, and it goes into, in this particular case, an Edify AI model that NVIDIA created. It's a text-to-2D foundation model.
It's multimodal. And we partnered with Getty to use their library of data to train an AI model. And so this is text to 2D image. And you also created this slide personally, right? I personally had the slide created.
And so, imagine: I'm the prompt, and then there's a team that's like a generative AI, and then magically this slide shows up. And so here's a prompt, and this could be a prompt for somebody who owns a brand. In this case it's Coca-Cola; it could be a car, it could be a luxury product, it could be anything.
And so you use the prompt and generate the image. However, as you know, it's hard to control with just a prompt, and it may hallucinate. It may create the image in a way that is not exactly what you want. And fine-tuning this using words is really, really hard, because, as you know, words are very low-dimensional.
Language is extremely compressed in its content, but it's very imprecise. And so controlling that image is difficult to do. So we've created a way that allows us to control and align it with more conditioning.
And so the way you do that is we create another model. This model, for example, the one on the bottom, allows us to do text to 3D. It's Edify 3D, one of our foundation models. We've created this AI foundry where partners can come and work with us.
And we create the model for them with their data. We invent the model, they bring their data, and we create a model that they can take with them. Is it only using their data? This one only uses the data that's available on Shutterstock that they have the rights to use for training. And so we now use the prompt to generate 3D.
We put that in Omniverse. Omniverse, as you know, is a place where you can compose data and content from a lot of different modalities. It could be 3D, it could be AI, it could be animation, and it could be materials.
And so we use Omniverse to compose all of this multimodal data. And now you can control it. You can change the pose, you can change the placement, you can change whatever you like.
And then you use that image out of Omniverse to condition the prompt. Okay. So you take what comes out of Omniverse, you now augment it with the prompt.
It's a little bit like retrieval-augmented generation; this is now 3D-augmented generation. The Edify model, the one with Getty, is multimodal, so it understands the image, understands the prompt, and uses them in combination to create a new image.
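NVIDIA's Edify pipeline isn't public, but the conditioning pattern Huang describes resembles open-source image-conditioned diffusion such as ControlNet. Here is a rough sketch with Hugging Face diffusers, where a rendered guide image steers generation alongside the text prompt; the model names, guide image path, and parameters are illustrative assumptions, not NVIDIA's stack.

```python
import torch
from diffusers import StableDiffusionControlNetPipeline, ControlNetModel
from diffusers.utils import load_image

# A structural guide image, e.g. an edge or depth render exported from a composed 3D scene.
guide = load_image("omniverse_render.png")  # hypothetical file

controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-canny", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16
).to("cuda")

# The text prompt sets the look; the guide image pins down layout and pose,
# so the result stays aligned with the composed scene instead of hallucinating one.
image = pipe(
    "a product shot of a soda can on a wooden table, studio lighting",
    image=guide,
    num_inference_steps=30,
).images[0]
image.save("controlled_image.png")
```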
So now this is a controlled image. And so this way we can use generative AI as a collaborator, as a partner that works with us, and we can generate images exactly the way we like. How does this translate to the physical world? How does it translate to something like robotics? Well, we're going to talk about robotics, but one of the things that I would love to show you, and I had this made, not by myself... well, I had it made myself.
Okay, this is an incredible video. This is work done by WPP and Shutterstock, working with some world-class, world-famous brands that you'll know. Let's run the video.
Build me a table in an empty room surrounded by chairs in a busy restaurant. Build me a table with tacos and bowls of salsa in the morning light. Build me a car on an empty road, surrounded by trees, by a modern house. Build me a house with hills in the distance and bales of hay in the evening sun.
Build me a tree in an empty field. Build me hundreds of them in all directions, with bushes and vines hanging in between. Build me a giant rainforest with exotic flowers and rays of sunlight. Isn't that incredible? And so, this is what happened.
We taught an AI how to speak USD, OpenUSD. And so the young woman is speaking to Omniverse. Omniverse generates the USD and uses USD search to find the 3D objects in the catalog that it has. It composes the scene using words, and then generative AI uses that augmentation to condition the generation process, and therefore the work that you do can be much, much better controlled.
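The USD scene description Huang mentions can be built programmatically. Here is a minimal sketch with the OpenUSD Python bindings (pxr), composing a tiny stage the way a text-to-USD layer might; the prim names, shapes, and transforms are made up for illustration.

```python
from pxr import Usd, UsdGeom, Gf

# Compose a tiny scene: a table (a flattened cube) with two chairs (cylinders) around it.
stage = Usd.Stage.CreateInMemory()
UsdGeom.Xform.Define(stage, "/World")

table = UsdGeom.Cube.Define(stage, "/World/Table")
table.AddTranslateOp().Set(Gf.Vec3d(0, 0.75, 0))
table.AddScaleOp().Set(Gf.Vec3f(1.0, 0.05, 0.6))

for i, x in enumerate((-1.2, 1.2)):
    chair = UsdGeom.Cylinder.Define(stage, f"/World/Chair_{i}")
    chair.AddTranslateOp().Set(Gf.Vec3d(x, 0.45, 0))

# The resulting .usda text is what downstream tools (or a generative model) consume.
print(stage.GetRootLayer().ExportToString())
```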
You can even collaborate with people, because you can collaborate in Omniverse, and you can collaborate in 3D. It's hard to collaborate in 2D. And so we can collaborate in 3D and augment the generation process. I imagine a lot of people in this room aren't just technical; they're also storytellers. This is a very technical room. Storytellers see something like this. It's like 90% PhDs in here.
I'm not even going to ask you to raise your hands, but I'm sure that would be fascinating. So they see something like this, I see something like this, and I think, okay, that's pretty amazing. You are speeding up rendering times. You're creating images out of nothing.
There are probably just as many people thinking, what does this mean for my job? Where do you see the line being drawn between this is augmenting and helping people, and this is replacing certain things that humans do? Well, that's what tools do. We invent tools here. You know, this conference is about inventing technology that ultimately ends up being a tool. And a tool either accelerates our work or collaborates with us so that we can do better work, or even bigger work,
or do work that was impossible before. And so I think what you'll likely see is that generative AI is now going to be more controllable than before. We've been able to do that using RAG, retrieval-augmented generation, to control text generation better and reduce hallucination. Now we're using Omniverse with generative AI to control generated images better and reduce hallucination. Both of those tools help us be more productive and do things that we otherwise can't do. And so I think, for all of the artists in the world, what I would say is: jump on this tool, give it a try.
Imagine the stories that you're going to be able to tell with these tools. And with respect to jobs, I would say that it is very likely all of our jobs are going to be changed. In what way? Well, my job is going to change. In the future, I'm going to be prompting a whole bunch of AIs. Everybody will have an AI that is an assistant. And so every single company, every single job within the company, will have AIs that are assistants to them.
Our software programmers, as you know, now have AIs that help them program. All of our software engineers have AIs that help them debug software. We have AIs that help our chip designers design chips. Without AI, Hopper wouldn't have been possible. Without AI, Blackwell wouldn't be possible. You know, this week we're sampling, we're sending out engineering samples of Blackwell all over the world.
They're under people's chairs right now. I think if you just look, you get a GPU, you get a GPU. Yeah, you get a GPU, you get a GPU.
Yeah. That's right. Supply chain, what? We all wish. Yeah. And so, none of the work that we do would be possible anymore without generative AI.
And that's increasingly the case with our IT department helping our employees be more productive. It's increasingly the case with our supply chain team optimizing supply to be as efficient as possible, or our data center team using AI to manage the data centers to save as much energy as possible. You mentioned Omniverse before. Yeah. That's not new, but the idea that more generative AI would be within Omniverse, helping people create these simulations or digital twins. Yeah, that's what we're announcing this week, by the way.
Talk about that. Omniverse now understands text to USD, and it has a semantic database so that it can do a search of all the 3D objects.
And that's how the young lady was able to say, fill the scene with a whole bunch of trees, describing how she would like the trees to be organized, and it populates the scene with all these 3D trees. Then, when that's done, that 3D scene goes into a generative AI model, which turns it into something photorealistic. And if you want the Ford truck not to be augmented, but to use the actual brand ground truth, then it would honor that and keep it in the final scene. And so, one of the things that we talked about is how every single group in the company will have AI assistants. And there have been a lot of questions lately about whether all this infrastructure we're building is leading to productive work in companies.
I just gave you an example of how NVIDIA's designs would be impossible without generative AI. So we use it to transform the way we work, but we also use it, in many of the examples I've just shown you, to create new products and new technology, whether that makes real-time ray tracing possible, or Omniverse, which helps us imagine and create much larger scenes, or our self-driving car work, or our robotics work. None of that new capability would be possible without it. And so one of the things that we're announcing here this week is the concept of digital agents, digital AIs that will augment every single job in the company. And one of the most important use cases that people are discovering is customer service. Every single company has customer service; every single industry has customer service.
Today, it's humans doing customer service. But in the future, my guess is it's going to be humans still, but with AI in the loop. And the benefit of that is that you'll be able to retain the experiences of all the customer service agents that you have and capture that institutional knowledge, which you can then run through analytics and use to create better services for your customers. Just now, I showed you Omniverse-augmented generation for images.
This is a RAG, a retrieval-augmented generative AI. And what we've done is create this customer service microservice that sits in the cloud, and it's going to be available, I think, today or tomorrow.
And you can come and try it. We've connected to it a digital human front end, basically an IO, the IO of an AI, that has the ability to speak, make eye contact with you, and animate in an empathetic way. And you can decide to connect your ChatGPT or your AI to the digital human, or you can connect your digital human to our retrieval-augmented generation customer service AI. However you like to do it, we're a platform company.
So irrespective of which piece you would like to use, they're completely open source, and you can come and use the pieces that you like. If you would like the incredible digital human rendering technology that we've created for rendering beautiful faces, which requires subsurface scattering with path tracing, this breakthrough is really quite incredible. Amazing graphics researchers, welcome to Siggraph 2024. It makes it possible to animate using an AI. So you chat with the AI, and it generates text.
That text is then translated to sound, text to speech, and that speech, the sound, animates the face. And then RTX path tracing does the rendering of the digital human. And so all of this is available for developers to use, and you can decide which parts you would like to use.
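Stringing the stages together, here is a hedged Python sketch of the pipeline Huang outlines: chat, text, speech, audio-driven facial animation, then path-traced rendering. Every function below is a hypothetical placeholder standing in for components such as an LLM, a TTS service, audio-to-face animation, and an RTX renderer; it is not NVIDIA's actual API.

```python
from dataclasses import dataclass

@dataclass
class Frame:
    """Placeholder for a rendered image of the digital human."""
    pixels: bytes

# Hypothetical stage functions; each would be backed by a real service in practice.
def chat_llm(user_text: str) -> str:                      # e.g. a RAG-backed customer-service model
    return f"Happy to help with: {user_text}"

def text_to_speech(text: str) -> bytes:                   # e.g. a TTS service returning audio bytes
    return text.encode("utf-8")                           # stub audio

def audio_to_blendshapes(audio: bytes) -> list[float]:    # e.g. audio-driven facial animation
    return [len(audio) % 10 / 10.0] * 52                  # stub facial weights

def render_path_traced(blendshapes: list[float]) -> Frame:  # e.g. an RTX path tracer
    return Frame(pixels=bytes(int(w * 255) for w in blendshapes))

def digital_human_turn(user_text: str) -> Frame:
    reply = chat_llm(user_text)            # chat generates text
    audio = text_to_speech(reply)          # text to speech
    weights = audio_to_blendshapes(audio)  # speech animates the face
    return render_path_traced(weights)     # path tracing renders the frame

frame = digital_human_turn("Where is my order?")
print(len(frame.pixels), "bytes rendered")
```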
How are you thinking about the ethics of something like this? You're releasing this to developers, to graphics artists, but these are being unleashed into the world. Do you think a chatbot like that, a very human-like visual chatbot, should say what it is, that it's a chatbot? Is it so human that people start mistaking it for humans? People are emotionally vulnerable. It's still pretty robotic.
And I think that's not a terrible thing. You know, we're going to be robotic for some time. I think we've made this digital human technology quite realistic.
But you and I know it's still a robot. And so I think that's not a horrible thing. It is the case that there are many, many different applications where the engagement is much better with a human representation, or a near-human representation, than with a text box.
Maybe somebody needs a companion, or health care needs to advise somebody who is an outpatient who just got home, or it's helping the elderly, or it's a tutor to educate a child. There's a whole bunch of applications, and all of them are better off having something that is much more human and able to connect with the audience. That's interesting. What I hear you talking a lot about today, these are software developments, right? They rely on your GPUs, but ultimately this is software.
This is NVIDIA going further up the stack. Meanwhile, there are some companies, some folks in the generative AI space, who are in software and cloud services but are looking to go further down the stack, right? They might be developing their own chips or TPUs that are competitive with what you are doing. How crucial is this software strategy to NVIDIA maintaining its lead and actually fulfilling some of these promises of growth that people are looking at for NVIDIA right now? Well, we've always been a software company, software even first, and the reason for that is that accelerated computing is not general-purpose computing. General-purpose computing can take any C program, a C++ program, Python, and just run it, and almost everybody's program can be compiled to run effectively.
Unfortunately, when you want to accelerate fluid dynamics, you have to understand the algorithms of fluid dynamics so that you can refactor them in such a way that they can be accelerated. And you have to design an accelerator, you have to design a CUDA GPU, so that it understands the algorithms and can do a good job accelerating them. And the benefit, of course, is that by doing so, by redesigning the whole stack, we can accelerate applications 20, 40, 50, 100 times.
For example, we just put NVIDIA GPUs in GCP running pandas, which is the world's leading data science platform, and we accelerate it 50x to 100x over general-purpose computing. In the case of deep learning, over the course of the last 10 to 12 years or so, we've accelerated deep learning about a million times, which is the reason it's now possible for us to create these large language models. A million-times speedup, a million-times reduction in cost and energy, is what made generative AI possible. But that's by designing a new processor, new systems, Tensor Core GPUs, the NVLink switch fabric, which is completely groundbreaking for AI.
Of course, the systems themselves, the algorithms, the distributed computing libraries we call Megatron that everybody uses, TensorRT-LLM, those are algorithms. And if you don't understand the algorithms and the applications above them, it's really hard to figure out how to design that whole stack. What is the most important part of NVIDIA's software ecosystem for NVIDIA's future? Well, every single one of them takes a new library.
We call them DSLs, domain-specific libraries. In generative AI, that DSL is called cuDNN. For SQL processing and data frames, it's called cuDF; so for SQL and pandas, cuDF is what makes it possible for us to accelerate that. For quantum emulation, it's called cuQuantum. cuFFT. We've got a whole bunch of cu's. Computational lithography, which makes it possible for us to help the industry advance the next generation of process technology, is called cuLitho. The number of cu's goes on and on.
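As a concrete example of one of these domain-specific libraries, here is a short sketch using RAPIDS cuDF, which exposes a pandas-like API on the GPU. The CSV path and column names are made up, and it assumes a CUDA-capable GPU with cuDF installed.

```python
import cudf

# cuDF mirrors the pandas API, but the dataframe lives in GPU memory.
df = cudf.read_csv("transactions.csv")   # hypothetical file with 'store' and 'amount' columns
summary = (
    df.groupby("store")["amount"]
      .agg(["count", "mean", "sum"])
      .sort_values("sum", ascending=False)
)
print(summary.head())

# For existing pandas code, the cudf.pandas accelerator mode can be enabled instead,
# e.g. `python -m cudf.pandas your_script.py`, leaving `import pandas as pd` unchanged.
```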
Every time we introduce a domain-specific library, it exposes accelerated computing to a new market. And so, as you see, it takes that collaboration, the full stack of the library and the architecture and the go-to-market and the developers and the ecosystems around it, to open up a new field. And so it's not just about building the accelerator. You have to build a whole stack.
NVIDIA is dependent on a lot of things going right. Your foray into the future, your innovation, depends on a lot of things going right. You have to continue pushing the laws of physics. You do have competitors who are nipping at your heels at all times. We've talked about this. What keeps you up at night?
You're also somewhat reliant on the geopolitical stability- Last night, just so you know: elevation. Drink some water. That's what they told me. But it was too late by the time I learned about it. This morning I woke up with a terrible headache. Elevation.
That's what last night was. Okay, so elevation. Okay, so elevation. But truly, you have to keep pushing the laws of physics. You have competitors who are nipping at your heels on both the software and hardware side. You are somewhat reliant on the geopolitical stability of the South China Sea. Geopolitics. So much going on right now.
You're reminding me it's super hard building a company. You're making me nervous. I was fine before.
There are so many things you have to do. He wants to go back to showing you slides. But truly, you've had a lot of tailwinds, and I'm just wondering, how optimistic are you that these things are going to keep trending in your direction? Things have never trended in our direction. You have to will the future into existence. Accelerated computing: you know, the world wants general-purpose computing, and the reason for that is because it's easy.
You just have the software. It just runs twice as fast every single year. Don't even think about it.
And, you know, every five years it's ten times faster; every ten years it's a hundred times faster. What's not to love? But of course, you can shrink a transistor, but you can't shrink an atom. And eventually the CPU architecture ran its course. And so it's not sensible anymore, as the technology no longer gives us those leaps, to expect that a general-purpose instrument could be good at everything, could be good at these incredible things from deep learning to quantum simulation to molecular dynamics to fluid dynamics to computer graphics.
And so we created this accelerated computing architecture to do that. But that's a fight; that's a headwind. Do you see what I'm saying? Because general-purpose computing is the easy way to do it. We've been doing it for 60 years.
Why not keep doing it? And so accelerated computing was only possible because we deliver such extraordinary speedups, at a time when energy is becoming more scarce and we can no longer just ride the CPU curve. Dennard scaling has really ended. And so we need another approach, and that's why we're here. But notice, every single time we want to open up a new market, like cuDF in order to do data processing: data processing is probably, what, a third of the world's computing. Every company does data processing, and most companies' data is in data frames, in tabular format.
And so creating an acceleration library for tabular formats was insanely hard, because what's inside those tables could be floating-point numbers, 64-bit integers, numbers and letters and all kinds of stuff. And so we had to figure out a way to compute all of that. And so you see that almost every single time we want to grow into something, we have to go and learn it.
That's the reason why we're working on robotics; that's the reason why we're working on autonomous vehicles: to understand the algorithms necessary to open up that market and to understand the computing layer underneath it, so that we can deliver extraordinary results. And so, as you can see, each time we open up a new market, health care, digital biology, the amazing work we're doing there with BioNeMo and Parabricks for gene sequencing, every single time we open up a new market, it requires us to reinvent everything about that computing. And so there's nothing easy about it. Generative AI takes a lot of energy.
I'm just saying my job is super hard. But your assistants, your AI assistants, are going to make it easier, right? What's that? Somebody's got to pat my back.
Hey, a little applause you guys. Cheer him on. Yeah. Go ahead.
Let's talk about energy. Yeah. Generative AI, incredibly energy intensive.
I am going to read from my note cards here. According to some research, a single ChatGPT query takes nearly ten times the electricity of a single Google search. Data centers consume 1 to 2 percent of overall worldwide energy, but some say it could be as much as 3 to 4 percent, and some say as much as 6 percent by the end of the decade. Data center workloads tripled between 2015 and 2019, and that was only through 2019. Generative AI is taking up a large portion of all of that. Is there going to be enough energy to fulfill the demand of what you want to build and do? Yeah. Yes.
A couple of observations. First, there are only a handful of model makers pushing the frontier. A couple of years ago there were two or three, or three or four; there are probably three times that many this year, but it's still single digit, very high single digit, call it ten, that are pushing the frontiers of models. And the size of the models is, call it, twice as large every year, maybe faster than that. And in order to train a model that's twice as large, you need more than twice as much data.
And so the computational load is growing, probably, call it, by a factor of four each year, just by simple thinking. Now, that's one of the reasons why Blackwell is so highly anticipated: because we accelerate the application so much using the same amount of energy. And so this is an example of accelerating applications at constant energy and constant cost. You're making it cheaper and cheaper and cheaper.
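A quick back-of-the-envelope in Python for the scaling arithmetic Huang sketches: if model size roughly doubles each year and the training data grows at least as fast, training compute (roughly parameters times tokens) grows by about 4x per year. The growth factors are placeholders taken from his rough figures; only the ratios matter.

```python
# Rough scaling arithmetic: training compute ~ parameters x tokens.
params_growth = 2.0   # model roughly twice as large each year
tokens_growth = 2.0   # data grows at least as fast (assumption)

compute_growth_per_year = params_growth * tokens_growth
print(f"compute growth per year: ~{compute_growth_per_year:.0f}x")       # ~4x

years = 3
print(f"over {years} years: ~{compute_growth_per_year ** years:.0f}x")   # ~64x
```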
Now, the important thing, though, is that I've only highlighted ten companies. The world has tons of companies, there are data centers everywhere, and NVIDIA is selling GPUs to a whole lot of companies in a whole lot of different data centers. So the question is, what's happening? At the core, the first thing that's actually happening is the end of CPU scaling and the beginning of accelerated computing: data processing, text completion, speech recognition, recommender systems, all of those kinds of basic AI things that are used in data centers all over the world. Everyone is moving from CPUs to accelerated computing because they want to save energy.
Accelerated computing helps you save so much energy, 20 times, 50 times, doing the same processing. So the first thing that we have to do as a society is accelerate every application we can. If you're doing Spark data processing, run it with accelerated Spark so that you can reduce the amount of energy necessary by 20 times. If you're doing SQL processing, do accelerated SQL so that you can reduce the power by 20 times. If you're doing weather simulation, accelerate it.
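For the accelerated Spark case Huang mentions, the open-source RAPIDS Accelerator for Apache Spark is typically enabled through configuration rather than code changes. A hedged PySpark sketch of what that looks like; the jar path and dataset are placeholders, and exact options vary by version and cluster.

```python
from pyspark.sql import SparkSession

# Enable the RAPIDS Accelerator plugin so supported SQL/DataFrame operators run on the GPU.
# The jar path below is a placeholder; point it at the rapids-4-spark jar for your version.
spark = (
    SparkSession.builder
    .appName("gpu-accelerated-etl")
    .config("spark.jars", "/opt/sparkRapidsPlugin/rapids-4-spark.jar")
    .config("spark.plugins", "com.nvidia.spark.SQLPlugin")
    .config("spark.rapids.sql.enabled", "true")
    .getOrCreate()
)

# Unchanged Spark code; eligible operations are transparently executed on the GPU.
df = spark.read.parquet("/data/transactions")   # hypothetical dataset
df.groupBy("store").sum("amount").show()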
Whatever scientific simulation you're doing, accelerate it. Image processing, accelerate it. A lot of those applications used to run on CPUs, in general-purpose computing.
All of that should be accelerated going forward. That's the first thing that's happening. Now, is that reducing the amount of energy being used all over the world? Absolutely. The density of our GPUs, the density of accelerated computing, is higher, the energy density is higher, but the amount of energy used is dramatically lower. So that's the first thing that's happening.
Of course, then there's generative AI. Generative AI is probably consuming, let's pick a very large number, probably 1 percent or so of the world's energy. But remember, even if data centers consume 4 percent of the world's energy, the goal of generative AI is not training. The goal of generative AI is inference. And with inference, ideally, we create new models for predicting weather and predicting new materials, models that allow us to optimize our supply chain and reduce the amount of energy consumed and the gasoline wasted as we deliver products. And so the goal is actually to reduce the energy consumed by the other 96 percent.
And so, very importantly, you have to think about generative- about AI from a longitudinal perspective, not just going to school, but what happens after going to school. You and I both went to Stanford. Stanford is not inexpensive. I think you studied something slightly different, though.
Yeah. Yeah, sure. It's a big school. It's worked out well for you. It's worked out well for both of us.
And so the goal, of course: going to school is important, but the important thing is really after school and all of the contributions that we're able to make to society. So generative AI is going to increase productivity, enable us to discover new science, and make things more energy efficient. Don't let me finish without showing you the next- And so, that accelerated computing- The lights just came on because we were talking about energy, and all of a sudden it's like the Earth was like, okay, tamp down the energy usage, folks. I thought they were- Am I getting chased out? I think we still have a few minutes.
I mean, I hope so. I mean, I'm not going to get off the stage until Mark Zuckerberg comes on here and kicks me off. How about that? He's not going to do that. He's a great guy. So anyway, think about generative AI longitudinally and all the impact of generative AI. The next thing I'll say about generative AI is: remember, the traditional way of doing computing is called retrieval-based computing. Everything is prerecorded.
All the stories are written and prerecorded, all the images are prerecorded, all the videos are prerecorded, and everything is stored off in a data center somewhere. Generative AI reduces the amount of energy necessary to go to a data center over the network, retrieve something, and bring it back over the network. Don't forget, the data center is not the only place that consumes energy. The world's data centers, that is only 40 percent of the total computing done.
60 percent of the energy is consumed on the internet, moving the electrons around, moving the bits and bytes around. And so generative AI is going to reduce the amount of energy used on the internet, because instead of having to go retrieve the information, we can generate it right there on the spot, because we understand the context. We probably have some content already on the device, and we can generate the response so that you don't have to go retrieve it somewhere else. Well, part of that is also moving atoms around, right? One last thing, one last thing, remember: AI doesn't care where it goes to school.
Today's data centers are built near the power grid, where society is, of course, because that's where we need them. In the future, you're going to see data centers being built in different parts of the world where there's excess energy; it just costs a lot of money to bring that energy to society. Maybe it's in a desert, maybe it's in places that have a lot of sustainable energy, but it's just not- They're already taking up a lot of water. Well, there's plenty of water as well.
It just happens to be undrinkable water. And so we can use that water; we can put data centers where there's less population and more energy. Just don't forget that there's a lot of energy coming from the sun. There's a lot of energy in the world.
And what we need to do is move data centers closer to where there's excess energy and not put everything near population. AI doesn't care where it's trained, but I'd never heard that phrase before: AI doesn't care where it goes to school. That's interesting. Yeah. It's true. I'm going to think on that. Part of calculating carbon emissions, though, is also considering the supply chain.
It goes all down the line, and it requires transparency. Don't move the energy to the data center; use the energy where the data center is. And then, when you're done, you have a highly compressed model that is essentially the compression of all the energy that was used, and we can take that model and bring it back. Hey, can we talk about the next wave? The first wave, of course, is accelerated computing.
I know that she's the interviewer and we're doing this on her terms. But I'm the CEO, so... And so, no, Lauren, we need to tell this group about the work that we're doing that is really, really core to- I have so many good questions for you.
I know, I know. I want to ask you about open source, which I think you're going to be talking to Mark about? I want to ask you about... By the way, open source is really important. It's incredible. Yeah. If not for open source, how would all of these industries and all these companies be able to engage with AI? And so look at all the companies in all the different industries.
They're all using Llama. Llama 2 today. Llama 3.1 just came out. People are so excited about it.
We've made it possible to democratize AI and engage every single industry in AI. But the thing that I want to say is this: the first wave is accelerated computing, which reduces the energy consumed and allows us to deliver continued computational demand without all of the power continuing to grow with it. So, number one, accelerate everything. That made it possible for us to have generative AI, and the first wave of generative AI, of course, is all the pioneers. We know many of the pioneers: OpenAI, Anthropic, Google, Microsoft, a whole bunch of amazing companies doing this.
X is doing this, xAI is doing this, amazing companies doing this. The next wave of AI, which we didn't talk about, is enterprise, and of course one of its applications is customer service.
And we hope that we can give every single organization the ability to create their own AIs, so that everybody is augmented and has a collaborative AI that empowers them, helping them do better work. The next wave of AI after that is called physical AI.
And this is really quite extraordinary. This is where we're going to need three computers: one computer to create the AI; another computer to simulate the AI, both for synthetic data generation and as a place where the AI robot, the humanoid robot or the manipulation robot, can go learn and refine its AI; and then, of course, the third is the computer that actually runs the AI. So it's a three-computer problem.
You know, it's a three-body problem. And so it's incredibly complex, incredibly complicated.
And we created three computers to do that. And we made a video for you to enjoy, to understand this. The thing that we've done here is this: with each one of these computers, it depends on whether you want to use the software stack, the algorithms on top, or just the computing infrastructure, or just the processor for the robot, or the functional safety operating system that runs on top of it, or the AI and computer vision models that run on top of that, or just the computer itself.
Any piece, any layer of that stack, is open for robotics developers. We created a quick video. Let's take a look at that. Is that okay? That sounds great. The era of physical AI is here. Physical AI, models that can understand and interact with the physical world, will embody robots.
Many will be humanoid robots. Developing these advanced robots is complex, requiring vast amounts of data and workload orchestration across diverse computing infrastructures. NVIDIA is working to simplify and accelerate developer workflows with three computing platforms: NVIDIA AI, Omniverse, and Jetson Thor, plus generative AI-enabled developer tools.
To accelerate Project GR00T, a general humanoid robot foundation model, NVIDIA researchers capture human demonstrations, seeing the robot's hands in spatial overlay over the physical world. They then use RoboCasa, a generative simulation framework integrated into NVIDIA Isaac Lab, to produce a massive number of environments and layouts. They increase their data size using the MimicGen NIM, which helps them generate large-scale synthetic motion data sets based on the small number of original captures.
They train the GR00T model on NVIDIA DGX Cloud with the combined real and synthetic data sets. Next, they perform software-in-the-loop testing in Isaac Sim in the cloud and hardware-in-the-loop validation on Jetson Thor before deploying the improved model to the real robots. The NVIDIA OSMO robotics cloud compute orchestration service manages job assignment and scaling across distributed resources throughout the workflow. Together, these computing platforms are empowering developers worldwide to bring us into the age of physical, AI-powered humanoid robots.
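The narration describes a three-computer workflow: generate and augment data in simulation, train in the cloud, validate in the loop, then deploy to the robot. Below is a heavily hedged Python pseudocode sketch of that flow; every function is a hypothetical stub, not the Isaac Lab, MimicGen, DGX Cloud, or OSMO APIs.

```python
# Hypothetical stubs sketching the three-computer robot workflow described above.

def capture_human_demos(n: int) -> list[dict]:
    """The real world: a small set of captured human demonstrations."""
    return [{"demo_id": i, "trajectory": [0.0, 0.1, 0.2]} for i in range(n)]

def generate_synthetic_demos(demos: list[dict], multiplier: int) -> list[dict]:
    """Simulation computer: expand the captures into a large synthetic motion data set."""
    return [dict(d, synthetic=True) for d in demos for _ in range(multiplier)]

def train_foundation_model(data: list[dict]) -> dict:
    """Training computer: fit a policy on the combined real and synthetic data."""
    return {"policy": "weights", "trained_on": len(data)}

def validate_in_sim(model: dict) -> bool:
    """Software-in-the-loop check back in simulation before touching hardware."""
    return model["trained_on"] > 0

def deploy_to_robot(model: dict) -> None:
    """Onboard robot computer: run the validated model on the real robot."""
    print(f"deploying model trained on {model['trained_on']} episodes")

real = capture_human_demos(10)
data = real + generate_synthetic_demos(real, multiplier=100)
model = train_foundation_model(data)
if validate_in_sim(model):
    deploy_to_robot(model)
```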
You know what's amazing, Lauren? At this conference, Siggraph, is where all of this technology comes together. Isn't that right, everybody? Researchers of Siggraph, isn't that right? So whether it's computer graphics or simulation, artificial intelligence, robotics, all of it comes together right here at Siggraph. And that's the reason why I think you should come to Siggraph from now on.
Me? Yes! I'm happy to. I'm thrilled to. Am I right? Everybody? 100% of the world's tech press should come to Siggraph. We can get behind that. Just drink a lot of water. I went and saw some of the art exhibits last night upstairs in the exhibition. Fantastic.
Just really, really cool. I loved the literal spam bots. Whoever created that one, go check it out. I was actually listening to the Siggraph spotlight podcast before this.
If folks haven't listened, I really recommend it. The Special Projects chair was interviewing a couple of graphics legends, including David Em. And one of the things that David Em talked about was archives.
And this is kind of an existential question for this crowd, right? But people are creating this really amazing digital media, all these computer graphics. You are accelerating it with your technology. It changes so fast now. How do you ensure that everything folks are building lives into the future? File formats, archives, accessing all of this work in the future? The robots will live on. Yep, I have no concern they're going to take over. Yeah, right. Yeah.
What about the art that people are creating? This is the existential question. Well, that's an excellent question. One of the formats that we deeply believe in is OpenUSD. OpenUSD is the first format that brings together multimodality from almost every single tool and allows it to interact, to be composed together, to go in and out of these virtual worlds.
And so you can bring just about any format into it, ideally over time. At this conference, we announced that URDF, the Unified Robot Description Format used in robotics, can now be ingested into OpenUSD. And so, one format after another, we're going to bring everything into this one common language. Using standards is one of the best ways to allow content and data to be shared, to allow everybody to collaborate on it, and to let it live forever. For example, HTML. Without HTML, it would have been hard for all of this
different content from around the world to be accessible to everybody. And so, in a lot of ways, OpenUSD is the HTML of virtual worlds. We've been an early promoter of it. There are amazing companies that have joined and many other companies joining. And my expectation is that every single design tool in the world will be able to connect to OpenUSD. And once you connect to that virtual world, you can collaborate with anybody, with any other tool, anywhere.
And so, just like we did with HTML. You said this content can live forever. Are you going to build a Jensen AI that lives forever? Absolutely.
There's a Jensen AI. In fact, just about everything that I've ever said, everything that I've ever written and ever done, will likely be ingested into one of these generative AI models. And I'm hoping that happens. And then, in the future, you'll be able to prompt it, and hopefully something smart gets said.
So Jensen AI is going to be running your earnings calls in the future. I hope so. That's the first thing that has to go. That's the first thing that has to go to a bot.
Jensen thank you so much. I think we're probably going to get kicked off stage soon, but you'll be back shortly with Mark Zuckerberg. Yes. And welcome to your first Siggraph.
Ladies and gentlemen. Lauren Goode. Thank you. It's really great chatting again. Thank you everybody. We'll be right back.