Meteor Lake: Built-In Intel Arc Graphics Deep Dive | Talking Tech | Intel Technology
- Hey everybody. Today we're gonna be talking about Meteor Lake, and most importantly, we're gonna be talking about Built-in Arc Graphics. (driving electronic music) Today we have an expert with us. We have Tap. Hi Tap. How's it going? - Hey.
Good to see you, Alex. - Good to see you too. Thank you so much for being here with us. And today we're gonna be talking about Arc Graphics. - Of course. Graphics is exciting stuff
- And we are very excited 'cause, you know, we're gonna have built-in Arc graphics in Meteor Lake. - Yes. - But we're just focusing on graphics and wanna know everything about it today. - Yeah, well, I mean obviously we've made a huge upgrade with Meteor Lake graphics.
I mean, doubling performance, 2x in one generation. So everybody's pretty excited. - Yeah, that's awesome. So yeah, that's what I wanted to ask you.
So what are kind of the goals set for Meteor Lake and also for Arc Graphics? - Well, yeah, I mean, many people are familiar that we've been doing Arc Graphics discrete GPUs for some time now, and we are busily improving that every day. Our drivers are getting more and more performant, our game compatibility's getting better and better. And I'd say we're actually really starting to participate in a significant way in the discrete gaming market.
And we've taken all of that technology now and brought it to integrated. A lot of people don't know, but Intel is actually the largest provider of graphics in the world and it's because of our market share and our capacity on integrated. - Yes, so all the built in graphics that we have in our processors, that's why. - Absolutely. Absolutely. And for the first time it's gonna be Arc Graphics inside of our CPU.
So you'll get all of the driver work, modern graphics, DX12 support, even Ultimate, DX12 Ultimate support. So that's things like ray tracing and all the rest of it. - Wow. - So it's a pretty big step. - That's pretty cool.
Okay, so let's dive into it a little bit more. So let's talk about the architecture and how it's disaggregated. So tell us how you've been doing that.
- Yeah, Well hopefully people have heard about Meteor Lake using a disaggregated strategy. That means there's multiple different chips that all come together onto one substrate to build the device. Now what's cool about that is really two different things. One, you can actually turn on and off different chips optimizing for power. And the other thing you can do is you can have different processes per chip optimized for the performance needs of that silicon. So for example, the CPU core chip can actually have a different process from the graphics chip, which can be a different process from the IO chip.
And that's what we're doing on Meteor Lake. Now, the cool part is the graphics gets its own block. So depending on the market segment, you can actually have different tiles that you assemble to create market specific variation. So it's a whole new world enabled by disaggregation and we're just at the beginning.
- That's pretty cool. So depending on which, I guess, market you're serving, you can change the graphics tile and provide a completely different set of products. - Yeah, I mean, disaggregation has the potential for, sort of, micro-targeting of silicon. Now again, in Meteor Lake we're not gonna be going crazy, right? This is our first step, but I really do like the way the architecture opens us up for optimization on the power side and also on the performance side. - That's great, so how is this disaggregation working for graphics? How are the blocks split up? - Well, I would say if you start at the beginning, there's a dedicated tile for the compute graphics, so that's what we call the graphics tile.
Our IO is in the IO interface tile. And then our Media blocks are actually in the SOC tile. So like instead of just having one space, one spot, that turns on for all graphics and media, we can now spread it around the chip.
What that does for us is primarily around media playback, which is something obviously everybody does on these notebooks. With media playback, we can keep the graphics tile completely off and only energize that media decode block for playback of any kind of content. And we don't have to spend the extra watts powering a GPU that's not doing anything.
- So that media block, the media engine, it's in the SOC, right? - Yup. - System On Chip? - Yup. It's right in the middle of the SOC. And so we'll power up a little co-processor, another CPU that's actually built on that die, and that kind of does the control plane; the media block does the heavy lifting for encoding and decoding, and at the end of the day you get very low power, high quality playback. - So for example, if you're doing playback of a video, it will just be using the media engine. - Yeah, and everything else is completely off.
The CPU tile's off, the GPU tile's off. And you're just powering up that SOC die and the IO. - No, that's great. Well, do you know if the low power cores will be engaged in that? - They won't turn on at all. There's actually a dedicated low power core inside of the media block, inside of that SOC tile, different from the compute tile. So it's actually a totally new architecture optimized for media playback.
- So we have all these blocks, right? And like you said before, we have 'em all together. - Yeah. - And I think you already covered a little bit, but why separate them? - Why separate them? Well, again, I think there's two main reasons. One is you wanna optimize the process for the compute requirements of a particular block. So for high performance CPU cores, you're gonna use the latest, greatest, fastest transistors you can get, and that's a special process. For, say, the SOC that's running at a lower bandwidth, you can use an older process, a more power efficient process.
And for the GPU we can actually use something that's in the middle 'cause we're mostly a computation die. And the IO is, again, a targeted process. That's one reason you disaggregate. The second reason, of course, is you can now turn on and off different parts, so you can power optimize your design based on the use case that's happening in real time.
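The power-gating benefit Tap describes can be put in a toy model. This is purely illustrative; the tile names are from the conversation, but the wattages and the idea of summing them this way are invented for the sketch, not Intel specifications:

```python
# Toy model of per-tile power gating in a disaggregated SoC (illustrative
# only; the wattages below are made-up numbers, not Intel specifications).

TILE_POWER_W = {            # hypothetical active power per tile
    "cpu": 9.0,
    "graphics": 12.0,
    "soc": 1.5,             # hosts the media engine
    "io": 0.5,
}

def package_power(active_tiles):
    """Sum the power of only the tiles a workload actually needs."""
    return sum(TILE_POWER_W[t] for t in active_tiles)

# Video playback: only the SOC (media engine) and IO tiles stay energized.
playback = package_power({"soc", "io"})                   # 2.0 W
# Gaming: CPU, graphics, SOC and IO tiles are all on.
gaming = package_power({"cpu", "graphics", "soc", "io"})  # 23.0 W
```

The point of the sketch is the shape of the saving: a monolithic die would pay something closer to the "gaming" figure even for playback, while per-tile gating pays only for what's in use.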
- Going back to the media engine, what type of codecs can we actually encode and decode, and what are they used for? - Yeah, well there's a ton of codecs. I can't list them all, but the way to think about codecs, if you're not familiar, is there's like, when you do media playback, it's captured with a camera and it's then encoded into some format, and you can think of that as the data format of a movie or whatever, and then that's transported and read by the GPU. So the first thing that happens is we decode whatever format is coming in, and then we can usually transcode it or we can just send it out to display. So the codecs that we support, for example, on Meteor Lake include AV1. And AV1 is new, it actually allows us to do both better encoding and better decoding, supporting lower bit rates or higher quality. So AV1, a big new thing with Meteor Lake.
- Yeah, that's pretty exciting. So now we're gonna be able to encode and decode with AV1. - Yeah, of course. - That's great. - Yeah, Meteor Lake
is a media monster. So if you have a high performance notebook and you're doing creator stuff or you're doing content consumption, playing movies or streaming, it is going to be cool and quiet. - That's pretty amazing. All right, so moving on. So we have the graphics tile, which we'll go back to. And then we have the media, for media playback. And we also have the display- - There's a display block, there's a media block, and there's a graphics block.
- Yeah, so why don't you tell us a little bit more about that, the display engine? - Sure, the way I think about the display engine, its primary job is to take images that are generated by either the media block or the compute block and get 'em to the screen, and it's gonna do that using standardized interfaces. Usually it's either some variant of HDMI or DisplayPort. On Meteor Lake we support HDMI 2.1, which is really cool because that allows us to support variable refresh rate natively. If people know me, they know I'm pretty excited about variable refresh rate technology. And that's now native on our integrated graphics.
We also support DP 1.4, which is actually gonna allow really high performance, high bandwidth on great looking monitors. I think we actually support up to 8K 60. So if you're doing 8K displays, we can actually display onto that at 60 hertz.
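A quick back-of-envelope calculation shows why 8K 60 Hz is such a demanding target. The arithmetic below assumes 24 bits per pixel and ignores blanking intervals and link-layer overhead, so it understates the real requirement if anything:

```python
# Rough bandwidth math for uncompressed 8K 60 Hz video (24 bits per pixel;
# ignores blanking intervals and link-layer overhead).

width, height, refresh_hz, bits_per_pixel = 7680, 4320, 60, 24

bits_per_second = width * height * refresh_hz * bits_per_pixel
gbps = bits_per_second / 1e9
# ~47.8 Gbit/s of raw pixel data -- well beyond the roughly 25.9 Gbit/s of
# effective payload a DP 1.4 link carries, which is why Display Stream
# Compression (DSC) is needed to drive 8K60 over DP 1.4.
```

So "8K at 60 hertz" is not just a big number: even before overhead, the raw pixel stream is roughly double what the link can carry uncompressed.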
- Wow, that's pretty cool. - Yeah, it takes a lot of design, the engineers have been hard at work, high bandwidth interfaces, lots of correct DMA management and buffering, but we can actually get it all done. - Yeah, not long ago, just this doing playback of 8K was- - Crazy. - It's a little crazy.
- Yeah, and now 8K at 60 hertz is actually really impressive for a very light and low power device. - Yes, yes, yeah, 'cause these are gonna be mobile devices. Yeah, it's really pretty amazing. So, going back to the media engine, what about the different blocks that constitute the media engine? - Oh, the media engine.
Well, it's actually pretty interesting because almost everybody knows that the media engine's responsible for decoding something and then it's gonna send it out and show it on the screen through the display engine. But most people don't know that it's doing these complex mathematics. So there's an entire art for encoding and decoding, and most of it's around things like spatial, where you're looking for patterns in an image and you're gonna compress those patterns. But it also does temporal where it's looking, from frame to frame, it's trying to find common patches so it can compress those as well.
So the media engine has just evolved over decades. And its incredibly complicated algorithms are optimizing, think of it as compression, as you're taking in a lot of raw data and looking for patterns and squeezing it down. Now it does exactly the opposite on decompression. It uses these algorithms that it has to unroll. So it's looking both temporally and spatially, reconstructing high quality images. - So those kinds of things that you were talking about, that compression, decompression, for example,
like fast Fourier transforms and sines and... - DFTs, yeah. - Yeah, exactly. - DCTs, all of those algorithms are just different ways to find these pixels. And you can think of it as there's really multiple different characteristics that you can compress over. As I mentioned, the spatial compression, you can do temporal compression, you can even do frequency space compression.
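The DCTs mentioned here are the heart of spatial compression, and the idea fits in a few lines. This is a naive one-dimensional DCT-II sketch of the math only; real codecs use optimized 2-D transforms plus quantization and entropy coding on top:

```python
import math

def dct2(signal):
    """Naive DCT-II: express a signal as a sum of cosine frequencies."""
    n = len(signal)
    return [
        sum(x * math.cos(math.pi * k * (2 * i + 1) / (2 * n))
            for i, x in enumerate(signal))
        for k in range(n)
    ]

# A smooth 8-sample row of pixel values, like a gradient in an image...
row = [100, 102, 104, 106, 108, 110, 112, 114]
coeffs = dct2(row)
# ...concentrates nearly all its energy in the first couple of coefficients.
# A codec can quantize the tiny high-frequency terms toward zero and store
# far fewer numbers. That is spatial compression in miniature.
```

Running this, the first coefficient carries the average brightness, the second the gradient, and everything after that is near zero, which is exactly what makes smooth image patches cheap to store.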
So it's an artwork that every time I talk to the media guys, I'm blown away by: super scientists thinking about mathematical algorithms whose only goal in life is to compress pixels. - It just blows my mind. - Yeah, it's practically rocket science. I talked to them a little bit before Meteor Lake Day, just to say, "How would you talk about this?" And obviously, when you talk to engineers, they're excited to talk about the technology, but they're also nervous because this is like really hard.
It's really super secret. We don't wanna give out too many details, but I think if you look at Meteor Lake Day, we hit a nice balance in the technical content. We can't give away the whole kitchen, all the secret sauce, but we do want people to understand that this is rocket science. But all it does is compress images so you can save bandwidth, and then it expands them to give high quality images.
- Yeah, it is pretty impressive what you say, 'cause when I started researching and learning about Meteor Lake in order to come interview you and also the other fellows, it was like, wow, this is just completely another level- - Oh yeah. - Of engineering. - I think that you'll see on the graphics side, it's also similarly ridiculous, but I find the media blocks to be particularly like the unsung hero, because it's been there for so long, it's been doing its job so well, you just assume it's always gonna work and it's always gonna get better.
And that's why AV1 is so amazing, because this is a technology that people have been working on for 30 years. And to be able to make it dramatically better after 30 years of development is kind of shocking. And that's why I give those guys a solid handshake when I see 'em, because they're able to find new techniques that are extracting better pixels, better images, while at the same time lowering bandwidth. - Yes, it's amazing.
And also, when you were doing the presentation back at Technology Day in Malaysia, another light bulb went off for me. It was like, "Yeah, this is pretty similar to the neural processing unit that we have. There's a bunch of blocks that are very, very similar." I was like, "Oh, so that's kind of why."
- Yeah, there are. There's a lot of similarity between, say, what you might do in a computational flow for neural processing, neural networking, or media. I mean, it's just lots of math, lots of really high level thought about compression, and that kind of moves forward through both domains. - All right, so we have talked about the media engine and also the display engine, so we're left with the graphics tile. - Yeah. - That is very exciting, but I'll let you do the honors.
- Well, I'd start with the big things that we're doing there, right? We've done two main things. We've first taken all the goodness that is Arc that we've been working on for the last five years, which is supporting modern APIs, DX12, DX11, DX9, new features like ray tracing, VRS, other technologies, and brought all of that development, that driver work, that game compatibility work, to integrated, which is really, really cool. The second thing we're doing is improving the perf per dollar and the perf per watt. And we're also improving the performance. So think of it as we're building a bigger, faster engine that is also more modern, using all the modern APIs and all the modern technologies. So that's a combination of silicon technology and architecture.
- Okay, so those changes. I've also seen that you guys have made a couple of changes to make it better, like higher clock frequencies and also architecture efficiency. - Yeah, so higher clocks come from two different aspects. First, we've changed the process. So we're using a more modern silicon process versus our last generation. But the second thing is our engineers have re-pipelined the entire architecture, so it's running faster at every voltage.
So a combination of tweaking the design to get faster at every voltage and getting a better process gives us higher frequencies, which is directly related to why we get higher performance and why we get higher performance per watt. The second part is the features and technologies, and that's, again, using all the technology that came from our Xe-HPG architecture and now building that into our Xe-LPG architecture. - Wow, that's awesome.
I also kind of heard through the grapevine that there was some machine learning used to help you guys design this. - Yeah, this is the first time that we at Intel have actually used machine learning to do placement and routing. So think of it as you're now using AI to help develop the chips that are gonna run AI. It's getting to the point where the gap between engineering and AI is shrinking, and we're learning quickly how we as engineers can use AI to deliver better products more quickly.
- So layout design is just getting a lot better and more efficient. - Oh, yeah, but it's not just layout. I mean, AI is being applied to all aspects of chip design, all aspects of software design, and it's accelerating.
So think of it as starting with things that are usually very manual and very painful, like layout. I mean, there's automatic place and route tools, but they typically need a lot of manual intervention to get just right. Now AI is delivering better results, quicker results, with less human interaction. - So you also have a larger GPU configuration, right? So the building blocks in there are bigger. - Yeah.
- So what does that consist of? - Well, what's the growth versus the prior generation? The way I think about it is it's about 30% bigger than prior generations. That means that we've dedicated more area to getting higher performance. And so it's more EUs, it's more vector units, but more importantly, it's now modernized.
It's all using our Xe cores, which are borrowed from our Xe-HPG architecture that's present in our Alchemist products. So we've taken all of that new optimized vector engine and moved it into the Meteor Lake graphics blocks. - So I'm gonna confess, I don't know very well what a vector engine is. - Oh, okay. - Can you help me with that? - I can, I can.
So what's cool about graphics is it is embarrassingly parallel. An old boss of mine used to use that term, right? And what that means is that you're doing the same computation over and over and over with different data. And you can think about it as primarily we're trying to calculate the color of a pixel, but you have lots of pixels.
So really all we need to do is the same computation many, many times with different data. Okay? That's a vector. So think of it as that long collection of pixels as a vector.
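The one-instruction, many-data idea can be sketched in a few lines. This is a toy Python model with made-up pixel numbers; a real vector engine issues this across hardware lanes in a single instruction rather than looping:

```python
# Toy model of SIMD execution: one instruction applied to many data lanes.
# (Illustrative only; real vector engines do this in hardware in parallel.)

def vector_mul_add(a, b, c):
    """Apply the same multiply-add instruction to every lane of the vector."""
    return [ai * bi + ci for ai, bi, ci in zip(a, b, c)]

# Shading 8 pixels at once: identical math, different per-pixel data.
base_color   = [10, 20, 30, 40, 50, 60, 70, 80]
light_factor = [2] * 8
ambient      = [5] * 8
shaded = vector_mul_add(base_color, light_factor, ambient)
# shaded == [25, 45, 65, 85, 105, 125, 145, 165]
```

Every lane ran the same multiply-add; only the data differed, which is exactly what makes pixel shading "embarrassingly parallel."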
And the instructions are the same for every computation, you're just using different data. So a vector engine takes one instruction and then does many different slices of computation that are all using the same instruction, but different data. It gets more complicated with AI. So you can think about it, we also have something, well, actually DP4a is a way to think about it. We have a vector lane, which is normally 32 bits wide, but AI doesn't normally need 32-bit floating point.
You can actually do it with a much smaller piece of data, which is called integer eight, int8. It's just eight bits. So, using this instruction called DP4a, we can actually slice up the vector engine into four chunks, each eight bits, which dramatically improves our AI performance by 4x. - Oh wow.
So DP4a is an instruction that is usually used for AI - Yup. - that doesn't need that much accuracy, therefore you can chop it into- - Yeah, you don't need 32-bit precision. A lot of AI is this very approximate calculation, and you don't really know if you're right. You don't really know if you're wrong. So being a little less right maybe is not that important, and you can normally use much less precision in your calculation.
And that's what DP4a is all about. It allows us to chop up that 32 bit vector into eight bit chunks, improving performance by 4x. - Okay so ray tracing. - Yeah.
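Here's a rough software emulation of what a DP4a-style instruction computes: treat a 32-bit word as four signed 8-bit lanes, multiply lane-wise against another word, and accumulate into a running 32-bit sum. The helper names are invented for illustration, and real hardware also wraps the accumulator at 32 bits, which this sketch doesn't model:

```python
import struct

def pack_int8x4(lanes):
    """Pack four signed 8-bit values into one 32-bit word."""
    return struct.unpack("<i", struct.pack("4b", *lanes))[0]

def dp4a(a_packed, b_packed, acc):
    """Emulate a DP4a-style op: unpack each 32-bit word into four signed
    8-bit lanes, multiply lane-wise, and add the products to an accumulator."""
    a_lanes = struct.unpack("4b", struct.pack("<i", a_packed))
    b_lanes = struct.unpack("4b", struct.pack("<i", b_packed))
    return acc + sum(x * y for x, y in zip(a_lanes, b_lanes))

a = pack_int8x4([1, 2, 3, -4])
b = pack_int8x4([5, 6, 7, 8])
result = dp4a(a, b, 100)   # 100 + (5 + 12 + 21 - 32) = 106
```

One instruction now does four multiplies plus an accumulate instead of one, which is where the 4x throughput claim for int8 AI workloads comes from.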
- Ray tracing? - Ray tracing is cool. Well, we are obviously bringing ray tracing to Meteor Lake, a first time on integrated for us. And that's a brand new technology. And the way to think about ray tracing is it's not traditional. You're not doing like a vector operation. You have this problem, which is called a BVH traversal.
And BVH stands for bounding volume hierarchy, which is how you describe nests of things. Think about those Russian nesting dolls, where you kind of put one inside the other, you know. And your job is to find the piece of geometry that a light source hits. And rather than saying, "Hey, does this light source hit every triangle?" That's very expensive, 'cause you gotta search all triangles. You build boxes around groups of triangles.
And so the first step is, "Hey, which box did I hit?" And then within that box, "Which sub box did I hit?" And you're going through this hierarchy of boxes trying to get down to the leaf, which is geometry. And the reason that's there, it's called an acceleration structure. And what that allows us to do is many rays can be tested at the same time, eventually finding these geometries so that we can determine the color of a pixel. - So you have many rays coming in at the same time, just to figure out which one is the one that you're actually using, and then- - Well, the way to think about it is we have, in the model that defines the game, so there's a model that says there's a light source here and there's geometry over there. What you might wanna do is ask, "What geometry does this light source illuminate?" And so what you would do is you would send a ray in some direction in 3D space, and you'd go through these bounding volume hierarchies, and you're going through this tree trying to get down to a leaf, which is a geometry. And once you get to that geometry, you do another calculation that says, what color should this geometry be based on that light? So imagine doing that billions of times or millions of times.
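The box-then-sub-box search described above can be sketched as a tiny recursive traversal. This is a minimal, hypothetical BVH with hand-placed boxes and triangle names; it shows the pruning idea, not anything about the actual hardware implementation:

```python
# Minimal BVH traversal sketch: test the ray against a node's bounding box
# first, and only descend into children (or report leaf geometry) on a hit.
# (Illustrative only; boxes and triangle names are invented for the example.)

def ray_hits_box(origin, direction, box_min, box_max):
    """Slab test: the ray hits the box if it enters every axis interval
    before it exits any of them."""
    t_near, t_far = 0.0, float("inf")
    for o, d, lo, hi in zip(origin, direction, box_min, box_max):
        if d == 0.0:
            if not lo <= o <= hi:
                return False          # parallel to this slab and outside it
            continue
        t0, t1 = (lo - o) / d, (hi - o) / d
        t_near = max(t_near, min(t0, t1))
        t_far = min(t_far, max(t0, t1))
    return t_near <= t_far

def traverse(node, origin, direction):
    """Walk the hierarchy of boxes down to candidate leaf geometry."""
    if not ray_hits_box(origin, direction, node["min"], node["max"]):
        return []                     # prune this entire subtree
    if "leaves" in node:
        return node["leaves"]         # triangles to actually shade
    hits = []
    for child in node["children"]:
        hits += traverse(child, origin, direction)
    return hits

bvh = {"min": (0, 0, 0), "max": (10, 10, 10), "children": [
    {"min": (0, 0, 0), "max": (4, 4, 4), "leaves": ["tri_A"]},
    {"min": (6, 6, 6), "max": (10, 10, 10), "leaves": ["tri_B"]},
]}

traverse(bvh, (1, 1, 1), (1, 0, 0))   # only tri_A's box is hit; tri_B pruned
```

The payoff is the pruning: a ray that misses a box never looks at any triangle inside it, which is what makes searching millions of triangles per ray affordable.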
And now suddenly the scene is illuminated in a realistic way, and it's illuminated 'cause you've cast many rays from each light source and they've gone all over the place. And they've now created effectively a truer representation of what light would really do in a model. That's very different from how traditional games work, which is called rasterisation. Rasterisation, you can think of it as painting a scene. You're giving instructions to an artist.
You're saying, "That's gonna be one particular color. It's gonna play some tricks with lighting, play some tricks with shadows." Ray tracing or ray casting is a very different technique that more accurately reflects what happens to a real photon, for example. - So it's pretty impressive 'cause you said these are like millions of calculations at the same time.
And to do it real time, like for in a game that requires a lot of power. - It does, and right now we're just at the beginning. So the way to think about Meteor Lake and ray tracing is it's definitely present, but games have to figure out how can they use ray tracing in a small way, right? Because Meteor Lake is not going to be designed to have billions of rays, it's smaller. And this is a big new horizon for game developers having native ray tracing on hundreds of millions of devices. We don't even know what they're gonna do.
We know, for example, that it's not just for games. Ray tracing is a technique of casting rays through any model. You can use that for simulating sound. You can use that for doing better Blender. So if you're using Blender, you can use our ray tracing hardware to get better renders inside of Blender. So it's really a very new thing and it's just enabling the software ecosystem.
- That's pretty crazy and pretty great that it's just starting. - Yeah, yeah. - There's more stuff coming up. - Oh yeah. When you talk about Intel, our strategy is deploy large volumes of hardware enabled platforms and then work with the software environment, the software ecosystem in an open way to help them leverage the platform that we've built, right? That's true for graphics, it's true for AI, and it's true for pretty much any other intel technology. - Oh, that's great.
All right, so we talked about ray tracing. There's another very cool technology in there, which is Xe Super Sampling, XeSS. - XeSS, one of my favorites. XeSS is an AI-based technology. And the idea is render at low resolution and then use an AI network to super sample it. And basically that means take a low resolution image and create a high resolution image.
But what you can do is you can train that AI network using billions of images or maybe millions of images. And what you're gonna do is it's gonna learn what does a good high resolution image look like, given a low resolution input and maybe some other data like motion vectors. And by using that AI, you can lower the power per frame to generate these high quality images.
That's what primarily we're using on Meteor Lake. You can actually save a bunch of power by rendering at low resolution and using AI to super sample. - Oh, that's, yeah, I've seen several demos though, and it is great. - It is really cool. And I think it's also just, again, the beginning, right? We're talking about what I think of as neural rendering technologies.
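To show the shape of the super-sampling flow (low-resolution frame in, high-resolution frame out), here's a deliberately naive stand-in. XeSS replaces this nearest-neighbor fill with a trained neural network fed extra inputs like motion vectors; only the input/output structure below reflects the real pipeline:

```python
# Shape of the super-sampling problem: render a small frame, reconstruct a
# big one. This stand-in uses nearest-neighbor duplication; XeSS fills the
# new pixels with a trained AI network instead (plus motion vectors).

def upscale_2x(frame):
    """Upscale a 2-D grid of pixel values by 2x in each dimension."""
    out = []
    for row in frame:
        wide = [p for p in row for _ in (0, 1)]   # duplicate each column
        out.append(wide)
        out.append(list(wide))                    # duplicate each row
    return out

low_res = [[1, 2],
           [3, 4]]
high_res = upscale_2x(low_res)
# high_res == [[1, 1, 2, 2], [1, 1, 2, 2], [3, 3, 4, 4], [3, 3, 4, 4]]
```

Rendering only a quarter of the pixels and reconstructing the rest is where the power saving comes from; the quality of the reconstruction is what the AI network is for.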
And for a platform like Meteor Lake, when we deploy a hundred million of these things over time, you're gonna see a gigantic installed base of AI enabled platforms. So what's gonna happen with all that? How are people gonna use that? What other rendering techniques are gonna evolve? Just, we'll see. - We'll find out. All right, so we've been talking about the hardware side of things, but we know that hardware without software, without drivers, without the software stack... I mean, we need both of them, so- - Yeah, well, I mean, many people probably know this already, but we have a gigantic effort on drivers, and it's because we're solving a very, very hard problem. You have to support all games over all time, DX9, 10, 11, 12, Vulkan, and future, right? And all games have different usages of those APIs, which creates this very large and complex matrix of compatibility problems, but we're investing in it. We have a team that does compatibility.
We have a team that's working on how we can develop our driver to make it more stable, given all the variation in games. And we've made dramatic improvements. Our performance is getting better every week practically, and our compatibility has really dramatically improved over the last years. And now all of that effort can actually come to integrated for the first time. - That's great, and I've seen the advancements throughout the years since you guys started, and also the team and all the effort you guys have put into it. It's been great. - Oh yeah. But you know, not only are there a lot of unsung heroes on our driver development team, but there's things that we're fixing that nobody's even aware of, right? And it's kinda where my passion lies.
I care about the experiences that we deliver. And sometimes those are not measured in just FPS, it's things like lower latency and it's smoother gameplay and better visual experiences. So I like to think about, "Hey, we're not just doing number crunching. We're delivering experiences to end users," and they have eyes, and those eyes are physiological devices that have some things they're sensitive to, some things they're less sensitive to.
So one of the things we do is optimize the experience, given the fact that it's playing to a person. And there's like lots of massive science there. I don't wanna go into too much detail about it right now, but that's where my heart is. - The other thing that I saw that was new, in one of the presentations you were doing, is endurance gaming.
- [Tap] Oh yeah. - What do we mean by endurance gaming? - Well, endurance gaming is another one of those really interesting areas of development. The idea is, I know I'm playing mobile, so I'd like to somehow balance the experience that I'm having with the fact that I wanna save battery life. So endurance gaming is all about getting the best experience that you can within a power budget, which is slightly different from how you normally think about it. Normally it's, take all of my power and get the best performance I can. But with endurance gaming, we're kind of flipping the script on that.
Gimme the best experience you can get at a particular power level. Today, it uses a combination of driver-based frame limiting. So we're gonna say don't go faster than a particular frame rate, but it also connects into our SOC.
So there's all kinds of power management happening on the CPU side that effectively cooperates with our GPU to get the right sort of experience versus power. You can almost say experience per watt, and it's very, very cool technology. - So it's not just graphics, it's a whole platform. - It's a platform optimization. And endurance gaming, I would say, is again one of those things that's gonna evolve for decades, right? It's a very hard problem.
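As a toy illustration of the endurance-gaming idea, here's a sketch that picks a frame-rate cap to fit a power budget. The numbers, the candidate caps, and the selection logic are all invented for the example; the real feature combines driver-side frame limiting with SoC power management, as described above:

```python
# Toy model of endurance-gaming style frame limiting (invented numbers and
# logic, not Intel's actual algorithm): pick the highest frame-rate cap
# whose estimated power cost still fits the user's battery budget.

ENERGY_PER_FRAME_J = 0.25          # hypothetical GPU+CPU energy per frame

def best_fps_cap(power_budget_w, candidate_caps=(30, 40, 60, 90, 120)):
    """Highest cap whose power (energy/frame x frames/sec) fits the budget."""
    affordable = [fps for fps in candidate_caps
                  if fps * ENERGY_PER_FRAME_J <= power_budget_w]
    return max(affordable) if affordable else min(candidate_caps)

best_fps_cap(15.0)   # 60 fps fits (15.0 W); 90 fps would need 22.5 W
best_fps_cap(8.0)    # 30 fps fits (7.5 W); 40 fps would need 10.0 W
```

Flipping the usual objective, the budget is fixed and the experience is maximized inside it, which is the "experience per watt" framing from the conversation.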
I expect you'll see AI deployed to that in the not too distant future, where maybe AI can do a better estimate of what particular power level the CPU should be at, what particular power level the GPU should be at. I think that's a really fertile area for future development. - Yeah, that's great. Tap, thank you so much. This has been great. I learned a lot. I really appreciate it. - Yeah, well, it's always cool to be here. - Well, thank you so much. - Alex, thanks a lot.
- Thank you. - Yeah. (driving electronic music) - As you can see, Meteor Lake, from an architecture point of view and also from a product point of view, has a lot to offer: new AI, new graphics, new process technology, and a completely new type of architecture where we go from monolithic to disaggregated, which brings a lot of new features. Thank you for watching with us, and please stay tuned for more videos that will be coming your way.
(driving electronic music) (bright chiming music)