How Quake Failed their way to Success

Show Video

Growing up, Quake was one of those games that just blew me away with what a computer was capable of. It helped inspire me to become a developer in my own right. One of the first books I ever got on coding was the Graphics Programming Black Book by Michael Abrash, although I confess that most of it was over my head at the time. Released in 1997, this book isn’t as relevant as it once was, but it holds stories and insider info about the development of one of the most groundbreaking games ever made. Today, I’d like to dive into one of those stories, but with fresh eyes after more than 20 years of additional experience. So visibility determination isn’t easy,

even today, as I showed with my recent deep dive into current methods. It remains very much an area of active research and development. Rewinding the clock back to the mid 90’s, and we go back to a time when this field was very much in its infancy, comparatively speaking. The Quake dev team faced this very problem. The game was unplayably slow,

and since they were breaking new ground, there was no clear solution to turn to. When we talk about visibility determination, for the purposes of the video, we’re talking about what goes into drawing a pixel on the screen. So you’re down here, this is the camera, and you’re looking off in this direction, so what we need to know is what’s visible to you. In modern games, this mostly boils down culling out what’s off screen, which is called frustum culling, then doing some occlusion culling to figure out what objects are behind other objects, so we can cull those as well. But back in the 90’s, this was being done in software on a desktop computer that would choke on just playing a song today, something we all take for granted as background work. Quake originally required a pentium 75, which had a processing speed of approximately 126 MIPS, or million instructions per second, according to google. Take note that this is instructions per second,

not floating point operations per second, which is what you might see on more modern hardware. By comparison, the phone in my pocket is capable of somewhere in the realm of 1142 GFLOPS, or 1.1 Million mflops, if I haven’t messed up a decimal somewhere, so the gulf in computing power, even though this is a silly comparison, simultaneously, it's’ good for younger viewers to have a grasp of just how much of a potato that thing was compared to anything you have now. Not to take anything away from what it was, in its day. The cost alone of simply rasterizing a triangle was high, so you needed to be fine grained, and only draw the polygons that were visible. There was also no gpu in any modern sense, as in a gpu that you could offload a significant amount of the work to. No depth buffer, no shaders, not even hardware T&L, like you had nothing.

So you’re working with a few more constraints than today when doing your visibility determination, you don’t want to simply discard the objects that aren’t visible, that’s still way too much. You want to get even more granular than that. In an ideal world, you’d somehow know just the polygons that are visible, that contribute actual pixels on the screen. In an ideal world. And of course, that magical test,

whatever it is, has to be fast. Quake stored its levels in a BSP, or binary space partition, which is a data structure you don’t run into much these days. Conceptually, they’re not especially difficult to grasp. Let’s take a simple 2d example, and we’ll build a little tree to illustrate what they do. So right now, the space, the level, whatever you wanna call it, it’s empty. We can add a wall, let’s call that wall A, and the wall has an

orientation, like this way is forward, and obviously the other way is backwards. The wall creates a splitting plane, so that forward, or in front of the wall, we’ll call it empty space, while behind the wall, this is solid. This is an important thing to take note of, because it’ll be very relevant throughout using the BSP. So we’ve defined the wall, let’s define a node, in this case the root of the BSP tree, we’ll call it A. This node has 2 children, the left node represents what’s in front of A, which is empty space, while the right is behind A, which we’re calling solid, this represents the region behind the wall. One wall isn’t terribly interesting, so let’s

add a 2nd wall now, and for simplicity I’m just going to make it down here, like this. So this is wall B, it faces in this direction. We have to insert this wall into the BSP, so starting at A, we have to check, is this wall B in front of, or behind A? We know it’s in front of A, so we’re going to go to A’s child, the empty space node, and replace it with a B node. And this new node of course has a left child of empty space, and a right child of solid. The left child of B represents this region here, notice how it's the region in front of A that's now being split by B. The left child of B represents the region in front of B, while the right child represents the region behind B. Let’s repeat the process now, but with a horizontal wall C on top, facing in this direction. So C is in front of A, and it’s in front of B,

so it get’s inserted here, replacing B’s left child, and so you have this new node C, with a left empty child, and a right solid child. And I’ll leave it to you to do this on paper, but if I were to add this new wall D, which faces in this direction, then it’d end up here as a child of C. So we end up with this kinda boring BSP tree, that’s basically just a line of nodes descending to the left. But the really cool part of BSP’s is they enable you to do fast operations. Let’s bring in a slightly more complex BSP,

just to make the test’s a bit more interesting. This one is basically the same, but I’ve inserted a few more walls. If you’re following along on paper, take special note that I just inserted them in alphabetical order, and notice how E is in front of D, but when I go to insert F, it’s behind E, thus on the right. Now, let’s test whether a point is in empty space, or in solid space. This point here, very obviously by looking at it, is in empty space, but we can confirm that with like, actual code trivially. Starting at the root node, is this point in front or behind the splitting

plane? In front, obviously, so we move to the next node, and it’s in front of that, so again, we move on. And we repeat that process, over and over again, until inevitably we end up in a leaf node, either a solid node, or an empty node. In this case, an empty node, signifying that this point is in empty, or free space. Or in other words, if that’s a player, they’re allowed to be there.

So BSP’s can accelerate these kind of tests, they allow you to rapidly discard large sets of the world. But another interesting property is that each of these nodes here. Each of these represent a portion of the space. They, in fact, split the space of their parent. So starting at the root node, this is all space. This node’s left child is one half of the space,

while the right child is the other half. When we look at this node, B, it partitions this space here, meaning that it’s left child is this space in front of it’s splitting plane, while it’s right space is this space behind. And we can continue in that way, looking at C, this node is splitting this highlighted region here along the plane that we provided, meaning that the left child of C represents this space over here, while the right child of C represents this space over here. And likewise, we can continue on and on,

and these nodes carve up the space into these ever finer convex subspaces, and we end up with these 3 convex subspaces that i’ve coloured, which correspond to these empty leaves in the tree. Which is kinda neat. That’s one of the big takeaways, bsp’s carve up space, and you can of course calculate bounding volumes for these subspaces if need be, and having bounds on these nodes allows you to rapidly discard entire portions of the world, making bsp’s useful not only for rendering but for other gameplay systems like physics. Lastly, BSP trees give you a natural ordering when you draw. They can be used to easily give you surfaces in front to back or back to front order.

If we take a really simple bsp tree, we’re just going to have a couple lines here, so you could follow along on paper if you wanted to. And we’re going to place our eye here. Now the interesting thing with the bsp is, the direction doesn’t matter, only the position of the eye. So to draw, the algorithm is pretty straightforward, you walk the tree and at each node, decide to either draw the left or right child first. So you start at the root, if you’re in front of the splitting plane, draw the back of the tree first, then this node, then the front. If you’re behind the splitting plane, do the opposite. Then of course you recurse and do the same, so you you go through nodes B, C, and D eventually. So that gives you immediately the order B, because that’s behind, then A, because that’s the node in question, then you have to draw this subtree, so we recurse, we’re in front of C, so we draw behind C, which is nothing, then we draw C, then we recurse, which just ends up drawing D last So we get easy back-to-front order by simply walking the tree, easy peasy.

So they give you quite a lot of advantages, but this isn’t quite enough to draw effectively because of one major problem, overdraw. Overdraw is a problem that’s still being tackled on modern hardware 30 years later, and in Quake this was really serious. This was being done in software, and although they did implement a z-buffer, it had a performance impact and was limited to dynamic stuff. If you draw front to back, in some terrible cases,

you may end up drawing and redrawing the same pixels over and over again. You could easily imagine cases where overdraw could reach 10x or more. So this was the big problem to be tackled, how do you reduce, or even eliminate overdraw? This part in the story was one of the most interesting to me, the sheer number of approaches Carmack took, attempting to solve this. He attempted to use a beam-tree to track visible

fragments, a subdividing raycast approach, almost like binary search raycast, representing the world with planes rather than explicit vertices, a 1 bit draw buffer tracking whether or not a pixel had been draw yet, rasterizing the polygons into spans and clipping using those, and an approach that involves tracking the portals between leaves, and clipping further away geo and portals to these portals. I won’t dive into all of them, but let’s poke at the first couple of them. The first thing mentioned, the beam tree is really interesting, because finding information online was difficult, references are pretty sparse or result in dead links. There’s some scattered references to some

papers back in the mid 80’s that you can read, which gave me a bit of context, and there’s a very brief explanation in the book as well. From what I understand, they’re in essence a special type of bsp tree, but instead of being built directly from the polygons of the world, you do something very different. If you’re looking out at the world, your camera would be down here, and the viewing volume that you see is formed by 6 planes. You’ve got your planes on the left and right, you’ve got your planes on the top and bottom, and you’ve got your planes on the near and far. Now let’s consider a single triangle out here in the world, and we have the camera looking at it. So, if you take an edge of that triangle to begin with, and you form a new triangle using those vertices on the edge, and the camera’s position. From this you can form a plane that passes through the camera and the edges of

the triangle, similar to the viewing volume. You can then repeat the process for each edge of the triangle, building another plane that passes through those 2 vertices and the origin. So this is called a beam, this kinda volume that’s projected out from the origin to the triangle, and you’d insert this into a bsp, not the whole thing, only the planes that you calculated, the side planes. So let’s visualize this, so this is our screen, and you’d start the beam tree off by inserting 4 planes representing the bounds of the view frustum, the visible area or the edges of the screen. And you’d end up with a pretty basic tree,

kinda similar to what we saw earlier. We’ll colour the “drawable” area green, so anything in the drawable area is good to go. So the basic idea is to use this secondary BSP to check what’s been drawn on the screen. So let’s say I want to draw this triangle here, you can tell from a glance that this whole triangle falls into the “drawable” area in the beam tree.

What would actually happen is that you’d insert this triangle into the beam tree, by inserting each of the planes of the triangle, which in 2d basically look like lines. So one by one, those would get inserted into the tree, and you’d end up with a tree more like the one on the right now. These open leaves of the beam tree correspond to regions that are still drawable, which you can go test on paper if you like, but I’ll leave that to you and we’ll move on. Let’s say now, I want to draw this second triangle, we know by just glancing at this that if we were to walk the BSP tree, we’d end up in this region here in front of F, which corresponds to this leaf on the bsp tree itself. As you prepare to draw this triangle, you walk the beam tree and if it ends up in a visible area, you’d update the tree and draw the triangle. But if it doesn’t, there’s no tree changes and thus the polygon isn’t visible. That’s my understanding of it anyway. In theory, a great approach, but as Abrash

notes, the worst case was still terrible, and maintaining the beam tree wasn’t free either, so in the end they had to scrap the approach. Another interesting approach was to use raycasts, since the bsp is great at accelerating those. So by this point, you should be reasonably familiar with the idea of walking the tree to figure out whether a point is visible or not, it just involves descending and doing the plane tests until one reaches either an empty or solid leaf. So a raycast isn’t really all that different, except you’re of course dealing with a line. If you were to say, overlay the line over the bsp,

you can clearly see where it would get cut by the splitting planes and chopped up. So you’d end up with a bunch of fragments of the original ray, each fragment lying in either a solid or empty node. Of course, you don’t need to chop it by every plane though, that’s just wasteful. You can be smart and do the closest section first, you can see this section, if you walk it down the tree, it ends up in an empty node, and thus it never hits anything. So we go ahead and process the other side, and this part of the ray of course gets chopped by this plane here, so of course what you’ll do is process that part first, so we sorta can clearly see that it’ll end up in an empty node, and thus hits nothing. This next part, behind the wall, well, we KNOW that it hits something, like us the viewers, but we need to prove it in code. Well,

the next node down the tree is this node here, with this splitting plane, so the ray gets lopped into 2 parts, and we’ll take this part here and walk it down the tree. And of course, it ends up in a solid node, so we’re done, we know for a fact this intersects something, we know where it intersects, we know everything. So if we take a look at an example screen, the subdividing raycast idea, at least in my head given the description, kinda works like this. You divide the screen into some number of areas, they mention use an 8x8 grid, so 64 in total, and for each you cast a ray, using the exact same raycast approach to clip the ray against the BSP. Let’s take a look from the side, so here’s a side view, here’s the camera here, and we’ll have some walls here to fill out the scene a little bit. So what you’d do is you’d

fire out a bunch of rays at the world, which of course would get clipped against the BSP as I showed before. Let’s assume that every ray hits something, so each of these rays is being extended until they touch a surface. The question would then be, what did they hit. We’ll colour them with the colour of the surface they touched. Now, let’s focus on these 2 rays. If 2 rays are adjacent, and hit the same polygon, then you can simply interpolate the surface in between, easy enough. That’s a best case scenario, so now if we were to focus on another pair when they don’t hit the same surface. So these neighbouring raycasts

hit different surfaces, so then you in essence have to do a binary search by firing a ray in between and repeating the process. Then it kinda follows that eventually, after repeating this process a buncha times, that either all the adjacent rays will be on the same surface, or they’ll be at pixel level, at which point you bail. I could easily imagine this working great in one of those old Quake levels, where polys took up a lot of real estate on screen, and chances were pretty good that just a few iterations of this subdivided raycast approach could cover most of the screen. From the sounds of it, this ultimately failed due to dropouts, basically problems with tiny polygons. There were other approaches, as Abrash mentions, some were merely kinda tossed around, while others saw actual implementations. I’m guessing based on the descriptions that the draw buffer and span buffer approaches, since they seem like iterations on existing techniques, were tried out, as well as the portal clipping, but the vertex free surfaces idea, maybe something they just punted around, who knows.

Here’s the point in the story where we have the miraculous turnaround. If this was an episode of House, you’d have him look at a bus going by, or a leaf falling, or something random, and suddenly have an epiphany. According to Abrash, we’ll just read his words here: When I came in on Monday, John had the look of a man who had broken through to the other side--and also the look of a man who hadn’t had much sleep. And then he goes on to describe how Carmack figured it out. The solution being precalculating the visibility between leaves in the bsp. This would be called a PVS,

or potentially visible set, which now, nearly 30 years later, how it works is widely known, but it was a real game changer in its era. The general idea is that, let’s say that we’ve got our level here, we’ll just do a 2d version instead of a 3d version, and let’s just quickly understand what we’re looking at here. So each of these regions here, let’s say these regions are leaves in a BSP. We’re taking some liberties here, but let’s run with it. Then each of these spaces is separated by these green lines, these would have been splitting planes, and you can figure out that where there’s no polygons, those form a sort of border between adjacent spaces that are called portals. So given this shaded region here, of course

any region that it’s immediately adjacent to it, is, of course, visible right away. You can figure that out without a lot of effort using the BSP since they basically share a portal. The trick is really whether or not these further away regions are visible from that initial region. I mean, you could do something like cast a zillion rays from one portal to the next, and see which other leaves they manage to hit, but that’ll be really unreliable. Quake did this by erring on the side of being conservative. Secondary neighbours are usually

visible, not always, but pretty much most of the time. So we’ll consider a neighbour of a neighbour to be visible right away. So the effort was mostly on leaves further out than this. The question now is whether section 4 here is visible from section 1. The approach they employed was by using a separating plane, so they’d create a plane from each edge of the portal, to the opposite side of the other portal, here we have our separator plane, and the portal lies completely in front of it, so it’s not eliminated yet. We’ll do the separator plane for the other side, and lo and behold, the portal lies completely behind the plane, getting culled out. In a small way, this is just like the beam tree we just talked about. If we were to say, expand section 2 so that the

portals end up a bit fatter, you can see that now the separating plane just catches the end of the portal from section 3 to 4, making section 4 visible from section 1. That's the general idea, and it extends pretty easily into 3d from there. In 2d, of course we only needed the 2 separating planes, but in 3d you would do this for each edge. As I said, it’s a conservative approach, but works incredibly well. BSP’s have since faded from their glory days, you don’t really hear much about them anymore.

Levels now are dynamic, requiring a degree of flexibility that’s necessitated new approaches. The original Quake source code was released over 24 years ago at this point, and that, along with subsequent sequels, have been a source of some incredible gems. Perhaps, most infamously, the fast inverse square root function, which is this abomination you see before you. This was a story that for some reason or other, has stuck with me for over 20 years. I read this

at a time when I was still struggling with basic concepts, and it shaped, at least in a small way, how I approach things today. I find that it’s really easy to be enamoured with a clever solution, I mean we all appreciate an elegant answer to a problem, but it’s also in the failures, the missteps and dead ends along the way, that we can find some really valuable insights. The final solution, at least to me, evolves naturally from the failed attempts that preceded it. Understanding the beam tree, the portal clipping ideas, a lot of the basic ingredients were there.

I remember, I had just barely started learning to program and struggled hard with the sparse information Abrash provided. Looking back now, I actually appreciate how it pushed me to work harder, to really strive to understand the underlying concepts. If you think you’d like to learn more about gamedev from an experienced graphics engineer, I have a variety of courses available, very suitable for beginners, so if, for example, you ever wondered if understanding quaternions was possible, check them out. Otherwise, if you’d still like to support me and help choose what topic I cover next, you can vote on my Patreon page. Cheers

2024-04-19 14:04

Show Video

Other news

TecHype Special: Robert Reich Interview/DeepDive (full episode) 2024-11-19 04:15

Whatever Happened to Millimeter-Wave 5G? 2024-11-16 13:51

The Tragedy of Compaq 2024-11-13 11:31