These days ray-tracing is one of the most popular buzzwords in the gaming and computer graphics industry. But what exactly is it, and how does it differ from traditional rendering? That’s what we’ll explore in this post.
Ray-tracing isn’t exactly a new technology. It’s been around for decades. However, as it could take days to render a 10-minute video with it, it was left to animated movies and TV shows rather than the real-time rendering that games require. Ray-tracing essentially means casting a bunch of rays from the camera, through each pixel of the 2D image (think of it as your screen), and into the 3D scene that is to be rendered.
As the rays propagate through the scene, they get reflected off surfaces, pass through semi-transparent objects, or are blocked outright, producing the respective reflections, refractions, and shadows. The rays are then traced back to the light sources, and all that gathered data is used to render the scene.
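The "one ray through each pixel" idea can be sketched in a few lines. This is a minimal pinhole-camera example, assuming the camera sits at the origin looking down the negative z-axis; the function name and parameters are illustrative, not from any real renderer:

```python
import math

def generate_camera_rays(width, height, fov_deg=90.0):
    """Generate one unit-length ray direction per pixel for a pinhole
    camera at the origin looking down -z. A sketch only; real renderers
    also apply a camera-to-world transform."""
    aspect = width / height
    scale = math.tan(math.radians(fov_deg) / 2)
    rays = []
    for y in range(height):
        for x in range(width):
            # Map the pixel centre into normalized screen coordinates.
            px = (2 * (x + 0.5) / width - 1) * aspect * scale
            py = (1 - 2 * (y + 0.5) / height) * scale
            # Normalize so the direction has unit length.
            length = math.sqrt(px * px + py * py + 1)
            rays.append((px / length, py / length, -1 / length))
    return rays
```

Even at a modest 1920×1080, that loop produces over two million rays per frame, which is exactly why the performance cost discussed below becomes a problem.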
Neat, isn’t it? Except it’s very performance-intensive. To produce an image of reasonable quality, there needs to be at least one ray for every pixel on the screen. That is a lot of rays, and it requires an unreasonable amount of graphics processing power. That is the reason most games utilize something called rasterization. It’s a crass hack that works pretty well at creating a near-realistic image without the need to cast millions of rays.
What is Rasterization?
So now, what is rasterization? In simple words, rasterization is guessing which objects will be visible on the screen, along with the shadows, reflections, and the other doodads present in modern games. To do this, a Z-buffer (or depth buffer) is used: a per-pixel record of how far along the z-axis the nearest drawn surface lies.
The vertices of all the objects in the scene (in the form of triangles or polygons) are projected onto the screen, and their distance from the camera along the Z-axis is written into the depth buffer. Those polygons (or parts of objects) that are behind others fail the depth test and are culled or removed. This is called hidden surface removal, and it saves a lot of processing power. Then the various effects such as shadows, reflections, AO, etc. are produced for the remaining image and the scene is rendered.
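The depth test at the heart of hidden surface removal can be sketched like this. A toy example, assuming each "fragment" is just an (x, y, depth, color) tuple; the function name and data layout are made up for illustration:

```python
def rasterize_depth(fragments, width, height):
    """Toy depth test: keep only the nearest fragment per pixel.
    This is the hidden surface removal step described above.
    Illustrative only -- real GPUs do this in fixed-function hardware."""
    far = float("inf")
    zbuf = [[far] * width for _ in range(height)]   # depth buffer
    frame = [[None] * width for _ in range(height)] # final colors
    for x, y, depth, color in fragments:
        if depth < zbuf[y][x]:   # closer than anything drawn here so far?
            zbuf[y][x] = depth
            frame[y][x] = color
    return frame
```

If a red fragment at depth 5 and a blue one at depth 2 land on the same pixel, the blue one wins because it is nearer to the camera, regardless of the order in which they arrive.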
Here there is no ray-tracing. The shadows are pre-baked into the scene using shadow maps, which are basically approximations of where they should lie. When it comes to reflections, there are many ways to render them. The two most popular are cube-maps and screen-space reflections. In cube-mapping, the scene (texture) is usually pre-baked onto the six faces of a cube and stored as six square textures or unfolded into six square regions of a single texture.
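The core of a cube-map lookup is deciding which of the six faces a reflection direction points at: the face along the axis with the largest absolute component. A minimal sketch of just that rule (a real implementation would also compute 2D texture coordinates on the chosen face):

```python
def cube_face(d):
    """Pick which of the six cube-map faces a direction vector
    d = (x, y, z) samples. Illustrative sketch of the face-selection
    rule only."""
    x, y, z = d
    ax, ay, az = abs(x), abs(y), abs(z)
    if ax >= ay and ax >= az:
        return "+x" if x > 0 else "-x"   # left/right faces
    if ay >= az:
        return "+y" if y > 0 else "-y"   # top/bottom faces
    return "+z" if z > 0 else "-z"       # front/back faces
```

Because the six faces are pre-baked, this lookup is extremely cheap at runtime, which is exactly why cube-maps are so popular despite being static.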
Screen space reflections, or dynamic reflections, are more advanced and performance-intensive. In this technique, the various objects in the scene are re-rendered on reflective surfaces (the resolution will depend on the quality setting). However, the main drawback of SSR is that objects that aren’t present on the screen, but should be reflected nonetheless, aren’t rendered.
For example, in this image the pipes on the ceiling aren’t visible on the screen, so even though their reflections should appear in the water below, they don’t. The broken-down car, however, has its reflection rendered because it’s present in the 2D scene. A similar technique called Screen Space Ambient Occlusion is used to render the ambient shadows, and it suffers from the same drawback: the shadows of objects not rendered on-screen aren’t drawn.
NVIDIA RTX: How Ray-Tracing Works
We’ve explained how ray-tracing works above but let’s get into the details. Here are the main steps involved in ray-tracing in the gaming industry:
Ray-Casting: This is pretty self-explanatory. Rays are cast into the scene through the pixels, and the ones that hit an object (polygon) are considered. From each such ray, information like the distance of the object from the screen, possible shadows, reflections, refractions, etc. is calculated as the ray is traced toward the light source.
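The "hit test" at the centre of ray-casting is just geometry. For a sphere it reduces to solving a quadratic; this is a minimal sketch assuming a normalized ray direction, with made-up names:

```python
def ray_sphere_hit(origin, direction, center, radius):
    """Return the distance along the ray to the nearest sphere hit,
    or None on a miss, by solving |o + t*d - c|^2 = r^2 for t.
    Assumes `direction` is unit-length. Illustrative sketch only."""
    ox, oy, oz = (origin[i] - center[i] for i in range(3))
    dx, dy, dz = direction
    b = 2 * (ox * dx + oy * dy + oz * dz)
    c = ox * ox + oy * oy + oz * oz - radius * radius
    disc = b * b - 4 * c          # a == 1 since direction is unit-length
    if disc < 0:
        return None               # ray misses the sphere entirely
    t = (-b - disc ** 0.5) / 2    # nearer of the two intersections
    return t if t > 0 else None
```

A real scene repeats a test like this (usually against triangles) for every ray, which is where the bulk of the cost lies, and why the next step exists.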
Bounding Volume Hierarchy: BVH is used to make the whole ray-tracing process more efficient. Here, all the objects in the scene are enclosed in boxes of sorts, and then the rays are cast. Only the boxes that are intersected by the rays are considered further. Instead of testing each ray against every polygon (which could mean several million tests), a ray is first tested against the BVH boxes, and only the polygons inside intersected boxes are checked. The RT Cores found in NVIDIA’s Turing graphics cards accelerate this very process. The rest of it isn’t all that intensive.
Denoising: Remember how I said there needs to be at least one ray for each pixel, or possibly more? Yeah, present games that leverage ray-tracing use fewer rays than that ideal (because, boo-hoo, we’re still not that advanced). This results in noise and blurring:
It’s very faint here because this image has already been denoised, but because there is only one ray per pixel (or perhaps fewer), there isn’t enough data to fully render the reflections. This noise is taken care of by a technique called denoising: a family of algorithms that get rid of artifacts and low-quality blur, for example by approximating the color of a pixel from its neighbors, similar to upscaling.
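The crudest possible illustration of the neighbor-averaging idea is a 3×3 box filter: each pixel becomes the mean of its neighborhood. Real-time denoisers are far smarter (edge-aware filters, temporal accumulation, machine-learning models), so treat this purely as a sketch of the principle:

```python
def box_denoise(img):
    """Replace each pixel with the average of its 3x3 neighbourhood
    (clipped at the image borders). A deliberately crude stand-in for
    the much smarter denoisers real games use."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total, count = 0.0, 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < h and 0 <= nx < w:
                        total += img[ny][nx]
                        count += 1
            out[y][x] = total / count
    return out
```

A lone bright speck (the kind of "firefly" noise sparse ray counts produce) gets spread out and dimmed, at the cost of also softening genuine detail, which is why production denoisers have to be edge-aware.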
There’s also something called path tracing, which is a more intensive form of ray-tracing. Here, the ray travels from the camera and bounces off several objects before reaching a light source. Because the repeated bouncing/reflecting puts an insane number of ray segments in the scene at a time, this is quite taxing, but at the same time the lighting effects are more precise. Let us know if you have any other questions regarding ray-tracing. Cheers!
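The bounce loop at the heart of path tracing can be sketched as follows. Everything here is a stand-in: `scene_hit` represents a real intersection routine and simply reports, per bounce, how much light the surface emits and how much it reflects (its albedo); the names are invented for illustration:

```python
def trace_path(scene_hit, max_bounces=4):
    """Skeleton of one path-traced sample: follow a ray through up to
    `max_bounces` bounces, adding emitted light and attenuating the
    throughput at each surface. `scene_hit` is a stand-in that returns
    (emitted, albedo) per bounce, or None when the ray escapes the
    scene. Purely illustrative."""
    radiance, throughput = 0.0, 1.0
    for _ in range(max_bounces):
        hit = scene_hit()
        if hit is None:              # ray left the scene
            break
        emitted, albedo = hit
        radiance += throughput * emitted   # light picked up at this hit
        throughput *= albedo               # energy lost at each bounce
    return radiance
```

The key point is the multiplication: light reaching the camera after three bounces off 50%-reflective surfaces has been attenuated to an eighth of its strength, and every extra bounce multiplies the ray count, which is why path tracing is so much more taxing than single-bounce ray-tracing.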