Modern graphics programming revolves around achieving high-performance rendering and visually stunning effects. Among OpenGL’s capabilities, Multiple Render Targets (MRTs) are particularly valuable for enabling advanced rendering techniques with greater efficiency.

With the latest release of Mesa 24.3 and the commitment from Igalia, the etnaviv GPU driver now includes support for MRTs. If you’ve ever wondered how MRTs can transform your graphics pipeline or are curious about the challenges of implementing this feature, this blog post is for you.

Understanding Multiple Render Targets (MRTs)

At its core, MRTs allow rendering to multiple images or “render targets” simultaneously during a single rendering pass. These render targets are buffers or textures that store various scene data such as color, depth, or normals. By writing to multiple targets at once, MRTs enable developers to:

Enhance efficiency by reducing the number of rendering passes.
Implement sophisticated rendering techniques, such as deferred shading, which decouples geometry processing from lighting and shading.

Here’s a simple OpenGL ES 3.0 shader pair demonstrating how to use MRTs:

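// Vertex Shader
#version 300 es
layout(location = 0) in vec2 inPosition;

void main() {
    gl_Position = vec4(inPosition, 0.0, 1.0);
}

// Fragment Shader
#version 300 es
precision mediump float; // GLSL ES fragment shaders need a default float precision

layout(location = 0) out vec4 fragColor1; // written to the first render target
layout(location = 1) out vec4 fragColor2; // written to the second render target

void main() {
    fragColor1 = vec4(1.0, 0.0, 0.0, 1.0); // Red
    fragColor2 = vec4(0.0, 0.0, 1.0, 1.0); // Blue
}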

This code renders to two render targets, one red and one blue, by declaring two output locations in the fragment shader and writing a different color to each of them.

How MRTs Are Used

MRTs play an important role in graphics applications. Here are some of their most common use cases:

1. Deferred Rendering

MRTs are the backbone of deferred shading, a technique where scene geometry is rendered to multiple targets, storing data like positions, normals, and albedo. This data is then processed in a second pass to calculate lighting, enabling advanced effects like dynamic shadows and screen-space reflections.
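
To make the second pass more concrete, here is a minimal CPU-side sketch in C of what a lighting pass does with one texel read back from the render targets. It assumes a G-buffer with position, normal, and albedo targets and a single directional light with Lambert diffuse shading. The names `gbuffer_texel` and `shade_texel` are hypothetical, and a real renderer would do this in a fragment shader rather than on the CPU.

```c
#include <assert.h>

typedef struct { float x, y, z; } vec3;

/* Hypothetical G-buffer texel: one field per render target
 * written during the geometry pass. */
typedef struct {
    vec3 position; /* target 0 (unused by a directional light; a point light would need it) */
    vec3 normal;   /* target 1: unit-length surface normal */
    vec3 albedo;   /* target 2: base color */
} gbuffer_texel;

float dot3(vec3 a, vec3 b)
{
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

/* Lighting pass for a single texel: Lambert diffuse term.
 * light_dir points from the surface towards the light and must be unit length. */
vec3 shade_texel(gbuffer_texel t, vec3 light_dir)
{
    float len2 = dot3(light_dir, light_dir);
    assert(len2 > 0.99f && len2 < 1.01f);

    float n_dot_l = dot3(t.normal, light_dir);
    if (n_dot_l < 0.0f)
        n_dot_l = 0.0f; /* surface faces away from the light */

    vec3 lit = { t.albedo.x * n_dot_l, t.albedo.y * n_dot_l, t.albedo.z * n_dot_l };
    return lit;
}
```

The point of the split is that `shade_texel` never touches scene geometry: everything it needs was stored per pixel by the MRT pass.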

2. Post-Processing Effects

By writing intermediate results to multiple targets, MRTs facilitate a wide range of effects, including bloom, depth of field, and ambient occlusion.

3. Debugging and Visualization

Developers can output different types of scene data simultaneously for debugging or creating visualizations.

Implementing MRTs in etnaviv

Adding MRT support to the etnaviv driver involved significant reverse engineering and experimentation. Here’s a look at some of the challenges and solutions:

Reverse Engineering the GPU

The reverse engineering process begins with examining the limits exposed by the binary blob driver, such as the value returned by GL_MAX_DRAW_BUFFERS. Different Vivante GPU generations (HALTI) support varying numbers of render targets, requiring repeated testing and analysis.

Lay the foundation

I started with Freedreno’s test-mrt-fbo to get a rough idea of what needed to be done. Hours later, I had identified most of the necessary bits and GPU states I had seen in the command stream dumps generated by the proprietary driver.

The first goal was to get a very basic piglit MRT test working on the GC7000 (HALTI5) GPU, rendering correctly with etnaviv. In this phase of reverse engineering, I usually hack around in the driver and my git commit history is full of ‘hack/wip’ commits. This helps me keep track of the (breaking) changes I make during the process.

Looking at the changes I had made up to this point, it was clear that I had touched nearly every part of the gallium driver. It started with compiler changes to handle the extra color outputs and ended with the extra MRT states that needed to be emitted. And, to be honest, at this stage I had also broken some other CTS and piglit tests, but that was something to be taken care of later.

When I tried some more complex piglit tests that used sparse render targets, they failed. After reverse-engineering the proprietary driver some more, I realized that I needed to remap the information NIR provides in Mesa about fragment shader outputs into the compressed form the driver needed. That didn’t make all relevant piglit tests pass, but I was getting close.
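
As an illustration of the kind of remapping involved, here is a small sketch in C. It is not etnaviv’s actual code: it assumes that “compressed” means packing the sparse set of written output locations into consecutive slots, and the names `compact_outputs`, `MAX_RENDER_TARGETS`, and `SLOT_UNUSED` are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_RENDER_TARGETS 8    /* the HALTI5 limit mentioned in this post */
#define SLOT_UNUSED        0xff /* hypothetical marker for "location not written" */

/* Hypothetical helper: given a bitmask of the color output locations a
 * fragment shader writes (bit N set = location N is written), assign each
 * written location the next free consecutive slot.
 * Returns the number of slots in use. */
unsigned compact_outputs(uint8_t written_mask,
                         uint8_t slot_for_location[MAX_RENDER_TARGETS])
{
    assert(slot_for_location);

    unsigned next_slot = 0;
    for (unsigned loc = 0; loc < MAX_RENDER_TARGETS; loc++) {
        if (written_mask & (1u << loc))
            slot_for_location[loc] = (uint8_t)next_slot++;
        else
            slot_for_location[loc] = SLOT_UNUSED;
    }
    return next_slot;
}
```

For a shader that writes only locations 0, 2, and 5 (mask `0x25`), this yields slots 0, 1, and 2 for those locations and marks the rest as unused.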

Will it work on another GPU?

The branch was then tested on older Vivante GPUs, like the GC3000 (HALTI2). None of the tests passed initially, as state emissions differed and the maximum MRT count was lower (4 vs. 8). Updating the state emission logic addressed these issues.

What’s wrong with the Tile Status?

As I always want to provide the best experience for all etnaviv users, I wanted to support this shiny new feature even on a much older Vivante GPU generation: the GC2000 (HALTI0) found in i.MX6 boards.

My hopes were high that my git branch would just work but, as usual, I was wrong. The traces from the vendor driver showed something interesting: there was no tile status (TS) usage in them, and I could confirm that piglit was happy when I used ETNA_MESA_DEBUG=no_ts.

You might wonder what the ominous Tile Status might be. Let me give you a quick summary.

A render target is divided into tiles, and every tile has a couple of status flags. An auxiliary buffer, the so-called Tile Status buffer, is associated with each render surface and keeps track of these flags. One of them is the clear flag, which signifies that the tile has been cleared. A fast clear, for example, sets the clear bit for each tile instead of clearing the actual surface data.
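
The fast-clear idea can be modelled in a few lines of C. This is a deliberately simplified sketch, not the hardware’s real layout: it assumes one boolean clear flag per tile, and the tile count, tile size, and all names (`surface`, `fast_clear`, `read_pixel`, `write_pixel`) are illustrative.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NUM_TILES       4
#define PIXELS_PER_TILE 16 /* illustrative, not the real tile size */

typedef struct {
    uint32_t pixels[NUM_TILES][PIXELS_PER_TILE]; /* actual surface data */
    bool     tile_cleared[NUM_TILES];            /* simplified tile status: one clear flag per tile */
    uint32_t clear_value;
} surface;

/* Fast clear: only the per-tile flags are touched, never the pixel data. */
void fast_clear(surface *s, uint32_t value)
{
    s->clear_value = value;
    for (unsigned t = 0; t < NUM_TILES; t++)
        s->tile_cleared[t] = true;
}

/* A tile flagged as cleared reads back as the clear value, whatever
 * stale data is still stored in the surface. */
uint32_t read_pixel(const surface *s, unsigned tile, unsigned idx)
{
    assert(tile < NUM_TILES && idx < PIXELS_PER_TILE);
    return s->tile_cleared[tile] ? s->clear_value : s->pixels[tile][idx];
}

/* The first write to a cleared tile fills it with the clear value and
 * drops the flag, so later reads see real surface data. */
void write_pixel(surface *s, unsigned tile, unsigned idx, uint32_t value)
{
    assert(tile < NUM_TILES && idx < PIXELS_PER_TILE);
    if (s->tile_cleared[tile]) {
        for (unsigned i = 0; i < PIXELS_PER_TILE; i++)
            s->pixels[tile][i] = s->clear_value;
        s->tile_cleared[tile] = false;
    }
    s->pixels[tile][idx] = value;
}
```

The cost of a clear drops from one write per pixel to one write per tile, which is why every consumer of the surface has to respect the flags.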

I found a way to fix this problem too and then moved to CI.

CI to test them all

If you’ve followed me this far, you know that I worked on one HALTI generation, moved on to another, and never tested whether I had broken any of the others. It was time to let etnaviv’s CI do its work and, as expected, I discovered I had broken HALTI5. After a local debugging session I got it back into a working state. While I was very close to being ready to submit an MR and incorporate review feedback, there was one last thing I needed to take care of…

HALTI5+ Enhancements for MRTs

It turns out that a HALTI5 GPU can support more OpenGL extensions, as it has even more GPU states that fall under the MRT umbrella. The MRT groundwork that was already in place allowed me to implement the following ones:

1. GL_EXT_draw_buffers2

This extension allows blending to be enabled or disabled and color write masks to be set independently for each render target in an MRT setup. The blend equation and function are still shared by all targets, but developers gain per-target control over whether blending happens and which channels are written.

2. GL_ARB_draw_buffers_blend

This feature extends blending capabilities even further, enabling per-buffer blending equations and functions. It’s especially useful for advanced rendering pipelines, such as deferred shading and post-processing, where different render targets may require entirely distinct blending behaviors.
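
The per-target state these two extensions expose can be sketched as a small software model in C. This is only an illustration of the semantics, not driver code: the names `target_state`, `blend_eq`, and `write_fragment` are hypothetical, the blend factors are fixed at one for brevity, only two blend equations are modelled, and no clamping is applied.

```c
#include <assert.h>
#include <stdbool.h>

/* Subset of the GL blend equations, for illustration only. */
typedef enum { BLEND_ADD, BLEND_MAX } blend_eq;

/* Hypothetical per-render-target state. */
typedef struct {
    bool     blend_enable;  /* per target with GL_EXT_draw_buffers2 */
    bool     write_mask[4]; /* per target RGBA mask with GL_EXT_draw_buffers2 */
    blend_eq equation;      /* per target with GL_ARB_draw_buffers_blend */
} target_state;

/* Combine one incoming fragment color (src) with the color already
 * stored in a render target (dst), honoring that target's own state. */
void write_fragment(const target_state *st, const float src[4], float dst[4])
{
    assert(st && src && dst);

    for (int c = 0; c < 4; c++) {
        if (!st->write_mask[c])
            continue;              /* channel is masked out for this target */
        if (!st->blend_enable) {
            dst[c] = src[c];       /* plain overwrite */
            continue;
        }
        if (st->equation == BLEND_ADD)
            dst[c] = src[c] + dst[c];
        else
            dst[c] = src[c] > dst[c] ? src[c] : dst[c];
    }
}
```

With one `target_state` per render target, the same fragment can be added into one target’s RGB channels while a second target keeps the per-channel maximum.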

Conclusion

With the addition of MRT support and the powerful HALTI5+ enhancements like GL_EXT_draw_buffers2 and GL_ARB_draw_buffers_blend, the etnaviv driver has reached a significant milestone.

MRT support is also a key feature for achieving full GLES3 compliance, marking a step forward in modernizing the capabilities of the etnaviv driver.

For detailed reverse engineering results, check out this repository, which uses the rnndb format to describe GPU states and bits. The MRT-specific changes can be found in this commit.

What’s next? Stay tuned for more updates on the etnaviv driver as we continue to push more features upstream.
