HDR, adaptive tone mapping and MSAA in GLSL

Question

In an effort to teach myself OpenGL, I am working my way through the 5th edition of the Superbible.

I am currently trying to figure out how to combine HDR and MSAA (as described in chapter 9).

For HDR, the book suggests a method for adaptive tone mapping that is based on calculating the average luminance for a 5x5 convolution filter for each fragment.

For MSAA, the method used averages all samples by weights calculated from the sample distance.

My attempt at combining both, found in the pastebin below, applies tone mapping to each sample then averages them to compute the final fragment color.

Performance is (as one should perhaps have expected?) terrible: at 25 lookups per sample, times 4 for 4xMSAA, I'm guessing the GPU spends much of its time looking up my FBO texture. Switching to the code path controlled by the use_HDR uniform in the code drops performance from 400+ fps to under 10 for a simple scene.

My question is twofold:

1. Is this a sane method of performing tone mapping? If not, what would you suggest?
2. How should MSAA and convolution-based filters be combined? I'm guessing I'll have this problem again for any filter that needs to look up neighboring texels, i.e. pretty much anything like bloom, blur, etc.?

Code:

#version 330
in Data
{
    vec4 position;
    vec4 normal;
    vec4 color;
    vec2 texCoord;
    mat4 mvp;
    mat4 mv;
} gdata;

out vec4 outputColor;
uniform sampler2DMS tex;
uniform sampler1D lum_to_exposure;
uniform samplerBuffer weights;
uniform int samplecount;
uniform bool use_HDR;

vec4 tone_map(vec4 color, float exp)
{
    return 1.0f - exp2(-color * exp);
}

const ivec2 tc_offset[25] = ivec2[](ivec2(-2, -2), ivec2(-1, -2), ivec2(0, -2), ivec2(1, -2), ivec2(2, -2),
                                    ivec2(-2, -1), ivec2(-1, -1), ivec2(0, -1), ivec2(1, -1), ivec2(2, -1),
                                    ivec2(-2,  0), ivec2(-1,  0), ivec2(0,  0), ivec2(1,  0), ivec2(2,  0),
                                    ivec2(-2,  1), ivec2(-1,  1), ivec2(0,  1), ivec2(1,  1), ivec2(2,  1),
                                    ivec2(-2,  2), ivec2(-1,  2), ivec2(0,  2), ivec2(1,  2), ivec2(2,  2));

void main()
{
    ivec2 itexcoords = ivec2(floor(textureSize(tex) * gdata.texCoord));
    float tex_size_x = textureSize(tex).x;
    float tex_size_y = textureSize(tex).y;
    outputColor = vec4(0.0f, 0.0f, 0.0f, 1.0f);
    // for each sample in the multi sample buffer...
    for (int i = 0; i < samplecount; i++)
    {
        // ... calculate exposure based on the corresponding sample of nearby texels
        vec4 sample;
        if (use_HDR)
        {
            sample = texelFetch(tex, itexcoords, i);

            // look up a 5x5 area around the current texel
            vec4 hdr_samples[25];
            for (int j = 0; j < 25; ++j)
            {
                // clamp to the last valid texel (size - 1); texelFetch outside the texture is undefined
                ivec2 coords = clamp(itexcoords + tc_offset[j], ivec2(0, 0), ivec2(tex_size_x, tex_size_y) - 1);
                hdr_samples[j] = texelFetch(tex, coords, i);
            }
            // average the surrounding texels
            vec4 area_color = (
                     ( 1.0f * (hdr_samples[0] + hdr_samples[4] + hdr_samples[20] + hdr_samples[24])) +
                     ( 4.0f * (hdr_samples[1] + hdr_samples[3] + hdr_samples[5] + hdr_samples[9]
                             + hdr_samples[15] + hdr_samples[19] + hdr_samples[21] + hdr_samples[23])) +
                     ( 7.0f * (hdr_samples[2] + hdr_samples[10] + hdr_samples[14] + hdr_samples[22])) +
                     (16.0f * (hdr_samples[6] + hdr_samples[8] + hdr_samples[16] + hdr_samples[18])) +
                     (26.0f * (hdr_samples[7] + hdr_samples[11] + hdr_samples[13] + hdr_samples[17])) +
                     (41.0f * (hdr_samples[12]))
                     ) / 273.0f;
            // RGB to luminance formula : lum = 0.3R + 0.59G + 0.11B
            float area_luminance = dot(area_color.rgb, vec3(0.3, 0.59, 0.11));
            float exposure = texture(lum_to_exposure, area_luminance/2.0).r;
            exposure = clamp(exposure, 0.02f, 20.0f);


            sample = tone_map(sample, exposure);
        }
        else
            sample = texelFetch(tex, itexcoords, i);

        // weight the sample based on its position
        float weight = texelFetch(weights, i).r;
        outputColor += sample * weight;
    }
}

score 5 · Accepted Answer · answered Mar 02 '11 at 07:37

I don't have a copy of the Superbible, so I don't know exactly what it proposes, but this approach seems very inefficient and also imprecise: your 5x5 filter only accesses the i-th sample of each texel and completely misses the other samples.

For the filtering phase I'd go, as kvark already suggested, for a resolve into another texture using glBlitFramebuffer, so that all samples are accumulated in HDR. After that, I'd run the filter into another HDR texture, probably with a separable filter to gain performance, or even let the GPU's bilinear filtering hardware do part of the work to improve performance further.

This would give you a blurred texture that you could then sample in your tone mapping shader. This should vastly improve performance, at the cost of more memory.
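
To make the separable-filter idea concrete, here is a minimal CPU-side sketch in C++, not shader code. It assumes a single-channel float image standing in for the resolved HDR texture, and it swaps in the binomial kernel [1 4 6 4 1]/16, because the 5x5 weights in the question (which sum to 273) are not exactly the product of two 1D kernels. The names Image, blur_1d, blur_separable and blur_2d are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical single-channel image, row-major, standing in for a resolved HDR texture.
struct Image {
    int width;
    int height;
    std::vector<float> data;

    Image(int w, int h) : width(w), height(h), data(static_cast<std::size_t>(w) * h, 0.0f) {}

    // Clamp-to-edge read, similar to texelFetch with clamped coordinates.
    float at(int x, int y) const {
        x = std::max(0, std::min(x, width - 1));
        y = std::max(0, std::min(y, height - 1));
        return data[static_cast<std::size_t>(y) * width + x];
    }
    float& ref(int x, int y) { return data[static_cast<std::size_t>(y) * width + x]; }
};

// Binomial 5-tap kernel (an assumption for this sketch, not the book's weights).
constexpr float kKernel[5] = {1.0f / 16, 4.0f / 16, 6.0f / 16, 4.0f / 16, 1.0f / 16};

// One 1D pass: along x when (dx, dy) = (1, 0), along y when (dx, dy) = (0, 1).
Image blur_1d(const Image& src, int dx, int dy) {
    assert((dx == 1 && dy == 0) || (dx == 0 && dy == 1));
    Image dst(src.width, src.height);
    for (int y = 0; y < src.height; ++y)
        for (int x = 0; x < src.width; ++x) {
            float sum = 0.0f;
            for (int k = -2; k <= 2; ++k)
                sum += kKernel[k + 2] * src.at(x + k * dx, y + k * dy);
            dst.ref(x, y) = sum;
        }
    return dst;
}

// Horizontal pass followed by vertical pass: 5 + 5 reads per pixel.
Image blur_separable(const Image& src) {
    return blur_1d(blur_1d(src, 1, 0), 0, 1);
}

// Reference single pass with the outer-product 5x5 kernel: 25 reads per pixel.
Image blur_2d(const Image& src) {
    Image dst(src.width, src.height);
    for (int y = 0; y < src.height; ++y)
        for (int x = 0; x < src.width; ++x) {
            float sum = 0.0f;
            for (int j = -2; j <= 2; ++j)
                for (int i = -2; i <= 2; ++i)
                    sum += kKernel[i + 2] * kKernel[j + 2] * src.at(x + i, y + j);
            dst.ref(x, y) = sum;
        }
    return dst;
}
```

Both functions produce the same blurred image up to rounding, but the two-pass version needs 10 reads per pixel instead of 25. On the GPU each pass would presumably be its own fragment shader rendering into an HDR texture.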

Note that other tone mapping operators exist, and that there is no 'ground truth' in this domain. You could choose a more performant approach by not using such a fine-grained luminosity estimate.

You could look at Matt Pettineo's recent blog post about tone mapping; it could give you hints about how to improve things, perhaps by using glGenerateMipmap to create the luminosity texture.
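
As a rough illustration of the mipmap idea, the C++ sketch below repeatedly halves a luminance image with a 2x2 box filter until a single value remains, which is the scene average. It is a CPU-side stand-in only: drivers commonly use a box filter for mipmap generation, but the exact filter is not mandated, and the function names and the square power-of-two image are assumptions of this sketch. The luminance weights are the ones from the question's shader.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Luminance weights from the question's shader: 0.3R + 0.59G + 0.11B.
float luminance(float r, float g, float b) {
    return 0.3f * r + 0.59f * g + 0.11f * b;
}

// One mip step: average each 2x2 block of a size x size image (size must be even).
std::vector<float> downsample_2x2(const std::vector<float>& src, int size) {
    assert(size >= 2 && size % 2 == 0);
    assert(src.size() == static_cast<std::size_t>(size) * size);
    const int half = size / 2;
    std::vector<float> dst(static_cast<std::size_t>(half) * half);
    for (int y = 0; y < half; ++y)
        for (int x = 0; x < half; ++x) {
            const std::size_t row0 = static_cast<std::size_t>(2 * y) * size;
            const std::size_t row1 = row0 + size;
            const std::size_t col = static_cast<std::size_t>(2 * x);
            dst[static_cast<std::size_t>(y) * half + x] =
                0.25f * (src[row0 + col] + src[row0 + col + 1] +
                         src[row1 + col] + src[row1 + col + 1]);
        }
    return dst;
}

// Walk the mip chain down to 1x1; the single remaining texel is the average luminance.
float average_luminance(std::vector<float> lum, int size) {
    assert(size >= 1 && (size & (size - 1)) == 0);  // power of two
    assert(lum.size() == static_cast<std::size_t>(size) * size);
    while (size > 1) {
        lum = downsample_2x2(lum, size);
        size /= 2;
    }
    return lum[0];
}
```

Note that this gives one global average for the whole frame rather than a per-pixel 5x5 estimate, which is exactly the kind of coarser luminosity estimate mentioned above.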

Regarding the specific issues about tone mapping with MSAA, the only thing I'm aware of is that it's recommended to tone map individual samples before the MSAA resolve, to prevent aliasing artifacts from appearing.
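
A small CPU-side C++ sketch may show why the order matters. It reuses the tone mapping operator from the question's shader and assumes an equal-weight (box) resolve; the function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Tone mapping operator from the question's shader: 1 - 2^(-color * exposure).
float tone_map(float color, float exposure) {
    return 1.0f - std::exp2(-color * exposure);
}

// Equal-weight average of one pixel's samples (a box resolve, assumed for this sketch).
float resolve(const std::vector<float>& samples) {
    assert(!samples.empty());
    float sum = 0.0f;
    for (float s : samples) sum += s;
    return sum / static_cast<float>(samples.size());
}

// Recommended order: tone map every sample, then average the mapped values.
float tone_map_then_resolve(const std::vector<float>& samples, float exposure) {
    std::vector<float> mapped;
    mapped.reserve(samples.size());
    for (float s : samples) mapped.push_back(tone_map(s, exposure));
    return resolve(mapped);
}

// Problematic order: average in HDR first, then tone map the single result.
float resolve_then_tone_map(const std::vector<float>& samples, float exposure) {
    return tone_map(resolve(samples), exposure);
}
```

Take an edge pixel where one of four samples hits a very bright surface, say samples {0, 0, 0, 100} with exposure 1. Tone mapping first gives about 0.25, a proper quarter-coverage edge. Resolving first averages to 25, which tone maps to about 1.0, so the pixel looks fully covered and the edge is aliased again.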

Thanks for the links, I'm still digesting all this information but it definitely helps clear up the theory a bit ! — Nicolas Lefebvre, Mar 02 '11 at 09:31
One point still confuses me : I had also read that MSAA resolve should be done after tone mapping (hence tone mapping each sample, then averaging the tone mapped samples in my naive attempt). Does your solution of using an intermediate blurred texture contradict this ? Or is the MSAA resolved twice ? (once by the blit to create a blurred texture that is thrown away after tone mapping, once manually on the results of the tone mapping) — Nicolas Lefebvre, Mar 02 '11 at 09:38
@Bethor - indeed, the intermediate HDR blurred texture is a resolve of the HDR MSAA buffer. But as it is *not* tone mapped, a simple resolve using a blit is sufficient. Its only role is to give an estimate of the luminosity of the HDR pixels in the area to tone map. After that it is thrown away as you mentioned. Another option, more tricky, would be to use the tone-mapped resolve, done the previous frame, to get an estimate of the current frame. Valve did this in Half-Life 2, I believe. — rotoglup, Mar 02 '11 at 21:37

score 2 · Answer 2 · answered Mar 01 '11 at 20:15

As far as I can see from your GLSL code, the weights for all samples of a pixel are equal. From that I conclude that the code is interested in the sum of those samples for each pixel, and the sum is just the average multiplied by the number of samples. This suggests at least two optimization techniques. Both use an intermediate single-sampled texture, which your code would sample instead of the original multisampled one:

1. (Exactly matching what you are doing.) Produce an intermediate texture with a shader that writes the average of the samples for each pixel.
2. (A quick approximation.) Let the intermediate texture be just the resolved original. This can be done efficiently by calling glBlitFramebuffer(). It will produce a slightly different result (because the sample locations are not on a grid), but for your task, HDR, it shouldn't matter, as it's all pretty much an approximation :)
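
The first option can be sketched on the CPU in C++ as follows. The buffer layout (all samples of a pixel stored contiguously) and the function names are assumptions made for illustration; the weights vector plays the role of the question's weights buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Resolve a multisampled buffer into a single-sampled one.
// `ms` holds weights.size() values per pixel, stored contiguously per pixel
// (a layout assumed for this sketch). `weights` is expected to sum to 1.
std::vector<float> resolve_weighted(const std::vector<float>& ms,
                                    const std::vector<float>& weights) {
    const std::size_t sample_count = weights.size();
    assert(sample_count > 0 && ms.size() % sample_count == 0);
    std::vector<float> resolved(ms.size() / sample_count, 0.0f);
    for (std::size_t p = 0; p < resolved.size(); ++p)
        for (std::size_t s = 0; s < sample_count; ++s)
            resolved[p] += weights[s] * ms[p * sample_count + s];
    return resolved;
}

// Equal weights: every sample contributes 1 / sample_count, i.e. a plain average.
std::vector<float> equal_weights(std::size_t sample_count) {
    assert(sample_count > 0);
    return std::vector<float>(sample_count, 1.0f / static_cast<float>(sample_count));
}
```

Once the buffer is resolved like this, a 5x5 filter reads 25 texels per pixel from the single-sampled result instead of 25 texels per sample.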

Good luck!

Thanks for the suggestion, I'll give glBlitFramebuffer a shot to speed up the MSAA resolve ! — Nicolas Lefebvre, Mar 02 '11 at 09:31
