If you look at the general definition of the Phong reflection model, it is usually written as a summation over all the light sources of the form:
Σ kd * (L . N) * i + ks * (R . V)^a * i
In this case kd and ks are the diffuse and specular constants, L, N and R are the relevant vectors, a is the shininess and i is the intensity of the current incoming light. Your version is slightly different: since the constants do not depend on the light, you can split the summation and move them out as you did, but this is the less common way to write it:
kd * Σ ((L . N) * i) + ks * Σ ((R . V)^a * i)
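Written as code, the per-light form is usually the more natural structure anyway, since each light's diffuse and specular terms are evaluated together. Here is a minimal sketch of that loop; the Vec3, Light and Material types are hypothetical stand-ins for whatever your renderer uses:

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Minimal vector type for illustration; substitute your renderer's own.
struct Vec3 {
    float x, y, z;
    Vec3 operator+(Vec3 o) const { return {x + o.x, y + o.y, z + o.z}; }
    Vec3 operator-(Vec3 o) const { return {x - o.x, y - o.y, z - o.z}; }
    Vec3 operator*(float s) const { return {x * s, y * s, z * s}; }
};
float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
Vec3 normalize(Vec3 v) { return v * (1.0f / std::sqrt(dot(v, v))); }
Vec3 reflect(Vec3 v, Vec3 n) { return v - n * (2.0f * dot(v, n)); }

struct Light    { Vec3 position; float intensity; }; // hypothetical point light
struct Material { float kd, ks, shininess; };        // hypothetical material

// Σ kd * (L . N) * i + ks * (R . V)^a * i, one term per light.
float shade(Vec3 P, Vec3 N, Vec3 V, const Material& m,
            const std::vector<Light>& lights) {
    float result = 0.0f;
    for (const Light& light : lights) {
        Vec3 L = normalize(light.position - P); // direction towards the light
        Vec3 R = reflect(L * -1.0f, N);         // reflection of L about N
        float diff = std::max(dot(L, N), 0.0f);
        float spec = std::pow(std::max(dot(R, V), 0.0f), m.shininess);
        result += (m.kd * diff + m.ks * spec) * light.intensity;
    }
    return result;
}
```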
The reason it is not usually written that way comes down to how the general rendering equation works, which is typically expressed as an integral over the hemisphere at a point:
Lo(wo) = Le(wo) + ∫ f(wi, wo) * (wi . n) * Li(wi) dwi
In this case Lo is the outgoing radiance along a direction wo, Le is the emissive contribution in that direction, f is the BRDF, n is the normal of the surface and Li is the incoming radiance contribution from the incoming direction wi (which is what is being integrated over). In practice this is implemented as a summation, showing once more that the lighting contribution at a point in a direction is a kind of weighted sum of individual lighting calculations, one per direction. In the case of a simple renderer with point lights such as yours, this reduces to the sum of each light's contribution, since it is assumed light can only come from a light source and not from the environment itself. This isn't really a problem, but if you ever plan on implementing a more complex lighting model then you'll have to restructure your program a bit.
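In code, the point-light case of that integral collapses to the same kind of loop. Here is a sketch reusing the Vec3 and Light types from the snippet above, with a hypothetical brdf function standing in for f (a constant Lambertian lobe here, just so it compiles):

```cpp
// Hypothetical BRDF; a constant Lambertian lobe for illustration only.
float brdf(Vec3 /*wi*/, Vec3 /*wo*/, Vec3 /*n*/) { return 1.0f / 3.14159265f; }

// Discrete rendering equation for point lights only:
// Lo(wo) = Le(wo) + Σ f(wi, wo) * (wi . n) * Li(wi)
float outgoingRadiance(Vec3 P, Vec3 n, Vec3 wo, float Le,
                       const std::vector<Light>& lights) {
    float Lo = Le; // emissive term
    for (const Light& light : lights) {
        Vec3 wi = normalize(light.position - P);
        float cosTheta = std::max(dot(wi, n), 0.0f);
        Lo += brdf(wi, wo, n) * cosTheta * light.intensity; // Li from this light
    }
    return Lo;
}
```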
The main issue, however, I suspect is simply that lighting in a ray-traced scene is typically done in a linear space with no upper bound, meaning that light will not always stay in the range 0-1, as you have observed. Lights can be represented with values much greater than 1 to differentiate, for example, the sun from a simple desk light; in your case it is likely the combination of many small lights that pushes the values on the surface well above 1. While this is not a problem during rendering (and in fact must be this way for proper results), it is a problem when you finally present the image, as monitors only accept 8-bit color per channel (or 10-bit in the case of more modern HDR display devices), meaning you must somehow represent the entire floating-point range of radiance in your scene within a much more limited integer range.
This process of going from HDR to LDR is typically done through tonemapping, which is effectively an operation that squeezes the range of values down into something presentable in an "intelligent" way, whatever that may be. Various factors can be incorporated into tonemapping, such as exposure, which can be derived from physically simulated camera properties like shutter speed, aperture and ISO (as we are used to how cameras capture the world in movies and photographs), crudely approximated as many video games do, or ignored altogether. Additionally, the curve and "style" of the tonemapping operation are completely subjective, usually selected based on what looks appropriate for the content in question, or chosen specifically by artists in the case of something like a movie or a video game. This means you can pretty much choose whatever looks best to you, as there is no one right answer (it is usually based on the S-shaped curve film exhibits, due to the widespread use of cameras in media).
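As an illustration of the physically based route, one common approximation derives a linear exposure multiplier from the camera settings; the constants here follow the Frostbite course notes by Lagarde and de Rousiers, so treat the exact values as one typical choice rather than gospel:

```cpp
#include <cmath>

// Exposure multiplier from camera settings: aperture is the f-number,
// shutterTime is in seconds, iso is the film speed. The 1.2 calibration
// constant is a common convention, not a universal one.
float exposureFromCamera(float aperture, float shutterTime, float iso) {
    float ev100 = std::log2(aperture * aperture / shutterTime * 100.0f / iso);
    float maxLuminance = 1.2f * std::pow(2.0f, ev100);
    return 1.0f / maxLuminance; // scale scene radiance by this before tonemapping
}
```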
Even after the range of values has been converted to something more reasonable for display output, the color space may still be incorrect, and depending on how you are writing to the display device, the values may need to be gamma corrected by putting them through an OETF (opto-electronic transfer function), which encodes the output as an electronic signal for the monitor. Typically you do not have to worry about color space, as most people work on sRGB monitors (a slight variant of Rec. 709) and use that space directly in their applications, so unless you go out of your way to make your ray tracer use something else, it is nothing to worry about. Gamma correction, on the other hand, typically must be done, as the default framebuffer in APIs such as OpenGL, Direct3D or Vulkan is usually in gamma space already (whereas, as mentioned before, lighting math is done in linear space), though if you're outputting to something such as an image file it might not be needed, depending on the format.
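For reference, the actual sRGB OETF is a piecewise curve rather than a plain power function; the linear toe near black is what the common 2.2 approximation mentioned below glosses over:

```cpp
#include <cmath>

// Exact sRGB OETF: linear [0, 1] in, gamma-encoded [0, 1] out.
float linearToSRGB(float x) {
    if (x <= 0.0031308f)
        return 12.92f * x;                              // linear segment near black
    return 1.055f * std::pow(x, 1.0f / 2.4f) - 0.055f;
}
```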
In summary, you pretty much just need to apply a tonemapping operator and potentially gamma correct your final color output to get something looking reasonably correct. If you need a quick and dirty operator you can try x / (x + 1) (otherwise known as Reinhard tonemapping), where x is the output from the ray tracing. You can also multiply the input to this function by some arbitrary constant for a simple "exposure" adjustment if the output is too dark. Finally, if your output device expects something in gamma space, you can take the tonemapped output and apply the function x^(1.0 / 2.2) to it (note this is a slight simplification of the proper sRGB OETF, but it's fine to use as long as you keep that in mind); again, if you're outputting to an image this is not always needed, but it is still something to keep in mind. Another thing to note is that tonemapping will typically output in the range 0-1, so if you need to convert to 8-bit integers by multiplying by 255 (or whatever an output image format may expect), do so after everything else rather than before, as multiplying earlier will do little except make the scene look a lot brighter.
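Put together, the whole output stage per color channel might look like the following sketch, with exposure being whatever constant or camera-derived value you settle on:

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// HDR radiance in, displayable 8-bit value out; apply per color channel.
std::uint8_t toDisplay(float radiance, float exposure) {
    float x = radiance * exposure;                 // optional exposure adjustment
    float mapped = x / (x + 1.0f);                 // Reinhard tonemapping, now in [0, 1]
    float encoded = std::pow(mapped, 1.0f / 2.2f); // skip if the target is linear
    // Quantize to 8 bits last, after tonemapping and gamma.
    return static_cast<std::uint8_t>(std::min(encoded, 1.0f) * 255.0f + 0.5f);
}
```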
I'd also like to mention that if you ever plan on developing this ray tracer further into something more advanced such as a path tracer, a Phong lighting model will not be sufficient due to its violation of the energy conservation expected by the rendering equation. Many BRDFs exist, including a relatively simple Phong-based one (with some slight modifications to get it to behave properly), so such a change would not require much additional code but would improve the visual fidelity of the renderer and make it more future proof if more complex behavior is ever implemented.
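For reference, the usual energy-conserving fix is the modified Phong BRDF of Lafortune and Willems, which only adds normalization factors to the terms you already compute; a sketch reusing the earlier hypothetical types (energy conservation then requires kd + ks <= 1):

```cpp
// Modified (normalized) Phong BRDF: the 1/pi and (n + 2)/(2*pi) factors make
// the diffuse and specular lobes each integrate to at most kd and ks.
float modifiedPhongBRDF(Vec3 wi, Vec3 wo, Vec3 n, const Material& m) {
    const float pi = 3.14159265f;
    Vec3 R = reflect(wi * -1.0f, n); // mirror reflection of wi about n
    float cosAlpha = std::max(dot(R, wo), 0.0f);
    return m.kd / pi
         + m.ks * (m.shininess + 2.0f) / (2.0f * pi)
                * std::pow(cosAlpha, m.shininess);
}
```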