If you look at the general definition of the Phong reflection model, it is usually written as a summation over all the light sources of the form:
Σ kd * (L . N) * i + ks * (R . V)^a * i
In this case kd and ks are the diffuse and specular constants, L, N and R are the relevant vectors, a is the shininess and i is the intensity of the current incoming light. Your version is slightly different: since the constants do not depend on the light, you can split the summation and move them out as you did, but this is the less common way to write it:
kd * Σ ((L . N) * i) + ks * Σ ((R . V)^a * i)
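Written as code, the per-light form is usually the more natural structure anyway, since each light's diffuse and specular terms are evaluated together. Here is a minimal sketch of that loop; the Vec3, Light and Material types are hypothetical stand-ins for whatever your renderer uses:

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Minimal vector type for illustration; substitute your renderer's own.
struct Vec3 {
    float x, y, z;
    Vec3 operator+(Vec3 o) const { return {x + o.x, y + o.y, z + o.z}; }
    Vec3 operator-(Vec3 o) const { return {x - o.x, y - o.y, z - o.z}; }
    Vec3 operator*(float s) const { return {x * s, y * s, z * s}; }
};
float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
Vec3 normalize(Vec3 v) { return v * (1.0f / std::sqrt(dot(v, v))); }
Vec3 reflect(Vec3 v, Vec3 n) { return v - n * (2.0f * dot(v, n)); }

struct Light    { Vec3 position; float intensity; }; // hypothetical point light
struct Material { float kd, ks, shininess; };        // hypothetical material

// Σ kd * (L . N) * i + ks * (R . V)^a * i, one term per light.
float shade(Vec3 P, Vec3 N, Vec3 V, const Material& m,
            const std::vector<Light>& lights) {
    float result = 0.0f;
    for (const Light& light : lights) {
        Vec3 L = normalize(light.position - P); // direction towards the light
        Vec3 R = reflect(L * -1.0f, N);         // reflection of L about N
        float diff = std::max(dot(L, N), 0.0f);
        float spec = std::pow(std::max(dot(R, V), 0.0f), m.shininess);
        result += (m.kd * diff + m.ks * spec) * light.intensity;
    }
    return result;
}
```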
The reason it is not usually written that way comes down to how the general rendering equation works, which is typically expressed as an integral over the hemisphere at a point:
Lo(wo) = Le(wo) + ∫ f(wi, wo) * (wi . n) * Li(wi) dwi
In this case Lo is the outgoing radiance along a direction wo, Le is the emissive contribution in that direction, f is the BRDF, n is the normal of the surface and Li is the incoming radiance contribution from the incoming direction wi (which is what is being integrated over). In practice this is implemented as a summation, showing once more that the lighting contribution at a point in a direction is a kind of weighted sum of individual lighting calculations, one per direction. In the case of a simple renderer with point lights such as yours, this reduces to the sum of each light's contribution, since it is assumed light can only come from a light source and not from the environment itself. This isn't really a problem, but if you ever plan on implementing a more complex lighting model then you'll have to restructure your program a bit.
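In code, the point-light case of that integral collapses to the same kind of loop. Here is a sketch reusing the Vec3 and Light types from the snippet above, with a hypothetical brdf function standing in for f (a constant Lambertian lobe here, just so it compiles):

```cpp
// Hypothetical BRDF; a constant Lambertian lobe for illustration only.
float brdf(Vec3 /*wi*/, Vec3 /*wo*/, Vec3 /*n*/) { return 1.0f / 3.14159265f; }

// Discrete rendering equation for point lights only:
// Lo(wo) = Le(wo) + Σ f(wi, wo) * (wi . n) * Li(wi)
float outgoingRadiance(Vec3 P, Vec3 n, Vec3 wo, float Le,
                       const std::vector<Light>& lights) {
    float Lo = Le; // emissive term
    for (const Light& light : lights) {
        Vec3 wi = normalize(light.position - P);
        float cosTheta = std::max(dot(wi, n), 0.0f);
        Lo += brdf(wi, wo, n) * cosTheta * light.intensity; // Li from this light
    }
    return Lo;
}
```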
The main issue, however, I suspect is simply that lighting in a ray-traced scene is typically done in a linear space with no upper bound, meaning that light will not always stay in the range 0-1, as you have observed. Lights can be represented with values much greater than 1 to differentiate, for example, the sun from a simple desk light; in your case it is likely the combination of many small lights that pushes the values on the surface well above 1. While this is not a problem during rendering (and in fact must be this way for proper results), it is a problem when you finally present the image, as monitors only accept 8-bit color per channel (or 10-bit in the case of more modern HDR display devices), meaning you must somehow represent the entire floating-point range of radiance in your scene within a much more limited integer range.
This process of going from HDR to LDR is typically done through tonemapping, which is effectively an operation that squeezes the range of values down into something presentable in an "intelligent" way, whatever that may be. Various factors can be incorporated into tonemapping, such as exposure, which can be derived from physically simulated camera properties like shutter speed, aperture and ISO (as we are used to how cameras capture the world in movies and photographs), crudely approximated as many video games do, or ignored altogether. Additionally, the curve and "style" of the tonemapping operation are completely subjective, usually selected based on what looks appropriate for the content in question, or chosen specifically by artists in the case of something like a movie or a video game. This means you can pretty much choose whatever looks best to you, as there is no one right answer (it is usually based on the S-shaped curve film exhibits, due to the widespread use of cameras in media).
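As an illustration of the physically based route, one common approximation derives a linear exposure multiplier from the camera settings; the constants here follow the Frostbite course notes by Lagarde and de Rousiers, so treat the exact values as one typical choice rather than gospel:

```cpp
#include <cmath>

// Exposure multiplier from camera settings: aperture is the f-number,
// shutterTime is in seconds, iso is the film speed. The 1.2 calibration
// constant is a common convention, not a universal one.
float exposureFromCamera(float aperture, float shutterTime, float iso) {
    float ev100 = std::log2(aperture * aperture / shutterTime * 100.0f / iso);
    float maxLuminance = 1.2f * std::pow(2.0f, ev100);
    return 1.0f / maxLuminance; // scale scene radiance by this before tonemapping
}
```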
Even after the range of values has been converted to something more reasonable for display output, the color space may still be incorrect, and depending on how you are writing to the display device, the values may need to be gamma corrected by putting them through an OETF (opto-electronic transfer function), which encodes the output as an electronic signal for the monitor. Typically you do not have to worry about color space, as most people work on sRGB monitors (a slight variant of Rec. 709) and use that space directly in their applications, so unless you go out of your way to make your ray tracer use something else, it is nothing to worry about. Gamma correction, on the other hand, typically must be done, as the default framebuffer in APIs such as OpenGL, Direct3D or Vulkan is usually in gamma space already (whereas, as mentioned before, lighting math is done in linear space), though if you're outputting to something such as an image file it might not be needed, depending on the format.
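For reference, the actual sRGB OETF is a piecewise curve rather than a plain power function; the linear toe near black is what the common 2.2 approximation mentioned below glosses over:

```cpp
#include <cmath>

// Exact sRGB OETF: linear [0, 1] in, gamma-encoded [0, 1] out.
float linearToSRGB(float x) {
    if (x <= 0.0031308f)
        return 12.92f * x;                              // linear segment near black
    return 1.055f * std::pow(x, 1.0f / 2.4f) - 0.055f;
}
```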
In summary, you pretty much just need to apply a tonemapping operator and potentially gamma correct your final color output to get something looking reasonably correct. If you need a quick and dirty operator you can try x / (x + 1) (otherwise known as Reinhard tonemapping), where x is the output from the ray tracing. You can also multiply the input to this function by some arbitrary constant for a simple "exposure" adjustment if the output is too dark. Finally, if your output device expects something in gamma space, you can take the tonemapped output and apply the function x^(1.0 / 2.2) to it (note this is a slight simplification of the proper sRGB OETF, but it's fine to use as long as you keep that in mind); again, if you're outputting to an image this is not always needed, but it is still something to keep in mind. Another thing to note is that tonemapping will typically output in the range 0-1, so if you need to convert to 8-bit integers by multiplying by 255 (or whatever an output image format may expect), do so after everything else rather than before, as multiplying earlier will do little except make the scene look a lot brighter.
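Put together, the whole output stage per color channel might look like the following sketch, with exposure being whatever constant or camera-derived value you settle on:

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// HDR radiance in, displayable 8-bit value out; apply per color channel.
std::uint8_t toDisplay(float radiance, float exposure) {
    float x = radiance * exposure;                 // optional exposure adjustment
    float mapped = x / (x + 1.0f);                 // Reinhard tonemapping, now in [0, 1]
    float encoded = std::pow(mapped, 1.0f / 2.2f); // skip if the target is linear
    // Quantize to 8 bits last, after tonemapping and gamma.
    return static_cast<std::uint8_t>(std::min(encoded, 1.0f) * 255.0f + 0.5f);
}
```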
I'd also like to mention that if you ever plan on developing this ray tracer further into something more advanced such as a path tracer, a Phong lighting model will not be sufficient due to its violation of the energy conservation expected by the rendering equation. Many BRDFs exist, including a relatively simple Phong-based one (with some slight modifications to get it to behave properly), so such a change would not require much additional code but would improve the visual fidelity of the renderer and make it more future proof if more complex behavior is ever implemented.
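For reference, the usual energy-conserving fix is the modified Phong BRDF of Lafortune and Willems, which only adds normalization factors to the terms you already compute; a sketch reusing the earlier hypothetical types (energy conservation then requires kd + ks <= 1):

```cpp
// Modified (normalized) Phong BRDF: the 1/pi and (n + 2)/(2*pi) factors make
// the diffuse and specular lobes each integrate to at most kd and ks.
float modifiedPhongBRDF(Vec3 wi, Vec3 wo, Vec3 n, const Material& m) {
    const float pi = 3.14159265f;
    Vec3 R = reflect(wi * -1.0f, n); // mirror reflection of wi about n
    float cosAlpha = std::max(dot(R, wo), 0.0f);
    return m.kd / pi
         + m.ks * (m.shininess + 2.0f) / (2.0f * pi)
                * std::pow(cosAlpha, m.shininess);
}
```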