Luminance (somewhat analogous to radiance, which I’ve discussed before) is a funny thing; it and its effects on human colour perception have been discussed at length in academic circles. It thankfully has an SI unit (the candela per square metre), which can ground us in beginning discussions on how to interpret image data. The benefits of using SI units, as discussed before on this blog, include being able to conduct lab tests with an absolute unit against which to control experiments.
Luminance has its issues, however: because it is a measure of power, it is altogether separate from any chromatic information contained in a pixel. This is summed up neatly by the Helmholtz-Kohlrausch effect. For typical RGB-to-gray conversions, colours which are very clearly distinct to a human observer can be mapped to the same grey value (and hence the same histogram bin) simply because they share the same luminance (Figure 1, colour image above the line, its grayscale conversion below it).
Recently, we’ve been seeing more discussion of these issues as multi-view stereo becomes more popular thanks to its increasing accessibility. Within certain software packages (such as VisualSfM, MicMac) we see the standard NTSC mapping being used, typified by a linear conversion:

Y = 0.299 R + 0.587 G + 0.114 B
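To make the collapse concrete, here is a small sketch of that linear conversion. The colour values are invented for illustration; the second colour’s blue channel is solved so that its luma matches the first exactly, showing how two plainly different colours end up on the same grey value:

```python
import numpy as np

# Standard NTSC / ITU-R BT.601 luma weights
NTSC_WEIGHTS = np.array([0.299, 0.587, 0.114])

def rgb_to_gray(rgb):
    """Linear luma: Y = 0.299 R + 0.587 G + 0.114 B.

    rgb: array of shape (..., 3); returns an array of shape (...).
    """
    return np.asarray(rgb) @ NTSC_WEIGHTS

# Two clearly different colours that collapse to the same grey value.
# (Illustrative values; c2's blue channel starts at 0 and is solved
# so its luma matches c1 exactly.)
c1 = np.array([200.0, 80.0, 50.0])
c2 = np.array([60.0, 140.0, 0.0])
c2[2] = (rgb_to_gray(c1) - rgb_to_gray(c2)) / NTSC_WEIGHTS[2]

print(rgb_to_gray(c1), rgb_to_gray(c2))  # identical luma, distinct colours
```

The same `rgb_to_gray` call works unchanged on a whole H×W×3 image array, which is essentially what the packages above do.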
There are many sensible reasons why this is used, mainly grounded in tri-stimulus colour theory with regards to human vision. But the fact that it can map different colours to the same luminance points towards an area which potentially hasn’t been optimized in stereo setups, though it is frequently discussed (Benedetti et al. give a good summary). If we encounter isoluminant surfaces while performing any sort of sparse or dense stereo matching, we may get into trouble.
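A toy sketch of why isoluminant surfaces cause that trouble: a patch whose texture lives entirely in chrominance becomes featureless after the luma conversion, so a variance-normalised matching cost (e.g. NCC) has nothing to lock onto. The colours and patch size below are invented for illustration:

```python
import numpy as np

W = np.array([0.299, 0.587, 0.114])  # NTSC luma weights

# Two isoluminant colours: b's blue channel is solved so its luma
# equals a's (b's blue starts at an arbitrary placeholder).
a = np.array([180.0, 60.0, 100.0])
b = np.array([60.0, 120.0, 0.0])
b[2] = (a @ W - b[:2] @ W[:2]) / W[2]

# A 5x5 checkerboard of the two colours: strong texture in RGB,
# none in grayscale.
mask = np.indices((5, 5)).sum(axis=0) % 2
patch_rgb = np.where(mask[..., None] == 1, a, b)
patch_gray = patch_rgb @ W

print(patch_rgb.std())   # large: the colour texture is real
print(patch_gray.std())  # ~0: the grey patch is flat, NCC is ill-defined
```

With zero grayscale variance, any normalised correlation score between this patch and a candidate window is undefined or meaningless, which is exactly the matching failure mode discussed above.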
To remedy this, many scientists have tried to make better use of chrominance information in these conversions (see Grundland, 2005 for one example, or this page for a summary of various work). This is easier said than done: in the challenging conditions where it is particularly needed, such as a bright scene with multiple isoluminant colours, squeezing the extra information into a limited number of histogram bins can be tricky.
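To give a flavour of the general idea (this is not Grundland’s actual algorithm, just a toy sketch): fold a little chrominance contrast back into the grey channel, so isoluminant colours separate. The saturation proxy and the blend weight `alpha` are arbitrary choices of mine:

```python
import numpy as np

W = np.array([0.299, 0.587, 0.114])  # NTSC luma weights

def gray_with_chroma(rgb, alpha=0.3):
    """Toy chrominance-augmented conversion (illustrative only).

    Blends luma with a crude saturation proxy (max - min channel) so
    that isoluminant-but-differently-saturated colours no longer
    collapse to the same grey value. alpha is an arbitrary weight.
    """
    rgb = np.asarray(rgb, dtype=float)
    luma = rgb @ W
    chroma = rgb.max(axis=-1) - rgb.min(axis=-1)
    return (1 - alpha) * luma + alpha * chroma

# Two isoluminant colours a pure-luma conversion cannot separate
# (c2's blue channel starts at 0 and is solved to match c1's luma):
c1 = np.array([200.0, 80.0, 50.0])
c2 = np.array([60.0, 140.0, 0.0])
c2[2] = (c1 @ W - c2 @ W) / W[2]

print(gray_with_chroma(c1), gray_with_chroma(c2))  # now distinct
```

Note the catch echoed above: the augmented output no longer sits neatly in the original [0, 255] range, so rescaling it back into a limited number of histogram bins eats into the very contrast you just recovered.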
For multi-view stereo, there is another consideration which hasn’t been as widely discussed. Non-linear conversions can be very useful, but once they are applied across an image block they may yield inconsistent results depending on the image content. I can imagine image blocks where this would be particularly apparent, for example when a dramatic colour change spans several images but is absent from the rest of the block. So when considering multi-view stereo with conversions in which colour balance is lost, my gut says there’s a chance we’re over-processing.
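A toy example of that consistency worry: suppose each image in a block is converted with a content-adaptive mapping (here, a per-image min-max stretch of the luma, purely for illustration). The same physical colour then lands on different grey values in different views, which is exactly what photo-consistency-based matching does not want:

```python
import numpy as np

W = np.array([0.299, 0.587, 0.114])  # NTSC luma weights

def adaptive_gray(img):
    """Toy content-adaptive conversion: luma stretched per image.

    Because the stretch depends on each image's own min/max, the same
    RGB value maps to different grey values in different images.
    """
    luma = img @ W
    return (luma - luma.min()) / (luma.max() - luma.min())

shared = np.array([120.0, 90.0, 60.0])  # a colour seen in both views

# View A: the shared colour sits in a dark scene; view B in a bright one.
view_a = np.stack([shared, np.array([10.0, 10.0, 10.0])])
view_b = np.stack([shared, np.array([250.0, 250.0, 250.0])])

g_a = adaptive_gray(view_a)[0]
g_b = adaptive_gray(view_b)[0]
print(g_a, g_b)  # the same colour, two very different grey values
```

A fixed linear conversion would map `shared` identically in both views; the content-adaptive one trades that consistency for per-image contrast, which is the over-processing risk raised above.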
I’ve been considering this for several months now, and as claims of improvements in the process begin to appear, I’ll be sure to keep reading and give my two cents on which methods I think are best. Stay tuned for more!