This post comes from a place quite close to my research field, and is honestly a response to a growing concern about the lack of standardization of imagery within it. I have mentioned on this blog before how we can improve reporting on geoscientific imagery, how we can incorporate concepts like MTF into that reporting, and the importance of moving towards open-source image sets within environmental research. I have grown envious again of the satellite image users, who draw from public data sources – their data can be accessed by anyone!
When you generate your own imagery, things become a little trickier, particularly when you may have taken thousands of images and can’t report on them all within the bounds of a scientific text, or don’t have the capacity to host them for distribution from a public server. Producing a short, snappy metadata summary of the quality of each image would go a long way towards addressing this, as something like that is easily included within supplementary materials.
Whilst researchers would ideally include some control images of, for example, an ISO chart under specific lighting with the settings to be used before a survey, this is massively impractical. The silver bullet for this whole issue would be an objective image quality metric that could score any image, independent of the equipment used and without any reference imagery/ground truth to compare it to (No Reference). This metric would need to account for image sharpness, exposure, distortions and focus, which makes the whole thing phenomenally complicated, particularly where these factors interact.
My approach to big automation problems has changed in recent years, in large part due to my increasing knowledge of image processing. One thing we do know is that it’s easy to tell a poor-quality image from a good-quality image, and we can express the reasons why in logical statements – plenty for a computer scientist to get on with! A functioning, easy-to-use NRIQA algorithm would be useful far beyond the bounds of the geosciences, so research in the field is very active. In this blog post, I’ll look at an approach which is a common starting point.
Natural image statistics
Antonio Torralba’s paper ‘Statistics of natural image categories’ gave me a great deal of insight into what to consider when thinking about image quality metrics; I happened upon it after seeing a different piece of his work cited in a paper I was reading. I recommend reading through it if you want some cutting insight into the really basic ideas of how we distinguish image categories. Image gradients are king, and have always been a key part of image understanding.
His work led me to Chen and Bovik’s paper, which has a very elegant paragraph/figure in its introductory section highlighting how useful gradient analysis can be. They use images from the LIVE database, which I hadn’t come across previously and which has proven an interesting resource.
They point out that, in general, blurred images do not contain sharp edges – sharp images will therefore retain a higher proportion of high-frequency gradient information (that is, regions where neighbouring pixels vary by larger amounts). To demonstrate this, I’ve taken an image from the Middlebury stereo dataset and produced gradient distributions for both the original and an artificially blurred version – we can see the effect in the same way that Chen and Bovik demonstrate!
Out of curiosity, I added a noise-degraded version, and we can see that noise has the opposite effect on gradients. I suspect that, in this basic case, sharp and noisy images would be hard to distinguish. Whilst I produced some versions which were both noisy and blurry, the noise dominates the signal and causes the flattening effect seen in the noisy line of the figure.
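The effect is easy to reproduce with nothing but NumPy. Below is a minimal sketch: the blur kernel size, noise level and gradient threshold are arbitrary illustrative choices of mine, and I use a synthetic blocky image rather than the Middlebury one so the snippet is self-contained. Blurring strips the high-frequency tail of the gradient-magnitude distribution, while additive noise inflates it.

```python
import numpy as np

def gradient_magnitude(img):
    """First-difference gradients in x and y, combined as a magnitude."""
    gx = np.diff(img, axis=1)[:-1, :]   # horizontal differences
    gy = np.diff(img, axis=0)[:, :-1]   # vertical differences
    return np.hypot(gx, gy)

def box_blur(img, k):
    """Crude k-by-k box blur (edge-padded) -- enough for a demonstration."""
    pad = np.pad(img, k // 2, mode="edge")
    out = np.zeros_like(img, dtype=float)
    for dy in range(k):
        for dx in range(k):
            out += pad[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (k * k)

def high_freq_fraction(img, threshold=60.0):
    """Fraction of gradient magnitudes above an (arbitrary) threshold."""
    return float((gradient_magnitude(img) > threshold).mean())

rng = np.random.default_rng(0)
# Synthetic 'sharp' image: 16x16 random black/white blocks, upsampled 8x.
sharp = np.kron(rng.integers(0, 2, (16, 16)) * 255.0, np.ones((8, 8)))
blurred = box_blur(sharp, 7)
noisy = sharp + rng.normal(0.0, 25.0, sharp.shape)

# Expect: blurred < sharp < noisy -- blur removes the tail, noise inflates it.
print(high_freq_fraction(blurred), high_freq_fraction(sharp),
      high_freq_fraction(noisy))
```

In a real pipeline you would of course plot the full distributions rather than reduce them to one tail statistic, but the single number already separates the three degradation cases cleanly.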
This is a useful insight that’s quick and easy to demonstrate – a good starting point for this type of analysis. They go on to develop an SVM model trained on sharp and blurry images using similar logic, with what look like some promising results. Within a block of survey images, we could use this approach to separate out gradient outliers we suspect might be blurry. This would be massively convenient for users, and would doubly ensure some modicum of quality control.
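Chen and Bovik’s actual model is an SVM over richer gradient features; as a toy stand-in for the same idea, the sketch below trains a one-feature threshold classifier (the decision boundary is placed midway between the class means) on synthetic sharp and blurred images. Everything here – the feature, the boundary rule, the synthetic data – is my own illustrative choice, not their method.

```python
import numpy as np

def grad_tail_fraction(img, threshold=60.0):
    """Feature: fraction of gradient magnitudes above a fixed threshold."""
    gx = np.diff(img, axis=1)[:-1, :]
    gy = np.diff(img, axis=0)[:, :-1]
    return float((np.hypot(gx, gy) > threshold).mean())

def box_blur(img, k=7):
    """Simple edge-padded box blur used to synthesise the 'blurry' class."""
    pad = np.pad(img, k // 2, mode="edge")
    out = np.zeros_like(img, dtype=float)
    for dy in range(k):
        for dx in range(k):
            out += pad[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (k * k)

rng = np.random.default_rng(1)

def random_scene():
    """Synthetic stand-in for a survey image: random 0/255 blocks."""
    return np.kron(rng.integers(0, 2, (16, 16)) * 255.0, np.ones((8, 8)))

# Labelled 'training' data: sharp scenes vs blurred scenes.
sharp_feats = [grad_tail_fraction(random_scene()) for _ in range(20)]
blurry_feats = [grad_tail_fraction(box_blur(random_scene())) for _ in range(20)]

# Decision boundary midway between class means -- a toy stand-in for an SVM.
boundary = (np.mean(sharp_feats) + np.mean(blurry_feats)) / 2.0

def looks_sharp(img):
    return grad_tail_fraction(img) > boundary

print(looks_sharp(random_scene()))            # expect True
print(looks_sharp(box_blur(random_scene())))  # expect False
```

On real survey imagery the two classes overlap far more than these synthetic extremes, which is exactly why a proper SVM over several gradient statistics, rather than one thresholded feature, earns its keep.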
Perhaps, if we were to cheat a bit, reference images (high quality, from a curated database, scaled to be appropriate for the survey) of the type of environment being investigated could be used for a quick, gradient-based quality comparison in surveys. One could then fold global image information, such as the histogram mean, into the metric – for a well-exposed image this should sit somewhere near the centre of the pixel-value range.
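A minimal sketch of both ideas, again on synthetic data: a gradient-histogram intersection against a reference image (1.0 means identical distributions), plus a mid-range check on the global mean. The 0.35–0.65 “well exposed” band, the bin count and the blocky stand-in scenes are all arbitrary assumptions of mine for illustration.

```python
import numpy as np

def gradient_histogram(img, bins=32):
    """Normalised histogram of first-difference gradient magnitudes."""
    gx = np.diff(img, axis=1)[:-1, :]
    gy = np.diff(img, axis=0)[:, :-1]
    h, _ = np.histogram(np.hypot(gx, gy), bins=bins, range=(0.0, 361.0))
    return h / h.sum()

def gradient_similarity(img, reference):
    """Histogram intersection: 1.0 = identical gradient distributions."""
    return float(np.minimum(gradient_histogram(img),
                            gradient_histogram(reference)).sum())

def well_exposed(img, lo=0.35, hi=0.65):
    """Mean pixel value near mid-range; the band is an arbitrary choice."""
    return lo <= img.mean() / 255.0 <= hi

def box_blur(img, k=7):
    pad = np.pad(img, k // 2, mode="edge")
    out = np.zeros_like(img, dtype=float)
    for dy in range(k):
        for dx in range(k):
            out += pad[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (k * k)

rng = np.random.default_rng(2)
blocks = lambda: np.kron(rng.integers(0, 2, (16, 16)) * 255.0, np.ones((8, 8)))

reference = blocks()               # stand-in for a curated reference image
survey_ok = blocks()               # a 'good' survey image of the same scene type
survey_blurred = box_blur(blocks())

# The sharp survey image should resemble the reference far more closely.
print(gradient_similarity(survey_ok, reference) >
      gradient_similarity(survey_blurred, reference))  # expect True
print(well_exposed(survey_ok))                         # expect True
```

Real reference images would of course need careful scaling to match the survey’s ground sampling distance, as noted above – a mismatched scale shifts the gradient distribution on its own.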
This is a crude starting point perhaps, but a starting point nonetheless, and an area I hope geoscientists using images will pay more attention to in the near future. Quality matters!