Chroma

I’ve been neglecting this blog of late, partly because I’ve been ill and partly because I’ve been focusing my writing efforts elsewhere, but I thought it was about time I put something up. Followers might remember that last year at EGU I presented a poster detailing the results of an investigation into how the choice of greyscale input channel affects Structure-from-Motion (SfM) photogrammetric blocks. Whilst the results showed only very slight differences, I didn’t present one interesting, subtle effect, which shows how robust the process is to differences within images.

Within the SfM process, camera parameters which correct for distortions in the lens are fitted, and these can subsequently be extracted for separate analysis. Returning to the greyscaling theme for inclusion in my final thesis, I have been pulling out the lens model for each block, and noticed that the focal length fitted to each block changes subtly, but in a manner we might expect.

Chromatic aberration

Chromatic aberration is caused by the refractive index of the lens glass differing between light of different wavelengths, which means the focal point of the image formed at each wavelength is slightly different. Thus, in colour images and in other optical equipment (I remember seeing it in many different sets of binoculars), we can see colour fringing around the edges of high-contrast features.

[Figure: Chromatic aberration seen at the front (red fringe) and back (green fringe) of the candle]

Within photogrammetric blocks built from a single channel, we might expect the focal length to be optimised specifically for that colour as it interacts with the particular lens being used. Indeed, this is demonstrable in the tests I have run on an RGB image set collected at a cliff near Hunstanton, UK – we see a slight lengthening of the fitted focal length as more of the red channel is introduced into the image block, accounting for its interaction with the lens.

[Figure: Self-calibrating bundle adjustment fits longer focal lengths to greyscale bands containing a greater proportion of the red channel from an RGB image. The colours of the plotted points represent the RGB combination each greyscale photogrammetric block was derived from; the larger circles represent the pure red, green and blue channels.]
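
To make the experimental set-up concrete, below is a minimal sketch of how a weighted greyscale band can be derived from an RGB image before it goes into the photogrammetric software. The weights and file names are illustrative, not the exact combinations used in my tests.

```python
# Sketch: derive a weighted greyscale band from an RGB image prior to SfM
# processing. Weights and file names are illustrative only.
import cv2
import numpy as np

def weighted_greyscale(path, w_r, w_g, w_b):
    """Combine the R, G and B channels with the given weights (summing to 1)."""
    b, g, r = cv2.split(cv2.imread(path).astype(np.float32))
    grey = w_r * r + w_g * g + w_b * b
    return np.clip(grey, 0, 255).astype(np.uint8)

# A red-heavy band versus a blue-heavy band for the same source image
cv2.imwrite("band_red_heavy.png", weighted_greyscale("input.jpg", 0.8, 0.1, 0.1))
cv2.imwrite("band_blue_heavy.png", weighted_greyscale("input.jpg", 0.1, 0.1, 0.8))
```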

Whilst this might be expected, I was surprised by how clear the trend is, and it’s a testament to how sensitive SfM is at picking up even small changes in image blocks. Watch this space for more insight into what this means for assessing the quality of images going into SfM procedures, and how we might gain intuition into image quality as a result of this trend!


Photogrammetry rules of thumb

I’ve uploaded a CloudCompare file of some fieldwork I did last year to my website here. It uses the UK national LiDAR inventory data mentioned in the post here. I think it illustrates many of the fundamentals discussed here, and is a good starting point for thinking about network design.

80% overlap

This dates way back, and I’m unsure where I heard it first, but 80% overlap between images in a photogrammetric block with a nadir viewing geometry is an old rule of thumb from aerial imaging (here’s a quick example I found from 1955), and it carries through to SfM surveying. I think it should be the first port of call for amateurs doing surveys of surfaces, as it’s very easy to jot down an estimate before undertaking a survey. For this, we need only consider camera positions orthogonal to the surface normal (see this post) and estimate a ground sample distance to work out the camera spacing from there, as in the sketch below.
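
As a back-of-the-envelope example, here is a minimal sketch of the kind of pre-survey estimate I mean, assuming a simple fronto-parallel geometry; the camera numbers are illustrative.

```python
# Sketch: estimate the camera spacing needed for ~80% overlap, assuming a
# simple fronto-parallel (nadir-style) geometry. Numbers are illustrative.

def ground_sample_distance(distance_m, focal_mm, pixel_pitch_um):
    """GSD (m/pixel) = distance x pixel pitch / focal length."""
    return distance_m * (pixel_pitch_um * 1e-6) / (focal_mm * 1e-3)

def camera_spacing(distance_m, focal_mm, pixel_pitch_um, image_width_px, overlap=0.8):
    """Spacing between exposures so adjacent images share `overlap` of their footprint."""
    gsd = ground_sample_distance(distance_m, focal_mm, pixel_pitch_um)
    footprint_m = gsd * image_width_px        # ground width covered by one image
    return footprint_m * (1.0 - overlap)      # advance 20% of the footprint per shot

# e.g. 30 m to the cliff, 24 mm lens, 6 um pixels, 6000 px wide sensor -> ~9 m spacing
print(camera_spacing(30.0, 24.0, 6.0, 6000))
```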

1:1000 rule

This has been superseded in recent years, but it is still a decent rule of thumb for beginners in photogrammetry. It says that, in general (very general!), the surface precision of a photogrammetric block will be around 1/1000th of the distance to the surface. Thus, if we are imaging a cliff face from 30 m away, we can realistically expect precision of around 3 cm on that cliff. This is very useful, especially if you know the required accuracy of the survey beforehand. It is also a more stable starting point than GSD, whose quality as a metric can vary widely depending on your camera selection.

Convergent viewing geometry

Multi-angular data is intuitively desirable to gather, and whilst the additional views bring additional processing considerations, recently published literature suggests that adding them has the secondary effect of mitigating systematic errors within photogrammetric bundles. Thus, when imaging a surface, try to add cameras at angles offset from the surface normal in order to build a ‘strong’ imaging network and avoid systematic error creeping in.

Shoot in RAW where possible

Whilst maybe unnecessary for many applications, RAW images allow the user to capture a much greater range of colour within an image, owing to the fact that values are recorded at 12/14 bits per channel rather than the 8 bits of JPEG images. Added to this, JPEG compression can impact the quality of the derived 3D point clouds, so using uncompressed images is advised.
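
If you do shoot RAW, a minimal conversion sketch like the one below keeps the full bit depth through to the photogrammetric software; it assumes the rawpy and imageio packages, and the file name is illustrative.

```python
# Sketch: develop a RAW file to a 16-bit TIFF so the extra bit depth survives,
# assuming the rawpy and imageio packages (file name is illustrative).
import rawpy
import imageio

with rawpy.imread("DSC00001.ARW") as raw:
    rgb16 = raw.postprocess(
        output_bps=16,        # 16 bits per channel rather than 8
        no_auto_bright=True,  # avoid in-camera-style auto brightening
        use_camera_wb=True,   # use the white balance recorded at capture
    )

imageio.imwrite("DSC00001_16bit.tiff", rgb16)
```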

Mind your motion

Whilst the name SfM suggests that the camera is moving, we need to bear in mind that moving cameras are subject to motion blur, and this is sometimes difficult to detect, especially when shooting in tough conditions where you can’t afford to check previews. Thus, you can pre-calculate a reasonable top speed for the camera to be moving at, and stick to it. Given the literature and as advised by the OS, we recommend a maximum of 1.5 pixels of GSD of blur over the course of each exposure; a quick calculation is sketched below.
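
A minimal sketch of that pre-calculation, assuming the 1.5-pixel blur limit quoted above and illustrative survey values:

```python
# Sketch: maximum platform speed so that motion blur during one exposure stays
# under ~1.5 pixels of GSD. The example values are illustrative.

def max_speed(gsd_m_per_px, shutter_s, blur_limit_px=1.5):
    """Maximum camera speed (m/s) so blur <= blur_limit_px over one exposure."""
    return blur_limit_px * gsd_m_per_px / shutter_s

# e.g. a 1 cm GSD at a 1/500 s shutter allows roughly 7.5 m/s
print(max_speed(0.01, 1 / 500))
```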

Don’t overparameterize the lens model

Very recently, studies have suggested that overparameterizing the lens model, particularly when poorer-quality equipment is being used without good ground control, can lead to a completely unsuitable lens model being fitted, which will impact the quality of the results. The advice: only fit the f, cx, cy, k1 and k2 parameters if you’re unsure of what you’re doing. This is far from the default settings in most software packages!
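
For illustration, a hypothetical sketch of how that restricted parameter set might be requested through the Agisoft Metashape (formerly PhotoScan) Python API is shown below; the flag names can differ between versions, so treat this as an assumption and check your own API reference.

```python
# Hypothetical sketch: fit only f, cx, cy, k1 and k2 via the Agisoft Metashape
# Python API. Flag names may differ between versions; check your API reference.
import Metashape

chunk = Metashape.app.document.chunk  # currently active chunk

chunk.optimizeCameras(
    fit_f=True, fit_cx=True, fit_cy=True,  # focal length and principal point
    fit_k1=True, fit_k2=True,              # first two radial distortion terms
    fit_k3=False, fit_k4=False,            # higher-order radial terms off
    fit_p1=False, fit_p2=False,            # tangential terms off
    fit_b1=False, fit_b2=False,            # affinity/skew terms off
)
```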

Conclusion

I had a few more points on my long list, but for now these six will suffice. Whilst I held back on camera selection here, you can read my previous camera selection post for some insight into what you should be looking for. Hope this helps!

Notre Dame

SfM revisited

Snavely’s 2007 paper was one of the first breakout pieces of research bringing the power of bundle adjustment and self-calibration of unordered image collections to the community. It paved the way for the use of SfM in many other contexts, but I have always appreciated how simple and focused the piece of work was, and how well each step in the process is explained.

[Figure: Reconstruction of Notre Dame from Snavely’s paper]

For this contribution, I had hoped to recreate a figure from this paper, in which the front facade of the Notre Dame cathedral was reconstructed from internet images. I spent last weekend in Paris, so I decided I’d have a go at collecting my own images and pulling them together into a comparable model.

Whilst the doors of the cathedral were not successfully included, due to the hordes of tourists in each image, the final model came out OK and is viewable on my website here.

[Figure: View of the cathedral model on Potree]

HDR stacking

As a second mini-experiment, I thought I’d see how an HDR stack compared with a single exposure from my A7. The dynamic range of the A7, shooting from a tripod at ISO 50, is around 14 EV stops, so I wasn’t expecting a huge amount of dynamic range to fall outside this, though potentially parts of the windows could be retrieved. For the experiment, I used both Hugin’s HDR functionality and a custom Python script using the OpenCV bindings for generating HDR images, which can be downloaded here.
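
For reference, a minimal exposure-fusion sketch along these lines using OpenCV’s Mertens implementation (file names are placeholders, not the exact script linked above) looks something like this:

```python
# Sketch: align a bracketed exposure stack and fuse it with Mertens' method in
# OpenCV. File names are placeholders, not the linked script.
import cv2
import numpy as np

paths = ["facade_-2ev.jpg", "facade_0ev.jpg", "facade_+2ev.jpg"]
images = [cv2.imread(p) for p in paths]

# Median-threshold-bitmap alignment removes small shifts between hand-held frames
cv2.createAlignMTB().process(images, images)

# Mertens exposure fusion needs no camera response recovery or tone mapping
fused = cv2.createMergeMertens().process(images)   # float32, roughly in [0, 1]
cv2.imwrite("facade_mertens.jpg", np.clip(fused * 255, 0, 255).astype(np.uint8))
```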

Results were varied, with really only Mertens’ method of HDR generation showing any notable improvement on the original input.

[Slideshow: HDR outputs from the different merge methods]

Some interesting things happened, including Hugin’s alignment algorithm misaligning the images (or miscalculating the lens distortion) by default to create a bowed-out facade, which was pretty interesting to see! Reading Robertson’s paper, I believe his method was intended more for greyscale images than full colour, but I thought I’d leave the funky result in for completeness.

If we crop into the central stained-glass window we can see some of the fine detail the HDR stacks might be picking up in comparison to the original JPEG.

[Slideshow: crops of the central stained-glass window from each output]

We can see a lot of the finer detail of the famous stained-glass windows revealed by Mertens’ HDR method, which is very cool to see! I’m impressed by just how big the difference is between it and the default out-of-camera JPEG.

Looking at the RAW file from the middle exposure, much of the detail of the stained glass is still there, though it has been clipped in the in-camera JPEG processing.

[Figure: Original image processed from RAW and contrast-boosted, showing fine detail on the stained glass]

It justifies many of the lines of reasoning I’ve presented in the last few contributions on image compression, as these fine details can often reveal features of interest.

I had actually planned to present the results from a different experiment first, though I will be returning to that in a later blog post, as it requires much more explanation and data processing. Watch this space for future contributions from Paris!

Control freak

When initially formulating a research design, I spent much time considering how best to control the experiments I was undertaking. Control, from a geoscientific photogrammetry perspective, can be quite tricky, as the number of settings and pieces of equipment involved means that one can quickly lose the run of oneself.

Research planning

In my limited wisdom during the planning phase, I set out to specify exactly where we would capture imagery from, right down to the OSGB coordinates and orientations of the cameras in the scene, using CloudCompare to help with visualization. I sourced the topographic data from the LiDAR inventory provided by the UK geomatics service, which includes a DEM at 0.5 m resolution.

[Figure: A screenshot showing camera positions from my research plan]

I think this was a very worthwhile task – it was demanding in terms of the skills I needed to use and made me think about how far I could take the experiment in the planning stage. While maybe overkill, I have visions of a near future where one might task a robot with a built-in RTK-GPS to acquire images from these exact positions and orientations daily for a specified period. This would eliminate much of the bias seen in studies done over the same research area but with different equipment and camera network geometries.

You could argue that this is already happening with programmable UAVs, though I haven’t seen anything that practical for a terrestrial scene. This is outside the scope of this post, but did provide motivation for expanding as much as possible in the planning phase.

So, while we might be able to control camera positions and orientations, in the planning phase at least, there are some things we know are absolutely outside our control. The weather is the most obvious one, but with a cavalier attitude I wondered how I might go about controlling that too. This led me to consider the practicalities of simulating the full SfM workflow.

To attempt this, I took a model of Hunstanton which had previously been generated from a reconnaissance mission to Norfolk last May. It had been produced using Agisoft Photoscan and exported as a textured ‘.obj’ file, a format which I wasn’t overly familiar with, but would become so. What followed was definitely an interesting experiment, though I’m willing to admit it probably wasn’t the most productive use of time.

Controlling the weather

Blender is an open-source 3D animation package which I had previously been toying around with for video editing. It struck me that, since Blender has a physically based render engine, there might be reasonable ways of simulating varying camera parameters within a scene, with the lighting provided by a sun that we control.

[Figure: The Hunstanton .obj file, with the Sun included]

So the idea here is to put a sun directly overhead and render some images of the cliff by moving the camera through the scene. For the initial proof of concept I took 5 images along a track, using settings imitating a Nikon D700 with a 24 mm lens, focused to 18 m (the approximate distance to the cliff, measured in CloudCompare), with the shutter speed set to 1/500 s (a stationary camera) and ISO at 200. The aperture was f/8, although diffraction effects can’t be introduced due to limitations in the render engine. The 5 images are displayed below, with the settings from the Physical Camera Python plugin included at the end.
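
For anyone wanting to script something similar, a minimal Blender Python sketch of the idea is below. This is not the Physical Camera plugin; the positions, spacing and output path are illustrative, and the API calls assume Blender 2.8 or later.

```python
# Sketch: add an overhead sun, step a camera along a track and render an image
# at each station. Not the Physical Camera plugin; positions, spacing and the
# output path are illustrative. Assumes the Blender 2.8+ Python API.
import bpy

scene = bpy.context.scene

# A sun directly overhead
bpy.ops.object.light_add(type='SUN', location=(0.0, 0.0, 100.0))

# A camera imitating a 24 mm lens, roughly 18 m back from the cliff face
bpy.ops.object.camera_add(location=(0.0, -18.0, 2.0),
                          rotation=(1.5708, 0.0, 0.0))  # pitched to face the cliff
camera = bpy.context.object
camera.data.lens = 24.0          # focal length in mm
scene.camera = camera

# Five stations along a track parallel to the cliff
for i in range(5):
    camera.location.x = i * 4.0  # 4 m spacing, purely illustrative
    scene.render.filepath = f"//renders/station_{i:02d}.png"
    bpy.ops.render.render(write_still=True)
```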

[Slideshow: the five simulated images]

Full control! We have an absolute reference to compare the newly generated model against, we can vary the camera settings to simulate the effects of motion blur, noise and focus, and we can then put the degraded image sets through the software!

Plugging these 5 images back into Agisoft, masking the regions where there is no data, produces a new point cloud derived purely from the simulation.

[Figure: Dense point cloud produced from the simulated images]

We can then load both the model and derived point cloud into CloudCompare and measure the Cloud-to-mesh distance.

[Figure: The cloud-to-mesh comparison, viewed from the front]

[Figure: The cloud-to-mesh comparison, viewed from the back]

This is where I left my train of thought, as I needed to return to doing some practical work. I still think there could be some value in this workflow, though it definitely needs to be fleshed out some more – the potential for varying the network geometry on top of all the other settings is very attractive!

For now though, it’s back to real world data for me, as I’m still producing the results for the fieldwork I did back in October!

Too much JPEG!

Having read lots about the JPEG algorithm of late in my investigations of image quality, and having written about its effects on image gradients in my last post, I thought it would be good to include an entry about it in this blog.

Whilst I invite the more curious reader to delve into the nuances of the algorithm, which is closely related to the Fourier transform I’ve written about previously, today I’ll be looking past the black box by testing the same key parameter as in the last post, the one the user has control over: the ‘quality’ setting. One thing we will note, however, is that the JPEG algorithm operates on discrete 8 x 8 pixel windows, which is one of the more noticeable things when the algorithm is applied at lower quality settings.

Let’s have a look at the impact of varying the quality of a cropped portion (1000 x 1000 pixels) of an image:

The impact at the lower end of the quality scale is dramatic. With the quality set to 1, each 8 x 8 pixel block is essentially assigned a single value, and so the image degrades visibly. As we increase the quality parameter this compression starts to disappear, but at quality 25 we can still see some degree of ‘blockiness’, as adjacent 8 x 8 pixel windows still differ to a large enough degree.

However, past around quality 50 the impact is much more subtle, and I tend not to be able to tell the difference for images cropped to this size. This illustrates the point: the JPEG algorithm is amazing in terms of how much file size one can save on an image.

Let’s take a look at one more set of crops, this time the same image as above, but cropped to just 200 x 200 pixels:

The ‘blockiness’ is certainly evident at quality 50, and subtler but still noticeable at quality 75. I think the most astounding thing is the lack of perceptible difference between quality 92 and 100, given the difference in file size. We can investigate where the difference lies using a comparison image (ImageMagick’s compare function), where red pixels show differing values. I will also include the difference image between the two cropped sections, which should offer some insight into the spatial distribution of pixel variations, if any exist:

So the mean variation in digital numbers for pixels in each 8-bit band is 1.5, but the file size saving is nearly 75%! The difference image shows that the digital-number differences are concentrated in areas of high-frequency information, such as along the cracks in the rock wall: areas which could be very important in delineating boundaries, for example.
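
For anyone wanting to reproduce this sort of comparison, a minimal sketch using Pillow and NumPy is below; the file name and the 92/100 quality pair are illustrative.

```python
# Sketch: compare two JPEG quality settings in terms of pixel differences and
# file size, using Pillow and NumPy. File name and qualities are illustrative.
import io
import numpy as np
from PIL import Image

crop = Image.open("crop_200x200.png").convert("RGB")

def jpeg_bytes(img, quality):
    """Encode an image to JPEG in memory at the given quality setting."""
    buf = io.BytesIO()
    img.save(buf, format="JPEG", quality=quality)
    return buf.getvalue()

hi, lo = jpeg_bytes(crop, 100), jpeg_bytes(crop, 92)
a = np.asarray(Image.open(io.BytesIO(hi)), dtype=np.int16)
b = np.asarray(Image.open(io.BytesIO(lo)), dtype=np.int16)

print("mean |difference| per band:", np.abs(a - b).mean())
print("file size saving: %.0f%%" % (100 * (1 - len(lo) / len(hi))))
```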

While subtle, these effects have not been well documented for work which involves photogrammetric precision; this is one thing I’m working towards within my PhD research. Oftentimes researchers will use JPEGs straight off the camera, which can have custom filters applied prior to use, making reporting and replication more difficult. If we need to compare research done with different equipment under various lighting conditions on various days, this is one part of the research workflow which is crying out for standardization, as the effects, at least in the case of this one simple example, are clear.

For a visualization of a stack of every quality setting for the first set of crops, please visit this link to my website.

 

 

NRIQA! No Reference Image Quality Assessment

This post comes from a place quite close to my research field, and honestly it is in response to a growing concern at the lack of standardization of imagery within it. I have mentioned on this blog before how we can improve reporting on geoscientific imagery, how we can try to incorporate concepts like MTF into that reporting, and the importance of moving towards open-source image sets within environmental research. I have grown envious again of satellite image users, who draw from public data sources – their data can be accessed by anyone!

When you generate your own imagery things can become a little trickier, particularly when you may have taken thousands of images and can’t report on them all within the bounds of a scientific text, or don’t have the capacity to host them for distribution from a public server. Producing a short, snappy metadata summary of the quality of each image would go a long way towards addressing this, as something like that is easily included within supplementary materials.

Whilst researchers would ideally include some control images of, for example, an ISO chart under specific lighting with the settings to be used before a survey, this is massively impractical. The silver bullet for this whole issue would be an objective image quality metric that could score any image, independent of the equipment used and without any reference imagery or ground truth to compare it to (hence ‘No Reference’). This metric would need to account for image sharpness, exposure, distortions and focus, which makes the whole thing phenomenally complicated, particularly where there are factor interactions.

My approach to big automation problems has changed in recent years, largely due to my increasing knowledge of image processing. The one thing we know is that it’s easy to tell a poor-quality image from a good-quality image, and we can express the reasons why in logical statements – plenty for a computer scientist to get on with! A functioning, easy-to-use NRIQA algorithm would be useful far beyond the bounds of the geosciences, and so research in the field is very active. In this blog post, I’ll look at an approach which is a common starting point.

Natural image statistics

Antonio Torralba’s paper ‘Statistics of natural image categories’ provided me with a great deal of insight into what to consider when thinking about image quality metrics; I happened upon it after seeing a different piece of his work cited in a paper I was reading. I recommend looking over it if you want some cutting insight into the really basic ideas of how we distinguish image categories. Image gradients are king, and have always been a key part of image understanding.

His work led me to Chen and Bovik’s paper, which has a very elegant paragraph and figure in the introductory section highlighting how useful gradient analysis can be. They use images from the LIVE database, which I hadn’t come across previously and which has proven an interesting resource.

They point out that, in general, blurred images do not contain sharp edges – sharp images will therefore retain higher amounts of high-frequency gradient information (that is, where neighbouring pixels vary by larger amounts). To demonstrate this, I’ve taken an image from the Middlebury stereo dataset and produced gradient distributions for both the original and an artificially blurred version – we can see the effect in the same way that Chen and Bovik demonstrate!

Out of curiosity, I added a noise-degraded version, and we can see that noise has the opposite effect on the gradients. I guess, in this basic case, sharp and noisy images would be hard to distinguish. Whilst I also produced versions which were both noisy and blurry, the noise dominates the signal and causes the flattening effect seen in the noisy line of the figure.
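
A minimal sketch of this kind of gradient comparison, using OpenCV on a stand-in file name for the Middlebury image, is shown below; the blur and noise levels are illustrative.

```python
# Sketch: compare a crude gradient statistic for sharp, blurred and noisy
# versions of an image. File name, blur and noise levels are illustrative.
import cv2
import numpy as np

img = cv2.imread("middlebury_view.png", cv2.IMREAD_GRAYSCALE).astype(np.float32)
blurred = cv2.GaussianBlur(img, (0, 0), sigmaX=3)
noisy = img + np.random.normal(0, 15, img.shape).astype(np.float32)

def high_gradient_fraction(image, threshold=64.0):
    """Fraction of pixels whose Sobel gradient magnitude exceeds `threshold`."""
    gx = cv2.Sobel(image, cv2.CV_32F, 1, 0)
    gy = cv2.Sobel(image, cv2.CV_32F, 0, 1)
    return float(np.mean(cv2.magnitude(gx, gy) > threshold))

for name, version in [("sharp", img), ("blurred", blurred), ("noisy", noisy)]:
    # blur suppresses strong gradients; noise inflates them
    print(name, "high-gradient fraction:", high_gradient_fraction(version))
```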

This is a useful insight that’s quick and easy to demonstrate – a good starting point for this type of analysis. They go on to develop an SVM model trained on sharp and blurry images using similar logic, with what look like some promising results. Within an image block, we could use this approach to separate out gradient outliers we suspect might be blurry. This would be massively convenient for users, and would also ensure some modicum of quality control.

Perhaps, if we were to cheat a bit, reference images (high quality, from a curated database, scaled to be appropriate for the survey) of the type of environment being investigated could be used for a quick gradient-based quality comparison in surveys. One could then move to include global image information, such as the histogram mean, in the metric, which for a well-exposed image should sit somewhere near the centre of the pixel-value range.

This is a crude starting point perhaps, but a starting point nonetheless, and an area I hope geoscientists using images pay more attention to in the near future. Quality matters!