Chroma

I’ve been neglecting this blog of late, partly because I’ve been ill and partly because I’ve been focusing my writing efforts elsewhere, but I thought it was about time I put something up. Followers might remember that last year at EGU I presented a poster detailing the results of varying the greyscale input channel used in Structure-from-Motion (SfM) photogrammetric blocks. Whilst the results showed only very slight differences, I didn’t present one interesting, subtle effect, which shows how robust the process is to differences within images.

Within the SfM process, camera parameters which correct for distortions in the lens are fitted, and these can subsequently be extracted for separate analysis. Returning to the greyscaling theme for inclusion in my final thesis, I have been pulling out the lens models for each block, and noticed that the focal length fitted to each block changes subtly, but in a manner we might expect.
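As an illustration of the kind of extraction involved, here is a minimal sketch assuming the Agisoft PhotoScan Python API (this is not my exact code, and attribute names may differ between versions):

```python
# Hedged sketch: pull the self-calibrated focal length out of each processed
# block via the PhotoScan Python console (assumes one chunk per block).
import PhotoScan

doc = PhotoScan.app.document
for chunk in doc.chunks:                # one chunk per greyscale block
    for sensor in chunk.sensors:
        calib = sensor.calibration      # fitted lens model for this sensor
        print(chunk.label, sensor.label, calib.f)   # focal length in pixels
```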

Chromatic aberration

Chromatic aberration is caused by the refractive index of the lens glass varying between light of different wavelengths, which means the focal point of the image formed for each wavelength is slightly different. As a result, in colour images and in other optical equipment (I remember seeing it in many different sets of binoculars), we can see colour fringing around the edges of high-contrast features.

DSC00128_crop

Chromatic aberration seen at the front (red fringe) and back (green fringe) of the candle

Within photogrammetric blocks built from a single channel, we might expect the fitted focal length to be optimised for that colour specifically, as it interacts with the particular lens being used. Indeed, this is demonstrable in the tests I have run on an RGB image set collected of a cliff near Hunstanton, UK – we see a slight lengthening of the fitted focal length as more of the red channel is introduced into the image block.
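For context, each greyscale block in these tests was derived from a different weighted combination of the R, G and B channels. A minimal sketch of that blending step is below (the weights shown are illustrative, not the exact set used for the figure, and the filename is hypothetical):

```python
import cv2
import numpy as np

# Build a 'red-heavy' greyscale image from an RGB photo as a weighted blend.
img = cv2.imread('hunstanton_0001.jpg').astype(np.float32)   # hypothetical filename
b, g, r = cv2.split(img)                                      # OpenCV loads as BGR

w_r, w_g, w_b = 0.8, 0.1, 0.1                                 # illustrative weights, sum to 1
grey = np.clip(w_r * r + w_g * g + w_b * b, 0, 255).astype(np.uint8)
cv2.imwrite('hunstanton_0001_grey.tif', grey)
```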

focal_lengths.png

Self-calibrating bundle adjustment fits longer focal lengths to greyscale bands containing a greater proportion of the red channel from an RGB image. Colours of the plotted points represent the RGB colour combination the greyscale photogrammetric block was derived from. The larger circles represent pure red, green and blue channels.

Whilst this might be expected, I was surprised by how clear the trend was, and it’s a testament to how sensitive SfM is at picking up even small changes in image blocks. Watch this space for more insight into what this means for assessing the quality of images going into SfM procedures, and how we might gain intuition into image quality as a result of this trend!


Leafiness

I thought it might be fun to try something different, and delve back into the world of satellite remote sensing (outside of Sentinel_bot, which isn’t a scientific tool). It’s been a while since I’ve tried anything like this, and my skills have definitely degraded somewhat, but I decided to fire up GrassGIS and give it a go with some publicly available data.

I set myself a simple task of trying to guess how ‘leafy’ streets are within an urban environment from Landsat images. Part of the rationale was that whilst we could count trees using object detectors, this requires high resolution images. While I might do a blog post on that at a later date, it was outside the scope of what I wanted to achieve here, which is at a very coarse scale. I will, however, be using a high resolution aerial image for ground truthing!

For the data, I found an urban area on USGS Earth Explorer with both high resolution orthoimagery and a reasonably cloud free image which were within 10 days of one another in acquisition. This turned out to be reasonably difficult to find, with the aerial imagery being the main limiting factor, but I found a suitable area in Cleveland, Ohio.

The aerial imagery has a 30 cm resolution, was acquired using a Williams ZI Digital Mapping Camera, and was orthorectified prior to download. For the satellite data, a Landsat 5 Thematic Mapper scene covering the area of interest was acquired, with a resolution of 30 m in the bands we are interested in.

This experiment sought to use the much-researched Normalised Difference Vegetation Index (NDVI), a simple index used for estimating vegetation presence and health from the red and near-infrared bands.
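For reference, NDVI = (NIR - Red) / (NIR + Red); a minimal numpy sketch, assuming the two bands have already been converted to reflectance arrays, is:

```python
import numpy as np

def ndvi(red: np.ndarray, nir: np.ndarray) -> np.ndarray:
    """NDVI = (NIR - Red) / (NIR + Red), computed on reflectance arrays."""
    red = red.astype(np.float32)
    nir = nir.astype(np.float32)
    return (nir - red) / (nir + red + 1e-9)   # small epsilon avoids divide-by-zero
```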

Initially, I loaded both datasets into QGIS to get an idea of the resolution differences:

jezzer.png

Aerial image overlain on Landsat 5 TM data (green channel)

So, a decent start – it looks like our data is valid and this should be an interesting mini-experiment to run! The ground truth data has high enough resolution to let us know how the NDVI is doing, and will be used further downstream.

 

On to GrassGIS, which I’ve always known has great features for processing satellite imagery, though I’d never used it. It’s also largely built on Python, which is my coding language of choice, so I felt very comfortable troubleshooting the many errors fired at me!

The bands were loaded, the DN -> reflectance conversion was done (automatically, using GrassGIS routines) and an NDVI raster was subsequently derived.
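A rough sketch of those steps through the grass.script Python interface is below; the module and parameter names are from GRASS 7 and may differ in other versions, and the band prefixes and metadata filename are assumptions:

```python
import grass.script as gs

# DN -> top-of-atmosphere reflectance for the Landsat 5 TM bands
gs.run_command('i.landsat.toar',
               input='lt5_', output='lt5_toar_',   # band-name prefixes (assumed)
               metfile='LT05_L1TP_MTL.txt',        # hypothetical metadata filename
               sensor='tm5', method='dos1')

# NDVI from the red (band 3) and near-infrared (band 4) reflectance rasters
gs.run_command('i.vi', viname='ndvi',
               red='lt5_toar_3', nir='lt5_toar_4', output='ndvi')
```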

ndvi2.png

Aerial image overlain on NDVI values. Lighter pixels denote a higher presence of vegetation

Cool! We’ve got our NDVI band, and can ground truth it against the aerial photo as planned.

ndvi1

Lighter values were seen around areas containing vegetation

Last on the list is grabbing a vector file with street data for the area of interest so we can limit the analysis to just pixels beside or on streets. I downloaded the data from here and did a quick clip to the area of interest.

roads1.png

Vector road network (in yellow) for our aerial image. Some new roads appear to have been built.

I then generated a 20 m buffer around the road network vector file and derived a raster mask from it, so that only data within 20 m of a road would be included in the analysis. The result is a first stab at our leafy streets index!
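Again via grass.script, the buffer-and-mask step might look something like the sketch below; the layer names are assumptions, and r.mask’s parameter name differs between GRASS versions:

```python
import grass.script as gs

gs.run_command('v.buffer', input='roads_clip', output='roads_buf20',
               distance=20)                          # 20 m buffer around the roads
gs.run_command('v.to.rast', input='roads_buf20', output='roads_mask',
               use='val', value=1)                   # rasterise the buffer
gs.run_command('r.mask', raster='roads_mask')        # restrict analysis to buffered pixels
gs.run_command('r.univar', map='ndvi')               # e.g. summary stats of the masked NDVI
```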

map1.jpg

Visual inspection suggests it’s working reasonably well when compared with the reference aerial image; a few cropped examples are shown below.


Lastly, we can use this data to scale things up and make a map of the wider area in Cleveland. This would be simple to do for anywhere with decent road data.

map3.jpg

This might be useful for sending people on the scenic route, particularly in unfamiliar locations. Another idea might be to use it in a property search, or see if there’s a correlation with real estate prices. Right now I’ve run out of time for this post, but might return to the theme at a later date!

 

Control freak

When initially formulating my research design, I spent much time considering how best to control the experiments I was undertaking. Control, from a geoscientific photogrammetry perspective, can be quite tricky, as the number of settings and pieces of equipment involved means one can quickly lose the run of oneself.

Research planning

In my limited wisdom during the planning phase, I drew up a plan demonstrating exactly where we would capture imagery from, right down to the OSGB coordinates and orientations of the cameras in the scene, using CloudCompare to help with visualization. I sourced the topographic data from the LiDAR inventory provided by the UK geomatics service, which supplied a DEM with 0.5 m resolution.

FW1.png

A screenshot showing camera positions from my research plan

I think this was a very worthwhile task – it was demanding in terms of the skills I needed to use and made me think about how far I could bring the experiment in the planning stage. While it was maybe overkill, I have visions of a near future where one might be able to task a robot with a built-in RTK-GPS to acquire images from these exact positions/orientations daily for a specified time period. This would eliminate much of the bias seen in studies done over the same research area but with different equipment and camera network geometries.

You could argue that this is already happening with programmable UAVs, though I haven’t seen anything that practical for a terrestrial scene. This is outside the scope of this post, but did provide motivation for expanding as much as possible in the planning phase.

So while we might be able to control camera positions and orientations, in the planning phase at least, there are some things we know are absolutely outside our control. The weather is the most obvious one, but with a cavalier attitude I thought about how I might go about controlling that too. This led me to consider the practicalities of simulating the full SfM workflow.

To attempt this I took a model of Hunstanton which had previously been generated from a reconnaissance mission to Norfolk last May. It had been produced using Agisoft Photoscan and exported as a textured ‘.obj’ file, a format I wasn’t overly familiar with, but would become so. What followed was definitely an interesting experiment, though I’m willing to admit it probably wasn’t the most productive use of time.

Controlling the weather

Blender is an open-source 3D animation package which I had previously been toying around with for video editing. It struck me that, since Blender has a physically based rendering engine, there might be a reasonable way of simulating varying camera parameters within a scene, with lighting provided by a sun that we control.

Blender1.png

The Hunstanton obj file, with the Sun included

So the idea here is to put a sun directly overhead, and render some images of the cliff by moving the camera in the scene. For the initial proof of concept I took 5 images along a track, using settings imitating a Nikon D700 with a 24 mm lens, focused to 18 m (approx distance to cliff, from CloudCompare), with shutter speed set to 1/500 s (stationary camera) and ISO at 200. The aperture was f/8, but diffraction effects can’t be introduced in the software due to limitations in the physics engine. The 5 images are displayed below, with settings from the Physical Camera python plugin included at the end.
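A hedged Blender sketch of that sun-plus-camera setup is below, using Blender 2.8+ bpy names; the camera positions are illustrative, and the shutter speed / ISO handling from the Physical Camera plugin is omitted:

```python
import bpy

# Sun directly overhead the (already imported) Hunstanton .obj model
bpy.ops.object.light_add(type='SUN', location=(0.0, 0.0, 100.0))

# Camera roughly imitating a Nikon D700 with a 24 mm lens, focused to 18 m
cam_data = bpy.data.cameras.new('SimD700')
cam_data.lens = 24.0                      # focal length in mm
cam_data.sensor_width = 36.0              # full-frame sensor width
cam_data.dof.use_dof = True
cam_data.dof.focus_distance = 18.0        # approx distance to the cliff face

cam_obj = bpy.data.objects.new('SimD700', cam_data)
bpy.context.collection.objects.link(cam_obj)
bpy.context.scene.camera = cam_obj

# Render five images along a track in front of the cliff (positions illustrative)
for i, y in enumerate(range(-10, 15, 5)):
    cam_obj.location = (0.0, float(y), 5.0)
    cam_obj.rotation_euler = (1.5708, 0.0, 0.0)      # point the camera at the cliff
    bpy.context.scene.render.filepath = f'//render_{i:02d}.png'
    bpy.ops.render.render(write_still=True)
```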


Full control! We have an absolute reference to compare the newly generated model against, we can vary the camera settings to simulate the effects of motion blur, noise and focus, and we can then put the degraded image sets through the software!

Plugging these 5 images back into Agisoft, masking the regions where there is no data, produces a new point cloud derived purely from the simulation.

FW2.png

Dense point cloud produced from the simulated images

We can then load both the model and derived point cloud into CloudCompare and measure the Cloud-to-mesh distance.
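For batch work, the same comparison can also be driven from CloudCompare’s command-line mode; a hedged sketch is below (the flags are as I understand the CLI, and the file names are assumptions):

```python
import subprocess

# Compare the simulated dense cloud (first entity) against the reference mesh
# (second entity) using CloudCompare's cloud-to-mesh distance tool.
subprocess.run([
    'CloudCompare', '-SILENT',
    '-O', 'simulated_dense_cloud.ply',   # cloud derived from the rendered images
    '-O', 'hunstanton_reference.obj',    # the original textured mesh
    '-C2M_DIST',
], check=True)
```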

fw_front

From the front

fw_front2

From the back

This is where I left my train of thought, as I needed to return to doing some practical work. I still think there could be some value in this workflow, though it definitely needs to be hashed out some more – the potential for varying network geometry on top of all the other settings is very attractive!

For now though, it’s back to real world data for me, as I’m still producing the results for the fieldwork I did back in October!

Image compression

In what is probably my final post on image gradients, I thought I’d include one last mini-experiment on the effect of image compression on image gradient histograms, à la my previous posts NRIQA! No Reference Image Quality Assessment and Blur detection. Using the same script, I generated gradient histograms for three versions of the same image, produced with different RAW -> JPEG/TIF conversions, before analysing them in 8 bits.

JPEG ‘quality’ settings relate to the level of compression in the files versus the originals. While a full review of JPEG compression is beyond the scope of this post, at a high level this blog post is very good at presenting the compression artifacts associated with JPEG images.

The RAW file in this case is a .NEF taken from a Nikon D700.

The three images generated were (a sketch of the conversion commands follows the list):

  1. Default RAW -> TIF conversion using imagemagick (built on dcraw). This is converted to 8 bits using OpenCV within the script. [Size = 70 MB]
  2. Default RAW -> JPEG conversion using imagemagick. The default ‘quality’ parameter is 92. The image is visually indistinguishable from the much larger TIF. [Size = 3 MB]
  3. RAW -> JPEG conversion using a ‘quality’ setting of 25. The image is visually degraded and blocky. [Size = 600 KB]
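Something like the following would reproduce the three conversions (ImageMagick hands the NEF to its raw delegate; the exact default quality may vary by version, and the filenames are assumptions):

```python
import subprocess

raw = '_DSC8652.NEF'
subprocess.run(['convert', raw, '_dsc8652.tif'], check=True)                         # 1. default TIF
subprocess.run(['convert', raw, '_dsc8652.jpg'], check=True)                         # 2. default JPEG (~quality 92)
subprocess.run(['convert', raw, '-quality', '25', '_dsc8652_q25.jpg'], check=True)   # 3. heavily compressed JPEG
```
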
_dsc8652_tif_tog

TIF image

_dsc8652_jpg_tog

Default JPEG (‘quality’ = 92)

_dsc8652_jpg25_tog

JPEG with ‘quality’ of 25

In a general sense, the script tells us that there is more high-frequency information (abrupt changes in pixel value) in the Y direction within this image. The comparison between the TIF and the default JPEG shows almost no difference. For JPEG compression at quality values greater than 90 there is no chroma downsampling, so the differences between the TIF and JPEG images are unlikely to be due to RGB -> grey conversion differences.

The JPEG at quality 25 shows clear signs of quantization – the blocky artifacts visibly smooth the image gradients, pushing the neighbouring-pixel changes towards the centre of the histogram range.

It’s interesting that no signs of degradation are visible within the first two images, and it’s actually quite difficult to see where the differences are. For one last test, I subtracted one from the other and did a contrast stretch to see where the differences are occurring. The subtleties of the JPEG compression are revealed – at a pixel level the differences range from -16 to +15 DN, and the larger differences seem reserved for grassy areas.
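The difference-and-stretch step is simple to sketch; the filenames below are assumptions, and reading the TIF as 8-bit greyscale mirrors the conversion done within the script:

```python
import cv2
import numpy as np

tif = cv2.imread('_dsc8652.tif', cv2.IMREAD_GRAYSCALE).astype(np.int16)
jpg = cv2.imread('_dsc8652.jpg', cv2.IMREAD_GRAYSCALE).astype(np.int16)

diff = tif - jpg                                   # signed differences, roughly -16..+15 DN here
stretched = cv2.normalize(diff, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
cv2.imwrite('diff.png', stretched)                 # contrast-stretched difference image
```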

diff.png

Difference image between default TIF and JPEG images (TIF – JPEG)

Will these subtle changes affect how computer vision algorithms treat these images? How will they affect image matching? Can we envision a scenario where they would matter (if we were calculating absolute units such as radiance, for example)?

Questions which need addressing, in this author’s opinion!

Blur detection

I thought I’d supplement the recent blog post I did on No-Reference Image Quality Assessment with the script I used for generating the gradient histograms included with the sample images.

I imagine this would be useful as a start for generating a blur detection algorithm, but for the purposes of this blog post I’ll just direct you to the script on GitHub here. The script takes one argument, the image name (for example: ‘python Image_gradients.py 1.jpg’). Sample input and output are below.
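For readers who just want the gist, a minimal sketch of the kind of gradient-histogram computation the script performs (not the actual script from the repository) might look like this:

```python
import sys
import cv2
import numpy as np
import matplotlib
matplotlib.use('Agg')                     # render to file without a display
import matplotlib.pyplot as plt

# Read the image as greyscale, take horizontal and vertical neighbouring-pixel
# differences, and plot their distributions on a log scale.
img = cv2.imread(sys.argv[1], cv2.IMREAD_GRAYSCALE).astype(np.float32)

gx = np.diff(img, axis=1).ravel()         # changes between neighbours in X
gy = np.diff(img, axis=0).ravel()         # changes between neighbours in Y

bins = np.arange(-255, 257) - 0.5
plt.hist(gx, bins=bins, histtype='step', log=True, label='X gradients')
plt.hist(gy, bins=bins, histtype='step', log=True, label='Y gradients')
plt.xlabel('Difference between neighbouring pixels (DN)')
plt.ylabel('Count (log scale)')
plt.legend()
plt.savefig('Image_gradients.png')
```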

fusion_mertens

Input image

Image_gradients.png

Plot generated

NRIQA! No Reference Image Quality Assessment

This post comes from a place quite close to my research field, and honestly is in response to a growing concern at the lack of standardization of imagery within it. I have mentioned on this blog before how we can improve reporting on geoscientific imagery, how we can try to incorporate concepts like MTF into that reporting, and the importance of moving towards open-source image sets within environmental research. I have grown envious again of the satellite image users, who draw from public data sources – their data can be accessed by anyone!

When you generate your own imagery things can become a little trickier, particularly when you may have taken thousands of images and can’t report on them all within the bounds of a scientific text, or don’t have the capacity to host them for distribution from a public server. Producing a short, snappy metadata summary of the quality of each image would go a long way towards addressing this, as something like that is easily included within supplementary materials.

Whilst researchers would ideally include some control images of, for example, an ISO chart under specific lighting with the settings to be used before a survey, this is massively impractical. The silver bullet to this whole issue would be an objective image quality metric that could score any image, independent of the equipment used and without any reference imagery/ground truth to compare it to (No Reference). This metric would need to account for image sharpness, exposure, distortions and focus, which makes the whole thing phenomenally complicated, particularly where there are factor interactions.

My approach to big automation problems has changed in recent years, largely due to my increasing knowledge of image processing. The one thing we know is that it’s easy to tell a poor quality image from a good quality image, and we can express the reasons why in logical statements – plenty for a computer scientist to get on with! A functioning, easy-to-use NRIQA algorithm would be useful far outside the bounds of the geosciences, and so research is very active in the field. In this blog post, I’ll look at an approach which is a common starting point.

Natural image statistics

Antonio Torralba’s paper ‘Statistics of natural image categories’ provided a great deal of insight on what to consider when thinking about image quality metrics; I happened upon it after seeing a different piece of his work cited in a paper I was reading. I recommend looking over it if you want some keen insight into really basic ideas of how we distinguish image categories. Image gradients are king, and have always been a key part of image understanding.

His work led me to Chen and Bovik’s paper, with a very elegant paragraph/figure in the introductory section highlighting how useful gradient analysis can be. They use images from the LIVE database, which I hadn’t come across previously and which has proven an interesting resource.

They point out that, in general, blurred images do not contain sharp edges – sharp images will therefore retain higher amounts of high-frequency gradient information (that is, where neighbouring pixels vary by larger amounts). To demonstrate this, I’ve taken an image from the Middlebury stereo dataset and produced gradient distributions for both the original and an artificially blurred version – we can see the effect in the same way that Chen and Bovik demonstrate!

Out of curiosity, I added a noise-degraded version, and we can see that it has the opposite effect on the gradients. I guess, in this basic case, sharp and noisy images would be hard to distinguish. Whilst I produced versions which were both noisy and blurry, the noise dominates the signal and causes the flattening effect seen in the noisy line of the figure.
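The degraded versions are easy to reproduce; a sketch is below (the kernel size, noise level and filename are my assumptions, not necessarily those used for the figure):

```python
import cv2
import numpy as np

img = cv2.imread('middlebury_scene.png', cv2.IMREAD_GRAYSCALE)   # hypothetical filename

# Blur suppresses high-frequency gradients (narrows the histogram)
blurred = cv2.GaussianBlur(img, (15, 15), 0)

# Additive noise does the opposite, flattening/widening the gradient distribution
noise = np.random.normal(0, 15, img.shape)
noisy = np.clip(img.astype(np.float32) + noise, 0, 255).astype(np.uint8)

# Each version can then be passed through the same gradient-histogram code as the original.
```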

This is a useful insight that’s quick and easy to demonstrate – a good starting point for this type of analysis. They go on to develop an SVM model trained on sharp and blurry images using similar logic, with what look like some promising results. Within an image block, we could use this approach to separate out gradient outliers we suspect might be blurry. This would be massively convenient for users, and would also ensure some modicum of quality control.

Perhaps, if we were to cheat a bit, reference images (high quality, from a curated database, scaled to be appropriate for the survey) of the type of environment being investigated by the researcher could be used for a quick quality comparison in surveys in terms of gradients. One could then move to include global image information such as histogram mean into the metric, which for a well exposed image should be somewhere near the center of the pixel-value range.
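As a trivial example of that last idea, a crude exposure check could flag images whose grey-level mean sits far from the middle of the 8-bit range (the threshold below is purely illustrative):

```python
import cv2

def roughly_well_exposed(path: str, tolerance: float = 60.0) -> bool:
    """Flag images whose mean grey level is far from the centre of the 0-255 range."""
    grey = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    return abs(float(grey.mean()) - 127.5) < tolerance
```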

This is a crude starting point perhaps, but a starting point nonetheless, and an area I hope geoscientists using images pay more attention to in the near future. Quality matters!

Pixel shift resolution – Pentax K3

As part of fieldwork for my PhD project the weekend before last, I collected a large amount of data, including images taken with a relatively new camera (released in May 2015), the Pentax K3ii. It features a 24 megapixel APS-C sensor (1.6x crop factor), but one intriguing feature that led us to experiment was the so-called ‘pixel-shift (PS) resolution’ mode, touted by its makers as increasing the effective resolution of the images it gathers (there are some sample images in the gallery here, showing PS mode). It does this by taking 4 images, each shifted by one pixel in each direction. Because of the colour-filter array present on almost all consumer cameras, each photosite normally records only one colour; the pixel shift effectively stacks colour information so that every pixel ends up with a measurement in each colour channel.
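A toy numpy illustration of that stacking idea is below – my own sketch of the principle, not the camera’s actual processing; with an RGGB Bayer pattern, four one-pixel shifts mean every site sees red once, green twice and blue once:

```python
import numpy as np

rng = np.random.default_rng(0)
scene = rng.random((6, 6, 3))                     # 'true' RGB scene
bayer = np.tile(np.array([[0, 1],                 # 0 = R, 1 = G, 2 = B (RGGB pattern)
                          [1, 2]]), (3, 3))

recovered = np.zeros_like(scene)
counts = np.zeros_like(scene)
for dy, dx in [(0, 0), (0, 1), (1, 0), (1, 1)]:   # the four one-pixel sensor shifts
    shifted_cfa = np.roll(bayer, shift=(dy, dx), axis=(0, 1))
    for ch in range(3):
        mask = shifted_cfa == ch                  # sites recording this channel now
        recovered[mask, ch] += scene[mask, ch]
        counts[mask, ch] += 1

recovered /= counts                               # every site measured R, G and B
assert np.allclose(recovered, scene)              # no demosaicing interpolation needed
```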

While the dataset I gathered will take quite a while to process in full, I thought for this blog post I’d take a look at an image taken from the same position with pixel shift both on and off. For the work with the K3ii I used a 35mm Pentax f/2.4 lens rented from SRS microsystems.

Firstly, a specific piece of PS development software, Silkypix, was required in order to develop the PS images. For the purpose of this blog post I have left the default DNG -> TIF parameters on, in order to try and get a like-for-like comparison of a scene with and without PS mode. It should be noted that the RAW file size for the PS mode images is about 4 times the size of those without it, as would be expected considering it’s taking 4 images. The JPEGs, however, are the same size, so we’ll also look at whether there are noticeable differences after JPEG compression.

off_hist

PS mode off

on_hist

PS mode on

Let’s look at a couple of interesting areas of this image, first the photogrammetric target towards the front. Note: the images are not perfectly aligned, so here I’ve taken the same cropped region from each image.

The image on the left is well formed, and localization of the centre of the target will undoubtedly be quite simple in the context of automatic detection. On the right, with PS mode on, we see a somewhat different story, with an apparent graininess surrounding the upper edges of the target. The centre of the target will be difficult to localize correctly. When initially looking at this, I thought the effect was likely localized to the target itself, considering no new data is really being added by the PS. This suggests that for targets at least, PS will perform the same as or worse than with it turned off. There also appear to be some focusing issues with the image on the right.

Let’s look at a different region, at the top of the cliff where some color variation would be expected.

test_cliff_off-0

PS mode off

test_cliff_on-0

PS mode on

Pixel shift potentially recovers more of the finer details, but is giving up a lot in the process, as the image appears myopic and out of focus, as well as retaining the graininess seen earlier. The blurriness could be explained by my own error in acquiring the image, as I was shooting from a tripod without a remote trigger, though the apparent graininess is present throughout many of the collected images, and somewhat resembles a sharpening filter applied to the images, leaving a slightly uncomfortable amount of high-frequency information within them.

I decided to try one more image, ensuring the PS version was at the very least of perceptibly passable quality, and to run the same tests again:

on_hist

PS mode on

off_hist

PS mode off

Targets:

Bank above bush:

off_bank

PS mode off

on_bank

PS mode on

Thankfully, for the second image set things appear to have worked exceptionally well. While the graininess seen on the first target is still somewhat perceptible in the second, the difference is not as apparent. The bank is undoubtedly of a much higher apparent spatial resolution in the image with PS mode on, which is what we would expect. The comparison between images points to the fact that any errors due to blur or lens effects will be amplified by switching pixel-shift resolution mode on, as would be expected. However, the high quality PS images do appear to be of higher perceptible resolution. It remains to be seen whether these are differences within the individual images selected, or a result of the PS, or both, but I hope this will become apparent as I sink my teeth into the data.

Watch this space for more updates as my work progresses!