Predictions, predictions, predictions

I’ve just listened to the latest episode of Alastair and Andrew‘s podcast, scene from above, and the discussion section based around near-future predictions for the Earth Observation (EO) industry, as well as some of the discussion in the news section, was extremely interesting. I’m fully onboard the hype train for machine learning booming in EO, with Andrew seemingly somewhat skeptical.

Before I go into why I think that’s the case, I’ll mention Alastair speaks about a Voyager documentary, the Farthest (I’ve actually just noticed a big Irish producer, crossing the line was involved in production, wahay!). It sounds absolutely incredible, and will go on my watch list, but Alastair’s comments reminded me of an xkcd comic alluding to the fact that the edge of the solar system is difficult to define! I actually really enjoyed listening to their thoughts on Voyager in general, and would love to hear more discussion around the history of EO as well as wider planetary missions – every time I read and think about Corona, for example, I can’t help but be amazed.


Voyager spacecraft (NASA)


One of the main predictions made within the main section of the podcast is that analysis ready data (ARD) will see wider use and release by data providers. We have seen a move towards sentinel 2 ARD and planet have recently released their atmospherically corrected surface reflectance product, I would hope this is an indication that this is quite well developed already!


A figure from Planet’s surface reflectance white paper (source)

On the machine learning (ML) front, I attended a google earth engine workshop at the beginning of this year, and having had fruitful discussions with the host on the project’s directions, I think the iron is hot for ML and the hype justified. In particular, the host spoke about the team preparing tensor flow integration into the platform in time for AGU next year. Having been lucky enough to participate in (albeit not at a competitive level) the planet kaggle competition for classifying image excerpts into one or more classes last year, I have a decent idea of just why there has been a frenzy of research surrounding convolutional neural networks (CNNs) in the computer vision community, and I’m surprised that they haven’t appeared more in EO research.

While Andrew notes that supervised and unsupervised classification has been around and used for decades, the difference between those and deep-learned information is like night and day in my opinion. The competition, past the task presented, gave me a look into how neural networks are transforming image analysis, and how recurrent CNNs on massive scales could be leveraged in an environmental context for things like linking phenological mapping to data which might provide reasons as to why a change is happening with spatial context. Object-based analysis is unparalleled for applications like this, and CNNs are now so easy to use and much better at handling massive data sets than previous methods. Computer scientists are poised to integrate more and more with the EO community as higher resolution data becomes available, and so I feel like when high temporal and spatial resolution open data becomes available multi-disciplinary research will really kick off. Infact, I put together a starter ipython notebook for bird identification, showing just how easy it is using a pre-trained CNN for this application, albeit not with EO data.


Example plot from ipython notebook

This leads to a prediction of my own – as more imaging scientists move into EO, Unmanned aerial vehicle (UAV) and satellite data will need to be better integrated. Currently, there are a raft of problems linking data collected from consumer level cameras onboard UAVs to satellite data, not least of which is radiometric normalization. The demand for higher resolution data from the deep learning end of the community will lead to new standards being introduced for how UAV data is collected and metadata stored (shameless plug). EO platforms will begin to integrate publicly collected UAV data and satellite researchers will begin to collaborate with computer scientists using nearer earth images. We will then see satellites being used as an early warning systems and UAV missions automatically launched off the back of satellite derived information in a range of new applications.

This isn’t a particularly insightful prediction, but one which continuously hasn’t really been addressed. I’m always surprised as to how infrequently satellite and UAV data are used in tandem, but I’m hoping this will change!

That’s all for now, look for my Google Earth Engine blog coming next week, I was blown away by the product and definitely need to do a separate post on it 🙂


Sentinel_bot – now with NIR vision

A quick blog post as I’m very much in the throes of writing! I took a few minutes today to introduce false colour (Near Infrared – Red – Green) images into @sentinel_bot’s programming, so now there’s a 20% chance that an image it produces will be false colour. In the near future I think I’ll introduce other band combinations (such as PCA band combos for mineral contrast enhancement), but for now I’m going to let it sit and appreciate some of what it comes up with, such as the image below.

Source :

Twitter :


NIR – R – G image over Argentina

Neural nets in Remote Sensing

Neural nets, a summary: (The chain rule * your GPU RAM)

Around 2 years ago I remember having a discussion with Jan Boehm about photogrammetry after my first meeting as the shadow wavelength rep on the Remote Sensing and Photogrammetry committee. He mentioned Agisoft, which I was already using and familiar with at the time, but then mentioned the movement in dense matching algorithms towards use of neural nets, mentioning one which had been submitted to the KITTI stereo benchmark.


Disparity map using Žbontar’s methods

This piqued my curiosity, and I remember reading and being quite excited by Jure’s paper. While some concepts were new to me, the use of Convolutional neural networks (ConvNets) and the two types of architecture used to initialize the initial results, before moving towards post-processing using semi-global matching. I remember sinking a great deal of time into reading about the methods, exploring the github and methods used within the core of the paper, and subsequently hounding a colleague who was using a Titan-X for some deep learning work for some time with it.

I remember I took the ideas with me to EGU 2016, and even went to the point of acquiring a data set I thought would be worthy of testing it with from a German photogrammetrist, Andreas Kaiser. Alas, it wasn’t to be due to the hardware limitations and the fact that I wasn’t very familiar with the lua programming language. However I had learned a lot about the nature of deep learning, which I felt was a decent investment of my time.

The reason for this blog entry, however, isn’t to enlighten the reader of my failure to get up to speed with neural nets at the time, it’s much more hopeful than that! Fast forward two years, and development within the field of deep learning has come on leaps and bounds. With serious development time going into TensorFlow, and a beautiful and accessible front end in the form of keras, the python user really does have the tools to apply neural nets to all sorts of applications within image-based studies.

Having learned the basic ideas around neural nets from my initial excitement a long time ago I decided to try and get involved with the community once more. A few months back, a well timed kaggle competition came up which involved image classification, which raised an eyebrow. I contacted an old friend of mine who had just finished his PhD in medical imaging and we set to take up the challenge.


The task for the competition involved labeling satellite imagery

Since starting the task, I feel like I’ve come on leaps and bounds with not only the concepts behind ConvNets, but their architecture and application in the python framework. Whilst we generated lots of code (will be on github in due course), and had lots of ideas floating about, we finished a decidedly average mid-table – this first pass was as much an experience in learning about organisation as well as about imaging science, but it’s made me rethink about using ConvNets in a Remote Sensing/Photogrammetry environment.

Whilst we are seeing more contributions coming out of the community, and the popularity of other less technical concepts like support vector machines have shown I’m hoping to extend my skill set to include all of these in the future. If anyone who happens to be reading this feel the same, don’t hesitate to get in touch!



Django greyscales

Access the application here.

I’ve been learning lots about the django web framework recently as I was hoping to take some of the ideas developed in my PhD and make them into public applications that people can apply to their research. One example of something which could be easily distributed as a web application is the code which serves to generate greyscale image blocks from RGB colour images, a theme touched on in my poster at EGU 2016.

Moving from a suggested improvement (as per the poster) using a complicated non-linear transformation to actually applying it to the general SfM workflow is no mean feat. For this contribution I’ve decided to utilise django along with the methods I use (all written in python, the base language of the framework) to make a minimum working example on a public web server (heroku) which takes an RGB image as a user input and returns the same image with a number of greyscaling algorithms (many discussed in Verhoeven, 2015) as an output. These processed files could then be redownloaded and used in a bundle adjustment to test differences of each greyscale image set. While not set up to do bulk processing, the functionality can easily be extended.


Landing page of the application, not a lot to look at I’ll admit 😉

To make things more intelligible, I’ve uploaded the application to github so people can see it’s inner workings, and potentially clean up any mistakes which might be present within the code. Many of the base methods were collated by Verhoeven in a Matlab script, which I spent some time translating to the equivalent python code. These methods are seen in the support script

Many of these aim to maximize the objective information within one channel, and are quite similar in design so it can be quite a difficult game of spot the difference. Also, the scale can often get inverted, which shouldn’t really matter to photogrammetric algorithms processes, but does give an interesting effect. Lastly, the second PC gives some really interesting results, and I’ve spent lots of time poring over them. I’ve certainly learned a lot about PCA over the course of the last few years.


Sample result set from the application

You can access the web version here. All photos are resized so they’re <1,000 pixels in the longest dimension, though this can easily be modified, and the results are served up in a grid as per the screengrab. Photos are deleted after upload. There’s pretty much no styling applied, but it’s functional at least! If it crashes I blame the server.

The result is a cheap and cheerful web application which will hopefully introduce people to the visual differences present within greyscaling algorithms if they are investigating image pre-processing. I’ll be looking to make more simple web applications to support current research I’m working on in the near future, as I think public engagement is a key feature which has been lacking from my PhD thus far.

I’ll include a few more examples below for the curious.


This slideshow requires JavaScript.


Writing blues

After having been ill the last week and a half I’m currently trying to get back into the swing of writing, which I find is largely the hardest part of research where really it doesn’t/shouldn’t need to be. One thing in particular I find very difficult is starting – I often pore over the first words/sentence for a very long time when I do sit down to write.

One forward step I’ve come to in an attempt to mitigate this is to give myself as many opportunities as possible to start writing. While obviously this could involve carrying a pen and paper around everywhere and waiting for inspiration to hit, I think the practicalities of translating esoteric squiggles and keeping the notes in decent order a bit beyond me, so I rarely give it a proper go.

Enter the bluetooth keyboard, a product recommended to me by my supervisor to ensuring you can start taking notes/writing wherever you are. I was skeptical at first, due to the variable key size and slight faff of connecting via bluetooth to my phone, but after giving it a couple of hours on a recent visit to the RGS I was sold. Currently I’m typing up a version of this blog post on my phone sitting on a train from Holyhead to Chester on the way back to London. I’m getting great pleasure from watching the trees go by after every few sentences!


Product photo from Microsoft’s site

While I know this entry will read like an advertorial, that isn’t the intention, I’m just very wary of the summer’s PhD writing ahead, and am glad to have an excuse to do the lion’s share sitting in a park rather than in my stuffy office! For now, back to writing, though I’m preparing a more technical blog post which should be finished later tomorrow.


Spotted from the train in Wales


Sentinel bot source

I’ve been sick the last few days, which hasn’t helped in staying focused so I decided to do a few menial tasks, such as cleaning up my references, and some a little bit more involved but not really that demanding, such as adding documentation to the twitter bot I wrote.

While it’s still a bit messy, I think it’s due time I started putting up some code online, particularly because I love doing it so much. When you code for yourself, however, you don’t have to face the wrath of the computer scientists telling you what you’re doing wrong! It’s actually similar in feeling to editing writing, the more you do it the better you get.

As such, I’ve been using Pycharm lately which has forced me to start using PEP8 styling and I have to say it’s been a blessing. There are so many more reasons than I ever thought for using a very high level IDE and I’ll never go back to hacky notepad++ scripts, love it as I may.

In any case, I hope to have some time someday to add functionality – for example have people tweet coordinates + a date @sentinel_bot and have it respond with a decent image close to the request. This kind of very basic engagement for people who mightn’t be bothered going to Earth Explorer or are dissatisfied with Google Earth’s mosaicing or lack of coverage over a certain time period.

The Sentinel missions offer a great deal of opportunity for scientists in the future, and I’ll be trying my best to think of more ways to engage the community as a result.

Find the source code here, please be gentle, it was for fun 🙂



Photogrammetry rules of thumb

I’ve uploaded a CloudCompare file of some fieldwork I did last year to my website here. It uses the UK national LiDAR inventory data, mentioned in the post here. I think it espouses lots of the fundamentals discussed here, and is a good starting point for thinking about network design.

80% overlap

This dates way back, and I’m unsure of where I heard it first, but 80% overlap between images in a photogrammetric block with a nadir viewing geometry is an old rule of thumb from aerial imaging (here’s a quick example I found from 1955), and carries through to SfM surveying. I think it should likely be a first port of call for amateurs doing surveys of surfaces, as it’s very easy to jot down an estimate before undertaking a survey. For this, we should consider just camera positions orthogonal to the surface normal (see this post) and estimate a ground sample distance to aid us with camera spacing from there.

1:1000 rule

This has become superseded in recent years, but is still a decent rule of thumb for beginners in photogrammetry. It says that, in general (very general!), the surface precision of a photogrammetric block will be around 1/1000th of the distance to the surface. Thus, if we are imaging a cliff face from 30m away, we can realistically expect accuracy to within 3 cm of that cliff. This is very useful, especially if you know beforehand the required accuracy of the survey. This is also a more stable starting point than GSD, whose quality as a metric which can vary widely depending on your camera selection.

Convergent viewing geometry

Multi-angular data is intuitively desirable to gather, with the additional data comes additional data processing considerations, but recently published literature has suggested that adding these views has the secondary effect of mitigating systematic errors within photogrammetric bundles. Thus, when imaging a surface, try and add cameras at off angles from the surface normal in order to build a ‘strong’ imaging network, to avoid systematic error creeping in.

Shoot in RAW where possible

Whilst maybe unnecessary for many applications, RAW images allow the user to capture a much great range of colour within an image, owing to the fact that colours are written on 12/14 bits rather than the 8 of JPG images. Adding to this, jpg compression can impact the quality of the 3D point clouds, so using uncompressed images is advised.

Mind your motion

Whilst SfM suggests that the camera is moving, we need to bear in mind that moving cameras are subject to blur, and this is sometimes difficult to detect, especially when shooting in tough conditions where you can’t afford to look at previews. Thus, you can pre-calculate a reasonable top speed for the camera to be moving, and stick to that. We recommend a maximum of 1.5 pixels in GSD over the course of each exposure given the literature and as advised by the OS.

Don’t overparameterize the lens model

Very recently, studies have suggested that overparameterizing the lens model, particularly when poorer quality equipment is being used without good ground control, can lead to a completely unsuitable lens model being fit which will impact the quality of results. The advice – only fit f, cx, cy, k1 and k2 parameters if you’re unsure of what you’re doing. This is far from the default settings in most software packages!


I had a few more points in my long list, but for now these 6 will suffice. Whilst I held back on camera selection here you can read my previous camera selection post for some insight into what you should be looking for. Hope this helps!