Geodiversity

Radiant Earth, whose CEO Anne Hale Miglarese I was lucky enough to see speak at the RSPSoc conference last year, partnered with Amazon to provide more ‘geodiverse’ training data for machine learning models. I think this is timely, as the AI4EO paradigm sets in. For geodevelopment, the availability of Sentinel 2 Analysis Ready Data on S3, together with the ability to do partial reads of this data using GDAL, is my preferred option over Google Earth Engine, so I’m delighted with these continuing data releases. I’ve been reading about rastervision, and look forward to sinking my teeth into this data with that as a supporting tool to see what kind of learning can be done!
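To give a flavour of what a partial read looks like, here is a minimal sketch using the GDAL Python bindings and the /vsis3/ virtual filesystem. The bucket and key are placeholders I have made up rather than real Sentinel 2 paths, and it assumes AWS credentials and region are already configured for GDAL (for example through the usual AWS environment variables).

```python
def vsis3_path(bucket, key):
    """Build a GDAL virtual filesystem path for an object stored on S3."""
    return "/vsis3/{}/{}".format(bucket, key.lstrip("/"))


def read_window(path, xoff, yoff, xsize, ysize, band=1):
    """Read only a pixel window from a raster, rather than the whole file.

    GDAL is imported here so the path helper above works without it.
    """
    from osgeo import gdal

    dataset = gdal.Open(path)
    if dataset is None:
        raise IOError("GDAL could not open {}".format(path))
    # GDAL requests the byte ranges it needs for this window; this pays off
    # most when the file is internally tiled (e.g. a cloud optimised GeoTIFF).
    return dataset.GetRasterBand(band).ReadAsArray(xoff, yoff, xsize, ysize)


# Hypothetical bucket and key, for illustration only.
path = vsis3_path("my-example-bucket", "sentinel2/scene/B04.tif")
# window = read_window(path, 0, 0, 512, 512)  # needs GDAL and S3 access
```

The read itself is left commented out, since it only works with GDAL installed and access to a real bucket.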

Geodiversity is required for reliable modelling (source)

Beyond Sentinel 2 data, there’s so much opportunity to shift thinking on how to develop AI4EO models, extending to other metrics such as air quality (for instance from Sentinel-3 SLSTR).

Keep an eye on this space – I’ll do a Jupyter notebook or similar exploring the data once I get the chance!

Earth from space

The BBC have released the first episode of a documentary series focusing on remote sensing, and how it has changed/can teach us about our changing planet. It’s definitely a tough subject to fill whole episodes with, so the style is somewhat blended between satellite imagery and storytelling on the ground, which makes for a very different kind of wildlife documentary experience.

I’m particularly curious as to how they produced the ‘superzooms’, which involve both zooming into, and out from, individual elephants in Africa to a continent-wide view, as they’re extremely well done. I’m a bit skeptical as to how much space cameras are involved in videoing Shaolin monks, and am curious which satellites would even have the capability for this – maybe Vivid-i could capture a short video sequence, but the resolution wouldn’t really be high enough to discern individuals, and the recently defunct WorldView-4 would only be able to capture stills. Regardless, it’s a really well-paced, emotional episode which I enjoyed immensely.

worldview.jpg

Sample from WorldView-4, available here

The series continues next week with an episode on patterns – the dunes of Namibia are an area whose beauty I only really discovered through sentinel_bot and I’m looking forward to learning more!

 

EGU 2019

EGU this year was a bittersweet affair, as I actually didn’t make the conference myself, despite having two posters presented on my behalf. I enjoy EGU, but this year my aim is to get to a few new conferences, and having already attended the amazing Big Data from Space conference (BiDS) in Munich in February, I’m hungry to branch out as much as possible. Also on the agenda this year are FOSS4G (I have always wanted to go!) and RSPSoc’s conference in Oxford (this is one I think I will go to every year).

That being said, I did still submit two abstracts, both for poster sessions, with colleagues of mine presenting on my behalf. The first was another extension of my PhD work, which focused principally on the image quality of data collected in the field for use in photogrammetric work, and its effect on the accuracy and precision of photogrammetric products.

This extension used new innovations within the field to dive further into this relationship, by using Mike James’ precision maps (James et al. 2017). In essence, it investigates how stable sparse point clouds are when systematically corrupted with noise (in all of the camera positions, parameters and points within the cloud). His research tries to refine a big unknown within bundle adjustment using structure-from-motion: how do we account for variability in the precision of measurement when presenting results? Due to bundle adjustment’s stochasticity, we can never guarantee that our point cloud accurately reflects real life, but by simulating this sensor variation, we can get an idea of how stable it is.
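The general idea of estimating precision by repeatedly adding noise can be shown with a toy example. This is not the precision map workflow from James et al. and involves no bundle adjustment: it is a minimal 2D sketch, with made-up camera positions and noise level, in which a point is located by intersecting two rays, the ray bearings are perturbed with Gaussian noise, and the spread of the recovered positions is taken as the precision.

```python
import math
import random
import statistics


def intersect_rays(cam_a, bearing_a, cam_b, bearing_b):
    """Locate the 2D point where rays from two camera positions cross."""
    ax, ay = cam_a
    bx, by = cam_b
    dax, day = math.cos(bearing_a), math.sin(bearing_a)
    dbx, dby = math.cos(bearing_b), math.sin(bearing_b)
    denom = dax * dby - day * dbx  # zero when the rays are parallel
    t = ((bx - ax) * dby - (by - ay) * dbx) / denom
    return ax + t * dax, ay + t * day


def precision_estimate(cam_a, cam_b, true_point, bearing_sigma, runs=2000, seed=0):
    """Monte Carlo estimate of the x/y standard deviation of a located point."""
    rng = random.Random(seed)
    bearing_a = math.atan2(true_point[1] - cam_a[1], true_point[0] - cam_a[0])
    bearing_b = math.atan2(true_point[1] - cam_b[1], true_point[0] - cam_b[0])
    xs, ys = [], []
    for _ in range(runs):
        # Corrupt each observation with noise, then re-estimate the point.
        x, y = intersect_rays(
            cam_a, bearing_a + rng.gauss(0.0, bearing_sigma),
            cam_b, bearing_b + rng.gauss(0.0, bearing_sigma),
        )
        xs.append(x)
        ys.append(y)
    return statistics.stdev(xs), statistics.stdev(ys)


# Hypothetical geometry: a short baseline looking at a distant point.
sigma_x, sigma_y = precision_estimate((0.0, 0.0), (10.0, 0.0), (5.0, 20.0), 0.001)
```

With this geometry the depth direction (y) comes out less precise than x, which is the kind of spatially varying precision that a precision map makes visible for a real survey.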

egu2019_final.png

PDF version available here

In all, the research points to the fact that compressing data is generally a bad thing, causing point clouds to be relatively imprecise and inaccurate when compared with uncompressed data. It would be interesting to extend this to other common degradations to image data (blur, over/underexposure, noise) to see how each of those influences the eventual precision of the cloud.

Secondly, I submitted a poster regarding a simple app I made to present Sentinel-2 data to a user. This uses data from an area in Greece, and Geoserver to serve the imagery behind a docker-compose network on an AWS server. It’s very simple, but after attending BiDS, I think there is an emerging niche for delivering specific types of data rapidly at regional scales, at the cost of generality. Many of the solutions at BiDS were fully general, allowing for arbitrary scripts to be run on raw data on servers – something comparable to what Sentinel Hub offers. By pruning this back, and using tools like docker-compose, we can speed up the spin-up and delivery of products, and offer solutions that don’t need HPCs to run on.

greece

Sample of the app

Lastly, I’ve simplified my personal website massively in an attempt to declutter. I’ve just pinched a template from GitHub in order to not sink too much time into it, so many thanks to Ryan Fitzgerald for his great work.

aboutme

That’s all for now, I’ll be writing about KisanHub in the next blog!

Web development is my job

I think it’s high time I restarted this blog, rather than let it disappear into oblivion. For the last year I’ve been working in Cambridge with an agricultural technology company called KisanHub, which aims to introduce a new wave of efficiency into crop monitoring and the food supply chain. I’ll do a separate post on the company, but for this blog post, I’m going to give general updates on skills I’ve acquired in the year, and on what new skills I would recommend budding EO web developers accrue, in an attempt to demystify some of the jargon in the webdev/EO worlds.

Django/Flask

Python is the de facto standard language for geoscientific computing, and so it makes sense to learn a web framework in this language. Django and Flask are both good options for common tasks in web development – a great example of the power of Flask is terracotta, where you can go from local files to a full-blown interactive environment in one command.

myimage_small.gif

Terracotta lets you make an xyz server from a directory of geotiffs
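For anyone unfamiliar with the term, an xyz server hands out map tiles addressed by zoom level and column/row index. The sketch below shows the standard Web Mercator tile arithmetic for finding which tile contains a given longitude/latitude. The URL template is a made-up placeholder, not terracotta’s actual route layout.

```python
import math


def lonlat_to_tile(lon, lat, zoom):
    """Return the (x, y) index of the XYZ tile containing a lon/lat point."""
    n = 2 ** zoom  # number of tiles along each axis at this zoom level
    lat_rad = math.radians(lat)
    x = int((lon + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so points on the eastern/southern edge stay in range.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def tile_url(template, lon, lat, zoom):
    """Fill a {z}/{x}/{y} URL template for the tile covering a point."""
    x, y = lonlat_to_tile(lon, lat, zoom)
    return template.format(z=zoom, x=x, y=y)


# Hypothetical local tile server, for illustration only.
url = tile_url("http://localhost:5000/{z}/{x}/{y}.png", 0.0, 0.0, 1)
```

A web map library does this calculation for every tile in view, which is why a directory of geotiffs behind an xyz endpoint is enough to get an interactive map.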

Docker

By far and away the biggest revelation in the way I work, docker takes virtual environments to an extreme. In a naive sense, it lets you download and run a different computer – therefore stripping away all the barriers of system dependencies (gdal!), environment dependencies (pygdal!) and operating systems. The docker-compose command lets you run many of these computers in a network, and have them exchange information with one another. The kartoza docker-geoserver project is a great place to start for a demonstration of how easy it is to get a niche piece of software up and running. I generated a demonstration project here based on this network (source)!

myimage_small.gif

Docker–geoserver based project

Kubernetes

The natural extension of docker-compose, kubernetes lets you efficiently run the aforementioned docker-compose network by declaring how much resource (cpu/gpu/memory) is given to each part of the network, and by defining rules for how the network scales/shrinks under certain conditions. It takes away most of the headache of having to manage servers and network configuration (my nginx config knowledge needs some work!), which I am very grateful for. Minikube can be used to run kubernetes networks locally, but it seems to consume far more resources than docker-compose, so I usually use it at the final stages.

Geoserver

The grandfather of tileservers, Geoserver is something I hadn’t used much before this year, but I’m impressed with the very active community (I’m on the mailing list) and, once bolted on to docker, how easy it is to get started. I learned quickly that it’s very easy to misuse, and so have spent the last few months properly learning about what it can and can’t do, and its REST interface. I think it’s a good starting point for geoscientists with little development experience, as everything can be fully controlled from the GUI, which forms a good base for beginning to manage it through REST.
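As a taste of the REST side, here is a minimal sketch that builds (but does not send) the request for creating a workspace, using only the Python standard library. The host, workspace name and credentials are placeholders, so swap in whatever your own instance uses.

```python
import base64
import json
import urllib.request


def create_workspace_request(base_url, workspace, user, password):
    """Build (but do not send) a request that creates a Geoserver workspace."""
    body = json.dumps({"workspace": {"name": workspace}}).encode("utf-8")
    token = base64.b64encode("{}:{}".format(user, password).encode("utf-8"))
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/rest/workspaces",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": "Basic " + token.decode("ascii"),
        },
    )


# Placeholder host, workspace and credentials, for illustration only.
request = create_workspace_request(
    "http://localhost:8080/geoserver", "greece", "admin", "geoserver"
)
# urllib.request.urlopen(request)  # uncomment against a running Geoserver
```

Anything you can click through in the GUI has a similar endpoint, so a good exercise is to do a task by hand first and then reproduce it as a request like this one.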

Summary

I think these are the most significant technologies I’ve adopted in the last year, and would encourage any budding (or budded) geoweb developers to invest time into them. In the future, I’ll be writing about my job, my continuing research interest and my now significant commute (spoiler: it’s London -> Cambridge).

SentinelBot upgraded

I’ve been on a webdev kick since starting a new job, and have recently upgraded SentinelBot as a result. It now filters snow scenes less often and can handle atmospherically corrected products – I’ll be updating the GitHub repository, and will be writing a post about my current job soon, but for now feast your eyes on some Sentinel goodness 🙂

 

 

Predictions, predictions, predictions

I’ve just listened to the latest episode of Alastair and Andrew’s podcast, Scene From Above, and the discussion section based around near-future predictions for the Earth Observation (EO) industry, as well as some of the discussion in the news section, was extremely interesting. I’m fully on board the hype train for machine learning booming in EO, with Andrew seemingly somewhat skeptical.

Before I go into why I think that’s the case, I’ll mention that Alastair speaks about a Voyager documentary, The Farthest (I’ve actually just noticed that a big Irish producer, Crossing the Line, was involved in production, wahay!). It sounds absolutely incredible, and will go on my watch list, but Alastair’s comments reminded me of an xkcd comic alluding to the fact that the edge of the solar system is difficult to define! I actually really enjoyed listening to their thoughts on Voyager in general, and would love to hear more discussion around the history of EO as well as wider planetary missions – every time I read and think about Corona, for example, I can’t help but be amazed.

far

Voyager spacecraft (NASA)

 

One of the main predictions made within the main section of the podcast is that analysis ready data (ARD) will see wider use and release by data providers. We have seen a move towards Sentinel 2 ARD, and Planet have recently released their atmospherically corrected surface reflectance product, so I would hope this is an indication that this is quite well developed already!

planet.png

A figure from Planet’s surface reflectance white paper (source)

On the machine learning (ML) front, I attended a Google Earth Engine workshop at the beginning of this year, and having had fruitful discussions with the host on the project’s directions, I think the iron is hot for ML and the hype justified. In particular, the host spoke about the team preparing TensorFlow integration into the platform in time for AGU next year. Having been lucky enough to participate (albeit not at a competitive level) in the Planet Kaggle competition for classifying image excerpts into one or more classes last year, I have a decent idea of just why there has been a frenzy of research surrounding convolutional neural networks (CNNs) in the computer vision community, and I’m surprised that they haven’t appeared more in EO research.

While Andrew notes that supervised and unsupervised classification have been around and used for decades, the difference between those and deep-learned information is like night and day in my opinion. The competition, past the task presented, gave me a look into how neural networks are transforming image analysis, and how recurrent CNNs on massive scales could be leveraged in an environmental context for things like linking phenological mapping to data which might provide reasons as to why a change is happening with spatial context. Object-based analysis is unparalleled for applications like this, and CNNs are now so easy to use and much better at handling massive data sets than previous methods. Computer scientists are poised to integrate more and more with the EO community as higher resolution data becomes available, and so I feel that when open data with high temporal and spatial resolution becomes available, multi-disciplinary research will really kick off. In fact, I put together a starter IPython notebook for bird identification, showing just how easy it is to use a pre-trained CNN for this application, albeit not with EO data.

birds.png

Example plot from ipython notebook

This leads to a prediction of my own – as more imaging scientists move into EO, unmanned aerial vehicle (UAV) and satellite data will need to be better integrated. Currently, there are a raft of problems linking data collected from consumer-level cameras onboard UAVs to satellite data, not least of which is radiometric normalization. The demand for higher resolution data from the deep learning end of the community will lead to new standards being introduced for how UAV data is collected and metadata stored (shameless plug). EO platforms will begin to integrate publicly collected UAV data, and satellite researchers will begin to collaborate with computer scientists using nearer-earth images. We will then see satellites being used as early warning systems, and UAV missions automatically launched off the back of satellite-derived information in a range of new applications.

This isn’t a particularly insightful prediction, but it’s one which still hasn’t really been addressed. I’m always surprised at how infrequently satellite and UAV data are used in tandem, but I’m hoping this will change!

That’s all for now. Look out for my Google Earth Engine blog coming next week – I was blown away by the product and definitely need to do a separate post on it 🙂