Friday, 20 May 2016

Bristol (& New Brighton) Buildings from Lidar

West Front, Bristol Cathedral
One of the buildings where we could use Lidar data to enhance its representation in OSM
At OpenData Camp 3 over the weekend I asked John Murray if he could give me a set of polylines extracted from the Environment Agency Lidar Open Data. Do read John's amazing post about the tools he has built for doing nifty things with Lidar data: Turning Lidar Data into Actionable Insight.

I thought it might be a bit of fun to show directly how open data produced by one of the ODCamp sponsors might end up in OpenStreetMap.

In practice John got keen over the lunch break and wrote a bit of code to turn his polylines into polygons. So just before the last session of the day kicked off I had been sent a shapefile for the 1 km Ordnance Survey grid square where the meeting was taking place.

We had one initial teething problem, in that the data was out by 1 km, but that aside it was very easy to perform some simple manipulations in the offline OpenStreetMap editor JOSM.

JOSM can read shapefiles (and some other geo-formats such as GeoJSON too) and automatically transforms these into OSM elements projected in WGS84. The additional data manipulation steps were therefore pretty trivial:
  • Select all way elements (using a type:way search)
  • Add a source=EA Lidar Open Data tag
  • Select all type:relation elements to find multipolygons
  • Add building=yes tag
  • Select all way elements which were not part of multipolygons (type:way and not child type:relation)
  • Add building=yes tag
  • Select all way elements again and simplify them.
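For anyone who would rather script this than click through JOSM, the selection and tagging logic can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the list of dicts is a hypothetical stand-in for the elements JOSM creates from the shapefile, `tag_buildings` is a made-up helper name, and the final simplify step is left out.

```python
# Hypothetical in-memory stand-in for the OSM elements created from the shapefile.
# Relations list their member ways by id.

def tag_buildings(elements, source="EA Lidar Open Data"):
    """Apply the source and building tags following the steps listed above."""
    relations = [e for e in elements if e["type"] == "relation"]
    ways = [e for e in elements if e["type"] == "way"]

    # Ways which belong to a multipolygon (the "child type:relation" search).
    member_ids = {m for rel in relations for m in rel.get("members", [])}

    for way in ways:
        way["tags"]["source"] = source        # every way gets the source tag
        if way["id"] not in member_ids:
            way["tags"]["building"] = "yes"   # standalone outlines are buildings

    for rel in relations:
        rel["tags"]["building"] = "yes"       # multipolygons carry the tag themselves

    return elements


elements = [
    {"type": "way", "id": 1, "tags": {}},
    {"type": "way", "id": 2, "tags": {}},
    {"type": "way", "id": 3, "tags": {}},
    {"type": "relation", "id": 10, "tags": {"type": "multipolygon"}, "members": [2, 3]},
]
tag_buildings(elements)
```

Member ways of a multipolygon deliberately do not get building=yes, because the tag sits on the relation instead.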
The image below shows how this looks in the editor.


Now this is all pretty amazing, and if you read John's blog post there's a lot more info which can be gleaned. However, a close inspection of the data still shows a sizeable number of artefacts which would need cleaning up. Some John has dealt with in the intervening few days, but turning any automatically extracted feature into something of the quality expected in OSM is another matter, and is not too dissimilar to the points I made some time ago about OpenMap Local.

For me the real advantage is that it's a major step in making it more feasible to use Lidar data to enrich OSM data. For instance data on roof orientations could be combined with the algorithms & crowd-sourced validation methods from OpenSolarMap. I hadn't realised until listening to John's talk just how valuable gable or eaves heights are in building datasets. It certainly persuaded me that they belong in OSM.

Another downside is that it takes a whiz like John to create this software and it makes use of a powerful machine, powerful algorithms, optimised hardware & proprietary storage. I have therefore spent a little time this week looking again at what is available in QGIS to do similar (but much less powerful) manipulation of the Lidar data.

Basic transformations of Lidar have been described elsewhere (for instance see Chris Hill's posts) so I won't dwell on them here. Suffice it to say I presume that the following have been created for a given area:
  • Combined Digital Surface Model (DSM). I usually do this as a virtual tile set (can be done directly in QGIS)
  • Combined Digital Terrain Model (DTM). As above.
  • Delta of the two. DSM-DTM. This gives things (buildings, cars, trees etc) which are elevated above ground level.
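The delta itself is just a cell-by-cell subtraction. Here is a minimal pure-Python sketch of the idea, assuming both rasters are already aligned on the same grid and held as lists of rows. The `NODATA` value of -9999 and the function name are my assumptions for illustration, so check the header of the real tiles.

```python
NODATA = -9999.0  # assumed nodata marker; check the header of the real tiles


def height_above_ground(dsm, dtm, nodata=NODATA):
    """Return DSM - DTM cell by cell, keeping nodata where either input lacks a value."""
    result = []
    for dsm_row, dtm_row in zip(dsm, dtm):
        row = []
        for surface, terrain in zip(dsm_row, dtm_row):
            if surface == nodata or terrain == nodata:
                row.append(nodata)
            else:
                row.append(surface - terrain)
        result.append(row)
    return result


# Tiny made-up tiles: a house roof, a low wall, a garage-height cell and a gap.
dsm = [[12.0, 20.5, 12.25], [12.0, 14.0, NODATA]]
dtm = [[12.0, 12.0, 12.0], [12.0, 12.0, 12.0]]
delta = height_above_ground(dsm, dtm)
```

In QGIS the same thing is a single expression in the Raster Calculator. The sketch is only meant to make clear what that expression does.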
To get somewhere near what John's approach involves ideally requires:
  • Filtering out shorter objects (mainly cars, garages & some street furniture)
  • Filtering out smaller objects (mainly trees)
  • Edge detection
  • Polygonisation
In practice I found it relatively easy to do the first & last, and did not find a simple way of doing the other two in QGIS (although in part that might be because I'm short of disk right now).

The first and last can be achieved easily:
  • Filtering by Height: this is merely another raster calculation using the QGIS Raster Calculator. In my test area (New Brighton on the Wirral, OS grid ref SJ3093) most houses are Edwardian and much higher than 3 metres, whereas garages are usually a touch over 2 metres. I therefore used 3 metres as a cut-off.
  • Polygonisation. I used the height filtered data directly with the Raster...Conversion...Polygonize option in QGIS. This is a much cruder and more naive method than I was hoping to use, but there it is.
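To make the two steps concrete, here is a rough pure-Python sketch: a 3 metre cut-off applied to the DSM-DTM values, followed by grouping of adjacent cells into regions. The grouping is a stand-in for polygonisation, not the QGIS implementation. It only collects 4-connected cells (an assumption on my part) and does not trace outlines. The function names and the sample grid are invented for illustration.

```python
from collections import deque


def filter_by_height(delta, cutoff=3.0):
    """Binary mask: True where the height above ground exceeds the cut-off."""
    return [[value > cutoff for value in row] for row in delta]


def connected_regions(mask):
    """Group 4-connected True cells into regions; returns a list of (row, col) lists."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    regions = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill from this unvisited cell.
            queue, cells = deque([(r, c)]), []
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                cells.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < rows and 0 <= nx < cols and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            regions.append(cells)
    return regions


# Made-up heights: a house (about 8 m), a garage (about 2 m) and one isolated 5 m cell.
delta = [
    [0.0, 8.2, 8.4, 0.0, 0.0],
    [0.0, 8.1, 8.3, 0.0, 2.1],
    [0.0, 0.0, 0.0, 0.0, 2.2],
    [5.0, 0.0, 0.0, 0.0, 0.0],
]
mask = filter_by_height(delta)
regions = connected_regions(mask)
sizes = sorted(len(cells) for cells in regions)
```

The garage drops out at the cut-off, while the isolated tall cell survives as a region of its own. The region sizes would be the obvious place to hang a minimum-area filter for small objects such as trees, which is the step not attempted here.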
I show the results of these steps below (in separate images to allow easier inspection & then combined).

Lidar Height data (DSM-DTM) filtered for >3m

Extracted & OSM Building polygons compared
(garages are deliberately excluded from OSM data)
Height data combined with Polygons


Firstly it's worth noticing a few features from the raw height data:
  • Most buildings are tall, usually in excess of 8 metres (and probably at least that height at the gables).
  • There are a limited number of lower height buildings. The most obvious ones are near the top of the image and include two small factory premises N of the railway & the platform canopy of the railway station. S of these the road bridge over the railway is obvious; and immediately to the SE there are apparently two largish buildings of low height, albeit with quite a bit of noise in the height profile. (These are, in fact, Victoria View, a development of flats which was halted for several years.) Further S still there are a small number of bungalows.
  • Terraces with a lower rear service area are obvious.
  • There are a significant number of linear features above 3 m in height. Most look to be walls, and indeed garden walls in the area tend to be high as most gardens are small & given building heights would tend to be overlooked.
  • Isolated trees are obvious in one or two back gardens
  • Larger groups of trees are equally obvious along the railway line (& elsewhere)
  • Swirly patterns in the 3-4 metre range occur in a number of places. These are mostly scrub (mainly gorse) or shrubberies.
  • There are still parked vehicles giving returns in the 3-4 metre height range. 

Edwardian Streets S of Mount Road New Brighton (Dovedale & Langdale Rds)

I include a couple of photos of streets in the area to help with context. I would recommend strongly Russ Oakes' work documenting suburban streets all over Merseyside for a much broader perspective.
Junction of Dudley & Hamilton Roads, New Brighton

Comparing the extracted polygons with OSM (and ignoring some OSM data which is missing) shows:
  • There is a fairly constant offset of OSM data (presumably inherited from the Bing imagery).
  • Building footprints are broadly comparable
  • Small gaps in terraces are resolved much better by tracing.
  • Some detail has not been added to some of the terraces in OSM which are still drawn as plain rectangles.
  • It's certainly possible to spot missing features & use Lidar data as an aid to add them in (notably Victoria View, although the W part of the development was started after the Lidar data was collected).
Now as for deriving data to enhance OSM, there's a fair bit more processing needed.

Absolute building height is relatively easy: one just needs to find the maximum height within a (location-corrected) OSM polygon. Generating the other more useful Simple 3D building (S3DB) tags is rather more involved, and certainly I have the impression that QGIS would be a fairly clunky way to do things. I really hope that some more technically-minded OSM folk can take inspiration from John's ideas and start thinking about tools to manipulate Lidar data specifically for OSM.
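As a sketch of the "maximum height within a polygon" idea, the following pure-Python example shifts an OSM outline by a fixed offset and then takes the largest DSM-DTM value among the raster cells whose centres fall inside it. Everything here is an assumption for illustration: the function names, the 1 m cell size, the raster origin and the offset are made up, and the outline is a simple polygon with no inner rings. Real data would of course be in a projected coordinate system such as the OS National Grid.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test; polygon is a list of (x, y) vertices without a repeated end point."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge straddles the horizontal line through y
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def max_height_in_polygon(delta, polygon, origin, cell_size,
                          offset=(0.0, 0.0), nodata=-9999.0):
    """Largest raster value among cells whose centres lie inside the shifted polygon.

    origin is the (x, y) of the raster's top-left corner; offset is the (dx, dy)
    correction added to the polygon. Returns None if no cell centre is inside.
    """
    shifted = [(x + offset[0], y + offset[1]) for x, y in polygon]
    x0, y0 = origin
    best = None
    for r, row in enumerate(delta):
        for c, value in enumerate(row):
            if value == nodata:
                continue
            cx = x0 + (c + 0.5) * cell_size
            cy = y0 - (r + 0.5) * cell_size  # rows run north to south
            if point_in_polygon(cx, cy, shifted) and (best is None or value > best):
                best = value
    return best


# Made-up 4 x 4 grid of heights above ground: one building and one tall tree.
delta = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 7.5, 9.1, 0.0],
    [0.0, 7.8, 8.6, 0.0],
    [0.0, 0.0, 0.0, 12.0],
]
# Hypothetical OSM outline, drawn 1 m too far west of the Lidar footprint.
osm_polygon = [(0.0, 1.0), (2.0, 1.0), (2.0, 3.0), (0.0, 3.0)]

uncorrected = max_height_in_polygon(delta, osm_polygon, (0.0, 4.0), 1.0)
corrected = max_height_in_polygon(delta, osm_polygon, (0.0, 4.0), 1.0, offset=(1.0, 0.0))
```

In this toy case the uncorrected outline misses the highest part of the roof, which is why the location correction matters. A plain maximum will also pick up anything taller that overhangs the outline, such as a tree, so some care would still be needed with real data.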

There is no doubt that the Environment Agency Lidar data was one of the most significant open data releases last year. Furthermore it is likely that other agencies & local government bodies will make Lidar available more widely in the near future. For instance I believe much data from Kanton Zurich is open, including Lidar. This example shows the extensive slumping caused by peri- and post-glacial phenomena in the woods near Bergietikon: so this is a reminder that it's not just buildings which are of interest.

One last thing to note is that there's lots one can do with this data immediately (the subject of John's original article). Working out how to add this data to OSM begins to look not dissimilar to creating authoritative datasets. It's of course worth spending time working out how to do this because once in OSM the data is potentially available for a multitude of purposes.

Wednesday, 4 May 2016

Where have all the woods gone from Google Maps?

Very recently there was a nice post by Justin O'Beirne about the cumulative effect of changes to the cartography of Google Maps. Richard Fairhurst summarised his views on Twitter:



This is just my (very) minor contribution to the discussion.

The Botanical Society of Britain & Ireland (BSBI) uses Google Maps as the background to their maps of plant distributions. Over the past couple of weeks I've been using it a lot because I've been interested in two things:
  • Where I might find particular plants relatively close to where I live;
  • Which plants I see might be of interest to the county recorders.
As at this time of year many of the botanical highlights are to be found in ancient woodlands it's damn useful to see where the woods are when assessing the BSBI records. That's why I noticed woods disappearing from the Google cartography as one zooms in.

These screenshots show successive zooms of an area in central Nottinghamshire which includes Clumber Park and two old woods, Gamston & Eaton Woods. The latter two are centre right above the village of Askham.



All woodland just disappears between these two zoom levels.

Here's the active map so one can play with zooming in & out.



Losing woods at high zoom levels is another example of loss of functionality. In practice it makes the maps layer useless for interpreting botanical data: I have to resort to using the satellite layer. Even that is not always easy because sometimes fields also appear dark green.

Google does use a couple of other green shades for things like parks, golf courses, and possibly nature reserves (see Sherwood Forest NNR near Edwinstowe). I don't know if these come on and off in a similarly arbitrary pattern.