Thursday, 26 December 2013

Assigning addresses from Land Registry Prices Paid data

After the disappointment of the Land Registry INSPIRE land parcels, it is nice to report a large and useful open dataset from the same source: the Prices Paid data (LRPP). These are the actual prices paid for houses and flats in England and Wales from late-1995 or thereabouts to the present.

New Residential Roads in England and Wales
Roads were identified from Land Registry Prices Paid data
and matched by name to OSM highways within 2km of postcode centroid
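
The caption above describes matching each road named in the Prices Paid data against OSM highways within 2 km of the postcode centroid. What follows is only a minimal sketch of that idea, not the method actually used: it assumes each OSM highway is reduced to a single representative point, that names match exactly once case and whitespace are normalised, and the function names (`haversine_m`, `is_road_mapped`) are hypothetical.

```python
import math


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two WGS84 points."""
    radius_m = 6371000.0  # mean Earth radius
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * radius_m * math.asin(math.sqrt(a))


def normalise_name(name):
    """Upper-case and collapse whitespace so 'Elm  close' equals 'ELM CLOSE'."""
    return " ".join(name.upper().split())


def is_road_mapped(road_name, centroid, osm_highways, max_dist_m=2000.0):
    """Return True if a highway with the same name lies within max_dist_m.

    centroid is (lat, lon); osm_highways is an iterable of (name, lat, lon),
    one representative point per highway (an assumption of this sketch).
    """
    target = normalise_name(road_name)
    lat, lon = centroid
    return any(
        normalise_name(name) == target
        and haversine_m(lat, lon, hlat, hlon) <= max_dist_m
        for name, hlat, hlon in osm_highways
    )
```

Roads for which `is_road_mapped` returns False would be the candidates for new residential roads missing from OSM.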

Monday, 23 December 2013

Mapping off-piste ski routes

The tragic death of a 16-year-old boy in Lech on Wednesday (18th December 2013) prompted me to look at aerial photos and various maps of the Arlberg area. As is usual with news reports of such accidents, the information on the precise locality is fairly sketchy. This is always annoying to anyone who knows an area, and particularly if one goes off the marked pistes, because detailed knowledge of the route and conditions is essential to understanding the news. (Compare this with the volume of knowledgeable and detailed commentary on cycling deaths in London.)

The boy was skiing with his father and his twin brother and a local ski guide. They started an avalanche which carried one of the boys, his father and the guide down the mountain around 150 metres. Fortunately the local news (understandably) provided a bit more detail: "Bereich Madloch-Fauler Stock Richtung Stierlochbach" and a photo of the avalanche.

The area NW of Madlochjoch on OpenSnowMap

The whole Arlberg area is well-known for a plethora of named off-piste ski routes, to quote "... every variant, however rarely skiable has a name...", so there was a good chance that this accident happened on a well-known run: particularly as it was so early in the season. As the avalanche debris fan was next to some Pinus mugo scrub the number of locations could be reduced. I own two books on Arlberg off-piste routes, but neither was to hand. I did find this fine photo with the main itineraries marked:

Overview of off piste routes accessible from Madlochjoch
(Author: Roland Wasmayer, Source: via OutdoorActive)
Details of the accident will be released by the Vorarlberg Avalanche service. These short accounts usually have an aerial photo giving an overview of the accident site, such as this one, which was very widely reported because it left a Dutch Prince in a coma from which he never recovered.

I mapped a little bit of the scrub on OpenStreetMap and also found this image of the entry to the Stierfall route. I suspect the accident happened in either Heuwet or Heuwetrinne: both are skied so frequently in mid-season that they become heavily mogulled. None of these itineraries are mapped on OpenStreetMap, as is true for other itineraries throughout the Arlberg ski area.

Strengen 053
Entering the Stierfall, photo Stegaku
In April 1992 I was lucky enough to have great weather in St Anton after an exceptional snowfall in March. Conditions meant that there was wonderful spring snow for a couple of hours in the morning, but everywhere routes which are rarely skiable were fully covered. It was astonishing to see ski tracks in the most unexpected places: particularly on the W side of the Arlberg pass between St Christoph and Stuben. Many of these routes were skied long before mechanical uplift existed: nowadays the position of ski lifts determines which are favoured.

One of the odd things about this is to discover long vanished ski lifts. A post on Edward Tufte's forum alerted me to a website featuring Heinrich Berann's ski maps and panoramas. (Berann may be familiar to others as the illustrator / cartographer of Ocean Floor maps). Berann's style is very recognisable. 

Ski map of St Anton-am-Arlberg by Prof H.C. Berann

What is particularly interesting is that he made ski 'maps' for the Arlberg area from 1937 through to the late 1970s, and therefore it is possible to trace some of the evolution of the area's infrastructure. Most noticeable is that until around 1980 there was a lift on the W side of the Arlberg road at St Christoph. Also noticeable is the curious way the grading of ski runs has changed: particularly true of runs now classed as freeride. Runs which were once just graded as black turned into ski itineraries, and some have just disappeared.

Of course someone has thought of mapping freeride territory: Freeridemaps of Altstätten, Switzerland. The sample image on their website is the Arlberg area around St Anton-am-Arlberg. In the main the map shows the appropriateness of areas for different ranges of skill and experience: but as far as I can tell this is largely based on objective factors, such as safety of run-outs, and likely consequences of a fall. Above all it is based on the steepness of the land.

The Swiss and Austrian avalanche services also produce map overlays based on steepness to show avalanche risk: in the Swiss case this data is available by subscription in the White Risk app for Android and the iPhone.
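
A steepness overlay of this kind can be derived from a gridded elevation model. The sketch below is a rough illustration and not the method used by Freeridemaps or the avalanche services: it estimates slope with central differences on a regular grid, and the class boundaries of 30, 35 and 40 degrees are illustrative assumptions, as are the function names.

```python
import bisect
import math


def slope_degrees(dem, cell_size_m):
    """Slope in degrees for the interior cells of an elevation grid.

    dem is a list of rows of elevations in metres on a regular grid with
    square cells of cell_size_m. Edge cells are left as None because a
    central difference needs a neighbour on both sides.
    """
    rows, cols = len(dem), len(dem[0])
    out = [[None] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            dz_dx = (dem[r][c + 1] - dem[r][c - 1]) / (2 * cell_size_m)
            dz_dy = (dem[r + 1][c] - dem[r - 1][c]) / (2 * cell_size_m)
            out[r][c] = math.degrees(math.atan(math.hypot(dz_dx, dz_dy)))
    return out


def steepness_class(slope_deg, bounds=(30.0, 35.0, 40.0)):
    """Index of the steepness band; the default bounds are illustrative only."""
    return bisect.bisect_right(bounds, slope_deg)
```

Each class index could then be rendered as a coloured overlay. Aspect, altitude and snowpack history, which matter just as much for avalanche risk, are ignored here.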

Mapping avalanche risk on its own is a challenge. There are numerous different factors to represent: altitude, age and depth of snow pack, recent changes to snow pack, aspect, underlying terrain (steep grass slopes often avalanche). In writing this post I came across some interesting papers on the subject of the cartography of avalanche maps.

Profile of Snowpack derived using DAVOS ELF SNOWPACK program
One of the more intriguing ideas is how the history of the snowpack at a point can be displayed (see above). Quite how such data might be transformed to convey information on a map is beyond me.

Despite the great advances made in the past year with OpenSnowMap there is considerably more which we could do to provide information relevant to freeride and back-country skiing, and ski mountaineering.
  • Grading. In my view the current use of a single category of freeride downhill routes is misleading. Many off-piste guides such as those for the major French resorts published by Editions Vamos, or the similar Polvere Rosa covering the south side of the Monte Rosa massif, make use of grading schemes (4 classes of technical difficulty in the former, 7 in the latter which includes some extreme itineraries; and 4 classes of objective dangers). I am hesitant to mark freeride areas which I know because of the wide range of difficulties and objective dangers likely to be encountered.
  • Routes. To a certain extent showing routes spoils some of the key aspects of being away from the piste. Picking one's own line, both in the light of one's ability and safety of surroundings, is part and parcel of developing the skills and awareness necessary. In many places (Albona at Stuben, most of Les Grands Montets below 2800m, Pavi in A-basin) all that is needed is to show that the area is free-ride territory. In others entry lines and exit routes may not be immediately obvious and are worth showing (it is not uncommon for these to be the least pleasant part of a route), particularly if there are cliffs or other hazards lower down (e.g., above Le Fornet, Val d'Isere).

    Seracs on Glacier des Rognons, Les Grands Montets by LensScaper, on Flickr
    IGN Geoportail shows what's underneath the snow.

  • Objective Hazards. If more freeride terrain is to be added to OpenStreetMap there needs to be much more mapping of the obvious objective hazards: cliffs as mentioned above are the prime example. Others might include: streams (which may not be frozen enough to bear loads), known crevasse and serac fields (mapping individual crevasses can wait a while), 'bullet-proof' ice on glaciers, and notorious avalanche gullies.
  • Safety. Obviously marking ski routes creates some concerns over people placing too much reliance on such information. In reality people undertaking such routes require a minimum level of skill, equipment, knowledge and experience.
Freeride and backcountry skiers need a sizeable amount of cartographic information: some is persistent, much of it is transient. The ability to convey this data in suitable maps is a significant and interesting challenge. The flexibility of OSM data and technologies suggests to me that we can play a role. However, we need much more base data: in particular GPS traces, and good perspective photos of routes with suitable open licenses.
Note: This post contains a number of images which are copyright by their respective authors, and are not licensed under Creative Commons. Normally I try to use only CC images, but in this case suitable images were not available. Please respect the rights of the image owners.

Monday, 9 December 2013

building=collapsed : history of an OpenStreetMap tag

The collapsed Maxima XX Supermarket in Riga
Source: Valenciano via Wikimedia Commons. CC-BY-SA
Today I noticed a local news item about a pub, The Bridge, which caught fire on Saturday. Apparently it is so badly damaged that it will need to be demolished. Naturally I checked OpenStreetMap, but the changes had already been mapped. This is just another example.

There are a number of other recent instances where the badly damaged buildings' changed circumstances have been mapped very promptly on OpenStreetMap. Notable examples are: The Clutha Vaults, hit by a helicopter in Glasgow, and the collapse of the roof of the Maxima XX supermarket in Riga.

The normal changes are to either remove any amenity tags (usually by pre-pending them with old_, which facilitates finding the location), mark the surrounding area with landuse=demolition, or landuse=construction, and to change any building=* to building=collapsed. The process now occurs as swiftly as the creation of the associated article on Wikipedia: naturally there is substantial synergy between the two processes.
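
The tag changes described above can be sketched as a small transformation on a dictionary of tags. This is an illustration only: it assumes the old_ prefix is applied to the tag key (giving, say, old_amenity=pub), which is one reading of the convention, and the function name `mark_collapsed` is hypothetical.

```python
def mark_collapsed(tags, retired_keys=("amenity",)):
    """Return a copy of an OSM tag dict updated for a collapsed building.

    Keys listed in retired_keys are renamed with an 'old_' prefix so the
    former use can still be found; any building=* value becomes 'collapsed'.
    All other tags are copied unchanged.
    """
    updated = {}
    for key, value in tags.items():
        if key in retired_keys:
            updated["old_" + key] = value
        elif key == "building":
            updated[key] = "collapsed"
        else:
            updated[key] = value
    return updated
```

For example, a pub tagged building=yes, amenity=pub and name=The Bridge would come out as building=collapsed, old_amenity=pub and name=The Bridge. Marking the surrounding area as landuse=demolition or landuse=construction is a separate edit on a different object and is not covered here.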

However, the building=collapsed tag did not evolve in order to better illustrate news events.

Presidential Palace, Port-au-Prince after 2010 earthquake.
Source: Wikimedia Commons CC-BY-SA.
It's possible that it was in use earlier (an exercise in trawling OSM's history files), but my memory is that it really entered OSM within an hour or so of availability of the first post-'quake imagery of Port-au-Prince in January 2010. We had a discussion on IRC (no doubt someone will have squirrelled away archives of IRC logs from the time) about the best way to tag this. I was for something slightly more elaborate: largely to separate the type of building from its intactness, but the simplicity of building=collapsed won over any longer term semantic issues: and this is the tag which is used. Currently it has 53,622 instances on OSM.

Central Port-au-Prince showing collapsed and damaged buildings, 17 January 2010
The Presidential Palace is top right.
Source: Haiti Earthquake Damage Map

When we tagged collapsed buildings and debris on roads forming barriers I don't think that we had any real expectation that this data would be used: the first responders had no real idea about the likely accuracy of such data. In practice standard mapping of Haiti was OSM's most useful contribution at the time.
However, over time, OpenStreetMap's capabilities and credibility have increased. One factor is that after a major disaster OSM can deploy a significant number of mappers who already have experience of interpreting aerial imagery after catastrophes. Another is that, as in the Philippines recently, first responders such as the Red Cross are seeking to use OSM directly.

Most of these building=collapsed instances have been mapped very recently as part of the concerted effort to provide high quality maps for humanitarian relief efforts after Typhoon Haiyan (Yolanda) in the Philippines.

Collapsing house, Babbacombe
  © Copyright Derek Harper and licensed for reuse under this Creative Commons Licence.
Can we learn anything from how usage of this tag has evolved?

For me these are the significant points:
  1. Simplicity prevailed over exactitude in the tagging. We still don't cleanly separate building type and status. This is not just because the simple tag is much easier to remember and apply: reusing an existing tag often means that downstream data consumers require little change to meet the particular humanitarian situation. An additional tag for building status would have required a whole host of changes to allow simple maps to be rendered. Using building=collapsed meant that one rule needed to be added and maps (see image above) were being produced within hours of the tag going into use.
  2. Tags for major disasters often reflect things which happen anyway, but which in the normal course of things are unusual. Buildings collapse from time to time: from storm damage, after fires, from structural defects and so on. It's just that for those of us living in North America and Europe these are relatively rare events.

    The advantage of a worldwide database is that, however rare certain events are locally, they will be happening somewhere, and if there is a mapper there they will want to know how to tag it (although in this case the tag was created by armchair mappers). I'm sure that there are plenty of tags already in use for particular situations somewhere in the world which would fix tagging problems in my local patch.

  3. Tags persist. Once a tag enters widespread usage it is very hard to kill. The OSM wiki is full of deprecated tags, XXX 2.0 schemas etc. However well-intentioned, these ideas just don't match how tagging works in practice. Despite 3 years of advocacy of detailed tagging guidelines for public transport we still have plenty of highway=bus_stop tags. (More on this type of problem anon).
In conclusion, a tag developed to meet the particular circumstances of the 2010 Haiti Earthquake has been found to be useful outside its original scope, to the extent that it has now entered the tagging vocabulary of the local patch mapper. Mapping disasters isn't just a good way to try and help other people who are suffering: it also hones mapping skills which can be applied not only in other disasters but in everyday mapping.

Sunday, 8 December 2013

Food Hygiene Open Data : an easy way into mapping addresses and postcodes

I've written a bit about the Food Standards Agency's Food Hygiene Open Data (henceforward FHRS) before, but as it was the focus of my hacking at last week's London Hack Weekend, it's about time it had a dedicated post.

FHRS premises spread out along a line normal to shortest line to associated street (See article for details)
Data (c) from OSM contributors 2013, and Food Hygiene Rating Scheme (OGL)
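
Because every FHRS record in a postcode shares the same centroid location, the premises have to be fanned out to be seen individually. The geometry in the caption above could be sketched as follows. This is a guess at the construction from the caption alone, not the code used for the image: it assumes planar coordinates in metres (for instance eastings and northings), a known nearest point on the street, and an arbitrary 10 m spacing; the function name is hypothetical.

```python
import math


def spread_along_normal(centroid, street_point, count, spacing_m=10.0):
    """Place count points on a line through centroid, perpendicular to the
    shortest line from centroid to the street (so roughly parallel to it).

    centroid and street_point are (x, y) pairs in metres; street_point is
    the nearest point on the associated street. Points are centred on the
    centroid and spacing_m apart.
    """
    cx, cy = centroid
    sx, sy = street_point
    dx, dy = sx - cx, sy - cy
    length = math.hypot(dx, dy)
    if length == 0:
        raise ValueError("centroid lies on the street; the normal is undefined")
    # Unit vector perpendicular to the centroid-to-street direction.
    nx, ny = -dy / length, dx / length
    offsets = [(i - (count - 1) / 2) * spacing_m for i in range(count)]
    return [(cx + off * nx, cy + off * ny) for off in offsets]
```

With a street due south of the centroid, three premises come out in a west-to-east row centred on the centroid.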

To recap, the salient points about FHRS data are:
  • It's pretty comprehensive, covering most local authorities in the UK.
  • It represents more than 15% of all postcodes.
  • Most records come with full address data and a location (the postcode centroid).
  • Premises (shops, pubs, cafes, etc.) are grouped into categories which map fairly easily to OSM tags.
  • Daily updates. High data currency (last reviewed date on all records).
  • It's Open Data under the plain vanilla Open Government Licence.
For me the obvious uses of this data with OpenStreetMap are:

Wednesday, 4 December 2013

British Postcodes on OpenStreetMap

British contributors to OpenStreetMap are known for an apparently unhealthy obsession with postcodes.
London Postcode Sectors with more than 5% of regular postcodes mapped on OpenStreetMap.
See image for copyright notices. All completeness percentages are shown below.
We are not alone in Britain: there are many groups who need access to postcode data. This is because postcodes in the United Kingdom provide an excellent proxy for geolocation, and as such are widely used: in SatNavs; for geodemographics (such as Acorn, Mosaic, and the (open) ONS Output Area Classification); in a host of analytic applications, as well as their more prosaic role in delivering the post. In 2010 postcode centroids were made available through the Ordnance Survey Open Data scheme, under the brand CodePoint Open. Subsequently it was found that the license associated with this data prevented it being used directly in OSM. More recently the Office for National Statistics have released an (identical) data set which is not encumbered by the license of CodePoint Open.
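
The completeness figures mentioned in the caption above are percentages of postcodes mapped per postcode sector. A minimal sketch of such a calculation follows. It assumes a sector is the postcode with its last two letters removed (so NG8 2AE falls in sector "NG8 2"), and it does not attempt to distinguish "regular" postcodes from others, which the excerpt does not define; the function names are hypothetical.

```python
from collections import Counter


def normalise_postcode(postcode):
    """Upper-case a postcode and strip all whitespace: 'ng8 2ae' -> 'NG82AE'."""
    return "".join(postcode.upper().split())


def postcode_sector(postcode):
    """Return the sector: outward code plus the first character of the inward code."""
    compact = normalise_postcode(postcode)
    if len(compact) < 5:
        raise ValueError("too short to be a full postcode: %r" % postcode)
    return compact[:-3] + " " + compact[-3]


def sector_completeness(all_postcodes, mapped_postcodes):
    """Percentage of known postcodes in each sector that appear in OSM.

    all_postcodes is the reference list (e.g. from an open centroid file);
    mapped_postcodes are those found on OSM objects. Mapped postcodes that
    are not in the reference list are ignored.
    """
    known = {normalise_postcode(p) for p in all_postcodes}
    mapped = {normalise_postcode(p) for p in mapped_postcodes} & known
    totals = Counter(postcode_sector(p) for p in known)
    hits = Counter(postcode_sector(p) for p in mapped)
    return {sector: 100.0 * hits[sector] / total for sector, total in totals.items()}
```

Sectors above a chosen threshold, such as the 5% used for the London map, can then be picked out of the resulting dictionary.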

Monday, 2 December 2013

Harry we missed you! London OSM Hack Weekend

Thura (half out of picture, sorry), Dan, Matt and Shaun do a "Harry"
I attended my first ever OSM Hack event over the weekend. I knew I needed to be down in South East England around the end of November so I managed to schedule my plans to include the hack event at AOL/MapQuest London offices. My own contribution was slight because I was still sorting out moving data from my old laptop to my new one, and didn't really achieve this until Sunday morning.

Andrew, Thura, Dan, Matt, Shaun, Grant, Serge, Mick, me (standing), Jon
(photo Simon, Cyclestreets)
The rest of this post is merely a collection of quick bullet points about things which happened, were talked about and so on over the two days.
  • The major thing that was done on the Saturday was updating the main OSM website: most OSM users should have noticed these changes by now. Not everyone is happy (although this is a given for any change on the OSM site), and following SomeoneElse's comments about the browse pages I have to agree with him that giving the map more space reduces the usefulness of these pages. However, I am equally sure that most glitches will get ironed out quite quickly.

  • Dan worked on getting offline maps running on Firefox phones. (Several of us were interested in Dan's digitizing tablet too).

  • Dan, and perhaps others, did some work on the HOT OSM Tasking Manager.

  • Frederik brought a case of Club Mate along. To my surprise I found it quite palatable.

  • Half of OSM London are getting married next year. Congratulations to Shaun and Andrea, Harry and Francine, & Derick and Morag.

  • Simon, Shaun and I had a long chat about whether OSM can be used for sensible multi-modal transport planning.

  • Robert was working on That Shouldn't be Possible, his nifty tool for finding bad map data by comparing OSM data with real GPS traces of routes.

  • Serge, Robert, Grant and I had a passionate but, I hope, good-natured debate about the FDA requirement that 23andMe's DNA sequence kits need to be licensed. I plan to blog about genetics, open data, big data and related topics real soon now: it was an opportunity to rehearse some of my arguments.

  • Kat Arney - embryologist, science communicator, journalist, harpist, and knitter - was around on Saturday morning. She interviewed most of us: so one or two of the more coherent statements may appear on a BBC Radio programme in the New Year. Her parting shot when the pizza arrived was that we lived up to all the stereotypes of hackers! (Male, Club Mate drinking, pizza eaters.)

Food Hygiene postcodes buffered to 100m matched to OSM named highways.
(Note NG8 2AE which does not find any highways).
(Image contains copyright Ordnance Survey, Royal Mail and OpenStreetMap open data)
  • MickO and I made a start on working with Food Hygiene Open Data. Mick wrote a parser for the XML and then generated a shape file from a subset of the data. I (eventually) managed to match FHRS postcodes to CodePoint Open postcodes and then find all OSM streets within 100 metres. Mick tried inputting the FHRS data into Nominatim, but for some reason Nominatim won't match the full addresses from FHRS. (I regard this dataset as an excellent exemplar for trying out a host of issues with fuzzy matching of OSM data to external data and conflation).

  • A Wikipedian, whose name I didn't catch, visited late on Sunday afternoon. Like Kat, who has written a chapter about my former boss in a forthcoming book, he has been involved in improving the coverage of the work of female scientists in public media. (Wikipedia have an initiative and a lot of work was done recently on Ada Lovelace Day). He wanted advice on what information he could collect and in particular about geotagging photos.
MapQuest (Matt) looked after us well, and we had a couple of enjoyable (and not particularly boozy) sessions in pubs in the evening. All-in-all a very interesting and diverse couple of days. Look forward to doing it again late January, early February.
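
The postcode-to-street step mentioned in the list above (find all named OSM highways within 100 metres of a postcode centroid) can be sketched in plain Python. This is not the code written at the weekend. It assumes centroids and highway vertices are already in a projected coordinate system measured in metres, such as British National Grid eastings and northings, and the function names are hypothetical.

```python
import math


def point_segment_distance(p, a, b):
    """Shortest distance from point p to the line segment a-b (planar, metres)."""
    px, py = p
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        return math.hypot(px - ax, py - ay)  # degenerate segment
    # Project p onto the segment and clamp to its end points.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def streets_near(centroid, highways, radius_m=100.0):
    """Names of highways with any segment within radius_m of centroid.

    highways is an iterable of (name, vertices), where vertices is a list
    of (easting, northing) pairs. Returns a sorted list of distinct names.
    """
    found = set()
    for name, vertices in highways:
        for a, b in zip(vertices, vertices[1:]):
            if point_segment_distance(centroid, a, b) <= radius_m:
                found.add(name)
                break
    return sorted(found)
```

An empty result corresponds to a postcode like NG8 2AE in the image, where no named highway falls inside the buffer. On real data a spatial index, or a database such as PostGIS, would avoid testing every highway against every postcode.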