Thursday, 18 September 2014

OpenStreetMap at the UK Open Addresses Symposium

I attended the Open Addresses Symposium organised by Jeni Tennison of the Open Data Institute last month. This brought together a host of people and organisations interested in having an open alternative to the Postcode Address File (PAF).

Somewhat foolishly I'd suggested to Harry Wood that I might speak about addresses on OpenStreetMap.

Addresses mapped on OpenStreetMap in Britain
Density of address mapping in (southern) Britain on OpenStreetMap by local authority
(Northern Scotland is not shown because there is little data; full map)
See text on map for full explanation.
I was glad to see that my talk was relatively late on in the day: the audience were unfamiliar and many of them came from large organisations, so I appreciated the chance to get an impression of them.

Later on I began to think this was a bit of a poisoned chalice. I was scheduled immediately after Bob Barr. Anyone who has heard Bob speak (sadly his great talk at SotM-13 was not successfully videoed) knows that he's a very accomplished and passionate speaker. A hard act to follow! Making sure that I managed to keep the momentum up after Bob's talk meant that I didn't pay a huge amount of attention to the talks immediately preceding mine.

You can check out my slides below or download them from Slideshare.

The summary message was:

In particular I wanted to make sure that the audience understood that a lot of OpenStreetMap data is created by a small number of people, often people who have become highly skilled at collecting the data they do. To that end I created the map at the head of this post.

This is mainly intended to make a polemic point. Not every significant contributor to the bigger bubbles is named (at least in part because I don't know them all). This should be apparent from the names I chose to represent London (Tom, Derick and Harry). The key point is that it's not much of a crowd when I can more or less name the contributor directly.

The same phenomenon occurs with other 'crowd-sourced' data sets. If I look at a map of records for any species of fly (Diptera) in Britain there will always be a nice concentration of records round Sheffield, which will mainly be contributed by Derek Whiteley (who does mammals too). The Welsh Borders have an amazing number of rarely recorded microfungi: Bruce Ing lives in the vicinity. Many of these 'crowd-sourced' data sets do not really display what they purport to: they are most usually maps showing where the enthusiasts live. I'm afraid that addresses are the same (at least in the UK) on OpenStreetMap, and using 'crowd-sourced' as a description of the process is rather misleading.

There were lots of other good things at the symposium, not least Jeni Tennison's clever wrap-ups of each session which suggested an almost clairvoyant anticipation of what speakers were going to say. The two messages which I thought most important came in the morning:
  • Addresses as objects, not as attributes. This came in Morten Lind's talk about the Danish address register (see also his slides from SotM-FR this year). I actually came across this idea long ago in Data Warehouse schema design (for instance in a very old version of IBM's Insurance Industry Architecture). Insurance has good reasons to treat addresses separately, because addresses are associated with policies, policy holders, risks, claims, and so on. Once addresses are treated as objects, not attributes, the meme about 80% of organisational information being geo-related can be discarded.
  • Open Addresses cannot just be a PAF replacement. I was pleased to find that lots of people were in agreement with me that the Royal Mail's view of what constitutes an address is far too limited for many use-cases. In the first instance a PAF-lookalike might be the aim, but it's clear that more is needed if one wants to respond to the sense of place which ordinary people use on a daily basis. (We can be sure that the folk of Kinlochbervie don't think they live "by Lairg".)
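
To make the first point a little more concrete, here is a minimal sketch of what "addresses as objects" might look like in code. It is purely illustrative: the class names, fields and sample values are all hypothetical, and it is not taken from the Danish register or from IBM's model. The idea is simply that an address lives once in a register with its own identifier, and policy holders, claims and so on point at it rather than each carrying their own copy.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Address:
    """An address as a first-class object with its own identity (hypothetical fields)."""
    address_id: int
    lines: List[str]
    postcode: str


@dataclass
class PolicyHolder:
    """Refers to an address by identifier instead of storing address text."""
    name: str
    address_id: int


@dataclass
class Claim:
    """A second kind of record that can share the same address object."""
    reference: str
    address_id: int


def address_of(record, register: Dict[int, Address]) -> Address:
    """Look up the address object that a record points at."""
    return register[record.address_id]


# A tiny in-memory "address register" with made-up data.
register = {1: Address(1, ["1 Example Street", "Exampletown"], "ZZ99 9ZZ")}
holder = PolicyHolder("A. N. Other", address_id=1)
claim = Claim("CLM-001", address_id=1)

# Correcting the address once is visible from every record that refers to it.
register[1].postcode = "ZZ99 1AA"
print(address_of(holder, register).postcode)
print(address_of(claim, register).postcode)
```

With the attribute approach each record would hold its own address string, and a correction would have to be repeated (and could be missed) in every place the address was copied.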
In the pub I also discovered I had a lot to learn about the British Standard for addressing.

Initiatives elsewhere in the OSM community, notably BANO and, already suggest the way forward for OSM and open addresses. We need to collate and make available any open address data outside of the OSM database per se: not least because ODbL may be too restrictive a license for many use-cases, but also because a lot of the data will not be really suitable for direct incorporation into OSM.

In conclusion: I see OpenStreetMap as a likely heavy consumer of Open Address databases rather than a major contributor to them; I see the OpenStreetMap community as a significant facilitator in the creation, maintenance and management of such open data.
