Thursday, 28 March 2013

Lamp Posts to Addresses: gold dust in the attributes

Local Government Open Data 4

I'm so used to a lot of open data being poorly attributed that I did not look at the attributes of the Nottingham street light data until I'd imported it into PostGIS. Apart from lacking the asset identifier, and having redundant fields for the x and y positions of the geometry, it has two fields of fantastic value for enhancing OpenStreetMap.

Lenton Sands addresses
Lenton Sands area of Nottingham showing House Numbers from Lamp Post data overlaid on OSM

These are called LOCATION and SITE_NAME, so I didn't expect much. However, they most frequently contain, respectively, the house number of the property closest to the lamp post and the street name. In effect this represents about 25,000 individual house addresses for the city (perhaps a sixth of the total). The image above shows the LOCATION field overlaid on the OSM Mapnik layer for streets where I have already done some partial address mapping, often inferring additional addresses from those I have surveyed by counting the number of houses in Bing imagery.

A quick scan of the data shows that a small number of abbreviations are used in front of the house number: O/s (outside), Opp (opposite), S/o (side of), Adj. (adjacent), R/o (rear of), B/w (between), Jn, Jct or Jcn (junction), P/h (public house), F/p (footpath) and W/w (walkway). Lamp posts which are not close to a numbered or named house are often described as "nth post from Blabla Road".
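As a rough illustration of how those prefixes could be separated from the house number, here is a minimal Python sketch. The function name `parse_location` and the sample strings are my own inventions for the example, not values taken from the dataset, and the real field will certainly contain variants that this does not handle (for instance, a "B/w" entry naming two houses only yields the first number here).

```python
import re

# Prefix abbreviations listed above, mapped to their meanings.
PREFIXES = {
    "o/s": "outside",
    "opp": "opposite",
    "s/o": "side of",
    "adj": "adjacent",
    "r/o": "rear of",
    "b/w": "between",
    "jn": "junction",
    "jct": "junction",
    "jcn": "junction",
    "p/h": "public house",
    "f/p": "footpath",
    "w/w": "walkway",
}

# Optional prefix (either "X/y" or a plain word, with an optional full stop),
# followed by a house number with an optional letter suffix such as "7a".
LOCATION_RE = re.compile(
    r"^\s*(?:(?P<prefix>[A-Za-z]/[A-Za-z]|[A-Za-z]+)\.?\s+)?"
    r"(?P<number>\d+[A-Za-z]?)\b"
)


def parse_location(location):
    """Return (relation, house_number), or None if no house number is found.

    relation is None when the value is a bare house number.
    """
    match = LOCATION_RE.match(location)
    if match is None:
        return None
    prefix = match.group("prefix")
    if prefix is None:
        return None, match.group("number")
    relation = PREFIXES.get(prefix.lower())
    if relation is None:
        # Unknown leading word: safer to leave this one for manual review.
        return None
    return relation, match.group("number")


print(parse_location("O/s 12"))    # ('outside', '12')
print(parse_location("Adj. 15"))   # ('adjacent', '15')
print(parse_location("3rd post from Blabla Road"))  # None
```

Values that return None (descriptions such as "nth post from ..." or free-text notes) would simply be set aside rather than guessed at.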

At least one use of this field highlights an inadequacy in the design of the system used to maintain the data:
"Jnt Homebase  *do Not Raise Additional No Light Works For This Column.  Requires New Swa Service". 
This is typical of data entry systems which fail to allow capture of important information: instead it is placed in the most prominent field which accepts free text (my own experience suggests that this is/was common with customer address data).

I did a quick experiment last night with some houses which had already been mapped from imagery. I had to split the building polygons to identify individual houses (most were semi-detached of the late Victorian or Edwardian era). The image below shows the data from street lights overlaid on the addresses that I mapped:

Addresses on Austen Avenue, Nottingham NG7 6PE, mapped from Open Data

From 5 data points I was able to map 28 addresses. I was also able to add the postcode (NG7 6PE) from Chillly's ONS Postcode slippy map. Note that I was unable to assign addresses to the first houses on the south side of the road. 13 is a house number which is often skipped, so I stopped at 15. The remaining numbers could be any one of several combinations: (1,3,5,7; 7,9,11,13; 5,7,9,13; 3,5,7,9 ...) and therefore cannot be assigned without survey. If this holds generally then we might be able to identify as many as 80% of Nottingham house numbers from this data alone!
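The inference I did by hand can be sketched in a few lines of Python. This is only an illustration under assumptions of my own: houses on one side of the street are listed in order, numbers normally go up in steps of 2, and a gap is only filled when the count of houses between two known numbers fits that step exactly. The function name `fill_house_numbers` is hypothetical.

```python
def fill_house_numbers(known, count, step=2):
    """Infer house numbers between known anchors on one side of a street.

    known: dict mapping position along the street (0-based) to a house
           number taken from the lamp post data.
    count: total number of houses on that side of the street.
    Returns a list of length count; None marks houses that still need survey.
    """
    numbers = [known.get(i) for i in range(count)]
    anchors = sorted(known)
    for left, right in zip(anchors, anchors[1:]):
        gap = right - left
        # Only fill when the numbers fit exactly; a skipped 13, say, breaks
        # the arithmetic and the gap is left for a survey.
        if known[right] - known[left] == gap * step:
            for offset in range(1, gap):
                numbers[left + offset] = known[left] + offset * step
    return numbers


# Eight houses, three of them identified from nearby lamp posts.
print(fill_house_numbers({2: 15, 5: 21, 7: 25}, count=8))
# [None, None, 15, 17, 19, 21, 23, 25]
```

Houses before the first known number and after the last are deliberately left as None, for the same reason I stopped at 15 above.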

I have a lot more to say about addresses and open data to come on this blog, but this data really opens up the potential. I still have to think how to proceed. I suspect that using it to its full advantage, and making the most of local mappers' knowledge, requires some kind of crowd-sourced effort. I feel it has the makings of an on-line mapping party!

1 comment:

  1. Thanks for drawing my attention to the Nottingham street light data. I was looking at it last night. I agree it has several potential uses. My initial thoughts were that it could be used to find more missing streets - in particular those along footpaths that are missing from the OS Locator data. The 'SITE_NAME' field seems to generally contain the name of the nearest street, so I was thinking of comparing with street names in OSM and just leaving the points where a street with the same name isn't found nearby. I could then use this in areas like Clifton Estate where I've already surveyed and added all the missing OS Locator names, but don't have time to do detailed surveys to find all the missing path names that I know must exist.
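The comparison suggested in the comment above might look something like the following Python sketch. Everything about the data layout is an assumption made for illustration: lamp posts and OSM streets are given as plain (name, x, y) tuples in a projected coordinate system measured in metres, OSM streets are represented by points sampled along each way, and the short abbreviation table and the 100 m threshold are guesses. A real run would more likely be a spatial query in PostGIS.

```python
import math
import re

# Hypothetical abbreviation table; the real data may use others.
ABBREVIATIONS = {"rd": "road", "ave": "avenue", "cl": "close"}


def normalise(name):
    """Lower-case a street name, drop punctuation, expand abbreviations."""
    words = re.sub(r"[^a-z0-9 ]", "", name.lower()).split()
    return " ".join(ABBREVIATIONS.get(word, word) for word in words)


def unmatched_lamp_posts(lamp_posts, osm_street_points, max_distance=100.0):
    """Return lamp posts whose SITE_NAME has no same-named OSM street nearby.

    lamp_posts:        iterable of (site_name, x, y)
    osm_street_points: iterable of (street_name, x, y), sampled along OSM ways
    """
    streets = [(normalise(name), x, y) for name, x, y in osm_street_points]
    missing = []
    for site_name, x, y in lamp_posts:
        wanted = normalise(site_name)
        found = any(
            name == wanted and math.hypot(x - sx, y - sy) <= max_distance
            for name, sx, sy in streets
        )
        if not found:
            missing.append((site_name, x, y))
    return missing
```

The points left over are candidates for streets or paths whose names are not yet in OSM, and would still need checking on the ground.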