Thursday 24 March 2011

Why do lamp-posts have asset numbers?

A few weeks ago there was an involved, and sometimes heated, discussion on the main OSM mailing list about imports. Many of the comments were interesting and useful, but one particular strand has attracted my attention. A slightly grotesque paraphrase of these various messages might be "OSM is a poorly managed Computer Science project, with inadequate tools, particularly for version control."

Leaving aside that OSM is neither a project nor managed, I'd like to focus on what seems to be a surprising misperception about OSM as a database.

Firstly, databases don't have to be digital or stored electronically. Phone books, card indexes, and many reference books are easily recognisable as databases. Therefore a primitive database operation is only valid if it can also be applied to non-digital media.

Secondly, and this directly follows, information stored in a database does not have to have a unique identifier (a primary key). There's nothing stopping a telephone number appearing several times in a phone book, or complete pages being duplicated.

Once data is moved onto digital media, it really helps to assign unique identifiers: data can be restructured to be stored more efficiently or be easier to change; it's easier to spot duplication or bad data. This is exactly what OSM does: nodes, ways and relations are identified internally by system-generated keys.

And it's what my local council does. They maintain an asset register of street furniture (bollards, traffic signals, parking signs, parking meters) and within this application assign identifiers. These days all this information is geocoded and available in the council's GIS. This information is obviously useful for financial planning, maintenance and other activities. BUT, they've gone a step further and each asset now has its system-assigned number marked on it. WHY? Because "replace the bulb in lamp 35621" is a lot more specific than "replace the bulb in the lamp outside 25 Main Street". There may be lamp standards opposite each other at that location, or there might be two Main Streets, or there may be no number on 25 Main Street, or the house might have been demolished. However, this number DOES NOT uniquely identify the lamp-post: it might do when combined with its location data, location data of the organisation which has assigned the number and information about the status of the system used to generate the number.

What does this have to do with OSM (apart from the wonderful possibility of collecting lamp-post numbers)? Well, OSM is like my local council, except that our local patch is a bit bigger, and we cannot go stencilling numbers on anything we map. So we have no means to tie an OSM object to its corresponding thing in the real world.

A corollary to this is that we cannot confidently tie OSM objects to geolocated objects in other databases. There are far too many variables to even inspire confidence in fuzzy matches: when was the OSM data mapped? what sources were used? how accurately was it mapped? when was the external data mapped? how current is it? how complete is it? does it have unique identifiers? are the identifiers persistent?

So, we have a host of problems in matching data from an OSM dataset and an equivalent external dataset. These problems relate to location accuracy, temporal accuracy, matching identifiers, and accuracy of associated data.
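To make the location problem concrete, here is a minimal sketch of a naive proximity matcher. Everything in it is hypothetical: the coordinates, the extra lamp identifiers and the 15 m tolerance are invented for illustration, and no OSM tool is claimed to work this way. Two lamp standards facing each other across a street both fall within tolerance of a single mapped lamp, so distance alone cannot choose between them.

```python
import math

def distance_m(lat1, lon1, lat2, lon2):
    """Approximate ground distance in metres (equirectangular; fine for short spans)."""
    mean_lat = math.radians((lat1 + lat2) / 2)
    dx = math.radians(lon2 - lon1) * math.cos(mean_lat)
    dy = math.radians(lat2 - lat1)
    return 6371000 * math.hypot(dx, dy)

def candidates(osm_point, external, tolerance_m=15):
    """Return ids of external records lying within tolerance of the OSM point."""
    lat, lon = osm_point
    return [rec_id for rec_id, (rlat, rlon) in external.items()
            if distance_m(lat, lon, rlat, rlon) <= tolerance_m]

# Hypothetical data: one mapped lamp, two council lamps on opposite kerbs, one far away.
osm_lamp = (52.95000, -1.15000)
council_lamps = {
    "35621": (52.95004, -1.15000),
    "35622": (52.94996, -1.15001),
    "40007": (52.95100, -1.15000),
}

matches = candidates(osm_lamp, council_lamps)
print(matches)  # more than one candidate means the match is ambiguous
```

Tightening the tolerance does not help much, because the positional error in either dataset may well be larger than the width of the street.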

A good example of these problems is shown by OS Locator Musical Chairs and ITO's OSM Analysis which compare OSM street name data for Great Britain with the OpenData Locator dataset from the Ordnance Survey. This is a nice clear domain with the OS Locator data being from a known source and date and from a highly reputable national mapping agency. In some areas we have enough separately sourced data in OSM to have a handle on how accurately we can match these datasets. In most areas in England about 0.5-1.0% of Locator records cannot be matched to OSM. (I am not aware of reverse statistics, but in a recent survey aimed at hunting down some 20-odd of the mis-matches I found 5 street names used for addresses which are not present in the OS dataset). Even different datasets from OSGB have enough inconsistency to prevent complete matching. And these cases are relatively easy ones.
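The kind of comparison involved can be sketched roughly as below. This is not how OS Locator Musical Chairs or ITO's OSM Analysis are implemented; the normalisation rules and the sample street names are assumptions made up for the example.

```python
def normalise(name):
    """Lower-case, drop apostrophes and full stops, collapse whitespace."""
    cleaned = name.lower().replace("'", "").replace(".", "")
    return " ".join(cleaned.split())

def unmatched(reference_names, osm_names):
    """Reference names with no normalised equivalent among the OSM names."""
    osm_keys = {normalise(n) for n in osm_names}
    return sorted(n for n in reference_names if normalise(n) not in osm_keys)

# Hypothetical street names for a single locality.
locator = ["MAIN STREET", "ST. MARY'S GATE", "CHURCH LANE", "MILL ROAD"]
osm = ["Main Street", "St Mary's Gate", "Church Lane", "Mill Lane"]

missing = unmatched(locator, osm)
share = 100 * len(missing) / len(locator)
print(missing, f"{share:.1f}% unmatched")
```

Even this toy version cannot say whether MILL ROAD and Mill Lane are one street recorded wrongly in one source or two different streets; that takes a survey, not more code.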

There are also problems relating to the purpose of external datasets: cadastral data might not reflect the building outlines we would draw naturally (e.g., French & Spanish cadasters); hydrography data might be segmented for water-flow measurement (e.g., NHD); vector data might be optimised for rendering (OS OpenData VectorMapDistrict); road data might not need to be very accurate (TIGER). The imported data should be restructured to reflect what is important for OSM, not maintained in aspic for some putative update.

So those advocating data imports or having 'development forks' of OSM need to answer: how on earth can you easily relate objects between two different data sets, or even the same data set at different times? Alternatively, we could all add some stencils to our mapping toolkits, but even then we'd have to leave our armchairs.

Postscript: the council are busy replacing all the local lamp posts, wonder what number they'll put on the new ones.

Tuesday 15 March 2011

Nottingham OSM Pub Meet March 8th

Nottingham pub meet-up : 2528a

After Mark Iliffe & I met at State of the Map 2010 in Girona, we talked about arranging an event for OSM contributors in the East Midlands to meet up. As Mark was then Leicester-based we'd been thinking of a mapping party. However, with Mark doing a PhD in Nottingham it seemed that we could start with something a little less formal. A pub meeting seemed a good chance to meet some fellow mappers face-to-face and talk about the activities which interested people.

At the outset I hoped we might get at least five people, that I'd get to meet another local mapper, and that we'd make contact with other interested parties.

I was immediately reassured when I saw that Shaun was coming. Neatly, he was my first ever OSM contact point when I first attended a London pub meeting almost exactly two years ago. Joining us were: local mappers Andy (SomeoneElse), Kev (kevjs1982) and Laura; David came along representing Pedals, the local cycle advocacy group; and Marcus came from Nottinghack, the Nottingham Hackspace. So 8 of us altogether was a nice turnout. It was nice to make contact with Pedals, and a wonderful surprise to find out about the hackspace.

Subjects which came up in the course of the evening:
  • Inability of the automated voice in Garmin nuvi to cope with "Huthwaite".
  • DINTY, monads, tetrads and octads (don't ask).
  • The lonely bus stop on Erewash Field (see image at right, discovered by Kev).
  • Differences between Lancashire towns with lots of rows of terraced housing and Nottingham's huge legacy of inter-war ditto
  • Co-ordinating mapping activity, specifically for crisis mapping. I alluded to the difficulties involved, and the general point that usually mapping activity tends to self-correct. Mappers are both ants (self-organising) and cats (not easy to herd).
  • Pubs which have gone (The Gate at Jacksdale).
  • Marcus asked why the map round the Borlase Warren had lots of buildings but few Points of Interest. We agreed that adding individual shops was surprisingly labour-intensive and error-prone. Kev, who has mapped all the shops on Central Avenue, West Bridgford, said it was pretty easy to accidentally miss one. (Also, I had surveyed these but have not added them to the map).
  • David asked the active mappers if we ever used OSM. He got an emphatic yes, with Andy showing that his Garmin displayed pubs with a real_ale=yes tag.
My OSM logo muffins were rejected! Largely (I hope) because most people had either eaten pancakes beforehand or had ordered one of the absurdly generous portions from Borlase Warren's menu. At least there was some semblance of cake at the meeting.

A straw poll suggested that we should do it again, and so I'll be scheduling a date for April real soon now.

Any OpenStreetMap meeting is, of course, an opportunity to add to the map, so I'm glad to report that, after an extensive survey, Andy (SomeoneElse) summarised our conclusions.

PS. I went to Nottinghack's weekly opening evening the following day and was able to add them to OSM.

Wednesday 9 March 2011

Garmin overlays of GNIS names

I've just been reminded of something I did last summer during the Pakistan flooding: creating a Garmin overlay of names from GNIS.

I'd been trying to enter names from old US DMA (military) maps and was getting frustrated by the difficulty of being sure of names. Plus, with Landsat imagery it was possible to identify villages, but I wanted to put names to them. I downloaded the GNIS names for Pakistan, originally with the aim of creating a separate layer in JOSM for assigning names. This required converting the names into OSM format (I have no skills suitable for building a tile or other service). In the end it was still difficult to reconcile names to residential areas (mainly GNIS quality issues, I suspect), but I did create a transparent 'names' overlay for Garmins from the data using mkgmap.

As it might be useful again, I thought I'd quickly describe the process. Essentially there were seven steps, each simple on its own, but involving a bit of trial and error before I got there:
  1. Create an image table of GNIS format in a PostGIS database. By image table I mean one that copies the columns of the source data one-for-one, often using character columns even if the data is numeric to avoid losing data. Type conversion can be carried out as a post-processing step.
  2. Load the downloaded GNIS data with a simple COPY statement.
  3. Add a POINT geometry column to the table. Populate this from the lat/lon in the data.
  4. Create a 'simple' OSM schema in the same database.
  5. Populate the nodes table of the schema, either directly from the image table or from a table created to post-process the data.
  6. Extract the data to an OSM XML file using Osmosis.
  7. Build a Garmin file using mkgmap.
I'll go through each of these steps in more detail below.

The starting point is to have a PostgreSQL database server running, with PostGIS installed. I have a template database which has not only PostGIS, but also hstore, and the OSM simple schema already present. I can then immediately create a new database specifically for a given OSM task. What I describe works for PostgreSQL 8.4 and PostGIS 1.5 on Windows, with Osmosis 0.38 and mkgmap r1443. Small modifications may be needed for other software versions.

Image Table:
-- Table: pakistan_gnis_names

-- Drop any earlier copy so the script can be re-run.
DROP TABLE IF EXISTS pakistan_gnis_names;

-- Columns mirror the GNIS download one-for-one.
CREATE TABLE pakistan_gnis_names
(
  region_font_code numeric(1),
  unique_feature_id numeric(38),
  unique_name_id numeric(38),
  latitude numeric(10,8),
  longitude numeric(11,8),
  latitude_ddmmss character(6),
  longitude_ddmmss character(7),
  military_grid_ref character(15),
  jog_ref character(7),
  feature_classification character(1),
  feature_designation_code character(5),
  populated_place_class numeric(1),
  primary_country_code character(2),
  first_order_admin_code character(2),
  population numeric(38),
  elevation numeric(38),
  secondary_country_code character varying(128),
  name_type character(2),
  language_code character(3),
  short_form character varying(128),
  sort_name_ro character varying(255),
  full_name_ro character varying(255),
  full_name_nd_ro character varying(255),
  sort_name_rg character varying(255),
  full_name_rg character varying(255),
  full_name_nd_rg character varying(255),
  note text,
  modified_date date
);
ALTER TABLE pakistan_gnis_names OWNER TO jrc;

Add & Populate Geometry Columns
(in this case in a table just containing a subset of names):
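-- Work on a copy holding only populated places (feature class 'P').
drop table if exists pk_gnis_p_name;
create table pk_gnis_p_name as
select * from pakistan_gnis_names
where feature_classification = 'P';
-- Not needed after the drop/create above, and fails on a fresh table
-- because the column does not exist yet:
-- alter table pk_gnis_p_name drop wgs_geom;
select AddGeometryColumn('pk_gnis_p_name','wgs_geom',4326,'POINT',2);
select populate_geometry_columns();
-- WKT puts longitude (x) before latitude (y); the SRID must match the column.
update pk_gnis_p_name
set wgs_geom = ST_GeometryFromText(
'POINT('|| longitude || ' ' || latitude || ')', 4326);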

Create OSM 'Simple Schema':
This needs to be compatible with the version of Osmosis being used. Normally the DDL scripts are in one of the Osmosis distribution sub-directories. If working in a dedicated database just use the default names, otherwise stick a prefix on the standard table & index names. I used 'pk' for Pakistan.

Populate the NODE table

This is the only table needing populating, so a single simple statement is enough, but you do need to use hstore syntax these days. Several columns are NOT NULL, so we need to supply suitable values for them. Fortunately, referential integrity is not enforced across tables so we don't have to worry about ensuring the reference values are populated (although it's not a bad idea). Furthermore, the data is not being merged with any OSM data. GNIS data already has a unique identifier column, so we can use this to identify the node. Version, changeset and user can all be given a standard default value (0 and -1 are most commonly used).

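INSERT INTO nodes (id, "version", user_id, changeset_id, tstamp, tags, geom)
SELECT unique_feature_id as id
, 0 as "version"
, 0 as user_id
, 0 as changeset_id
, clock_timestamp() as tstamp
-- hstore operator syntax as in 8.4; from PostgreSQL 9.0 use hstore('name', full_name_ro)
, 'name' => full_name_ro as tags
, wgs_geom as geom
-- read from the table that carries the wgs_geom column
FROM pk_gnis_p_name;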

Extract with osmosis:
This is a standard extract using --rp and --wx flags. See Detailed usage.
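For small extracts, the net effect of the populate-and-extract steps can also be approximated without a database. The sketch below is an alternative route, not the Osmosis pipeline described here: it reads tab-delimited text and writes OSM XML directly with Python's standard library. It assumes the file has a header row using the column names of the image table, and the two sample rows are invented, not real GNIS records.

```python
import csv
import io
import xml.etree.ElementTree as ET

def gnis_to_osm(tsv_text):
    """Convert tab-delimited GNIS rows to an OSM XML string of named nodes."""
    root = ET.Element("osm", version="0.6", generator="gnis-sketch")
    for row in csv.DictReader(io.StringIO(tsv_text), delimiter="\t"):
        if row["feature_classification"] != "P":  # keep populated places only
            continue
        node = ET.SubElement(
            root, "node",
            id=row["unique_feature_id"],  # reuse the GNIS identifier as node id
            lat=row["latitude"],
            lon=row["longitude"],
            version="0",
        )
        ET.SubElement(node, "tag", k="name", v=row["full_name_ro"])
    return ET.tostring(root, encoding="unicode")

# Hypothetical sample rows, not real GNIS records.
sample = (
    "unique_feature_id\tlatitude\tlongitude\tfeature_classification\tfull_name_ro\n"
    "-1001\t30.10000000\t71.20000000\tP\tExample Village\n"
    "-1002\t30.30000000\t71.40000000\tH\tExample Stream\n"
)

xml_text = gnis_to_osm(sample)
print(xml_text)
```

The resulting file plays the same role as the Osmosis output in the following steps, although whether a given mkgmap release accepts it unchanged would need checking.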

Create a specific style for mkgmap:
I created a special style for mkgmap called gnis_names. A zip of the subdirectory is here.

I'll add a bit more about the logic later.

Build the Garmin IMG file using mkgmap:
This is the command I used to build the overlay. Additional work is required to get this into MapSource or integrate it into a gmapsupp.img file. I used MapSourceToolkit to change the registry settings on Windows to see it in MapSource and thereby transfer it to my Garmin. Usual stuff applies about family/product & file names.

java -ea -Xmx1536M -jar mkgmap.jar --mapname=88031021
--style-file=resources\styles\gnis_points --transparent --description="GNIS Pop Places PK"
--series-name="GNIS Pop Places PK" --input-file=pk_gnis_names.osm --product-id=88 --family-id=1 --product-name="PK GNIS" --overview-mapname=88031020 --tdbfile

I'll add a bit more to this later & tidy it up too.

Former Pubs

Dover Castle : 1152c

An inevitable problem arises when mapping in Britain these days. How does one map dead pubs?

Pubs have been dying at an increasing tempo ever since the Beer Orders of 1989. Some of this is due to the rapacity of the 'pubcos', but much is due to cheap supermarket booze, social change in inner-city areas, and a broader range of alternative means of entertainment. Although dead pubs exist everywhere, it's the ones in inner-city areas, or suburban estates, which I notice most frequently.

Some pubs are 'zombies', having fallen into a 'close-reopen-fail-close' cycle (phrase nicked from Richard on IRC). These can often be identified by the 'run your own pub' banners outside. There are also pubs which are only slightly more alive: it is often remarkably difficult to work out their actual status when walking past during the day. Just occasionally one of this class of pub gets resurrected. I tend to tag all such pubs as ordinary pubs, but if currently closed will add "(closed)" after the name. I do this for two reasons: an amazing number do show a triumph of hope over experience, and re-open; and, secondly, pubs are important navigational landmarks. The Target roundabout at Northolt on the A40 is a good example of the latter, even though the pub closed in 1986 when it was converted to a burger bar.

Once the pub sign has gone and all the windows are boarded up then I'm willing to change the tagging accordingly, to amenity=dead_pub or similar. Usually such pubs either change use, or get demolished. Local to Nottingham they've turned into supermarkets (two Tesco Express stores: the Jolly Higglers, 17/21st Lancers), Indian restaurants (The Poachers Tavern, now the Gurkha Kitchen) and mosques (Le Grand, Hyson Green, and possibly The Boulevard). The A610, Nuthall Road, has a large number of recently demolished pubs; typically they are replaced by housing. Other recent housing developments include: Cremorne Drive, a private gated road in The Meadows on the site of the Cremorne Hotel; and Beaumont Square, built on the site of The Wollaton Arms.

Robin Hood, Pinkneys Green : 17402

Elsewhere, the Stonor Arms in Stonor has been closed since the early 2000s, and despite recentish press coverage didn't look likely to open again last summer, but The Robin Hood in Pinkneys Green sprang into life again in 2007 after being boarded-up for 6 years. So it's remarkably difficult to be certain about dead pubs.

A very few remain intact as buildings but with a changed use, like The Dover Castle (photo above), now let as student accommodation. I don't know when this closed, but it was already there by the time of the 1891 census. Familiarity with an area, or close inspection of the building will reveal the former use: then it's worth recording them as building=pub.

P.S. Chris Richards (riffdesign) has some nice atmospheric photos of Radford between the '70s and early '90s, including several dead pubs.

Thursday 3 March 2011

A not very mysterious Mystery Walk

On Bramcote Ridge : 2461a

Given that mapping takes me to many local places I signed up with the Ramblers Association to do a 'mystery walk'. Of course, these are so-named by analogy with 'mystery shoppers', but I find the description engaging in an oxymoronic way.

I registered ages ago, but only received my starting point co-ordinates last week: SK 530 380 (drop 4 of the digits and you'll see the origin of my OSM username - a geocode). Unfortunately, this is right on the edge of Nottingham, and part of the idea is to stay within the boundary of a single highway authority (the local government body responsible for footpaths), so my route was already constrained. A second constraint was that towards the city is a large contiguous area of parkland, consisting of the main campus of Nottingham University (University Park) and two public parks (Highfields and Wollaton Park). Together these provide a large area of pleasant and varied walking: but they are not public highways, so there wasn't much point reporting path quality and accessibility there.
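The digit-dropping can be spelled out in a few lines. This sketch assumes a six-figure reference written as two letters followed by a three-digit easting and northing, separated by spaces; the function name is made up for the example.

```python
def ten_km_square(grid_ref):
    """Reduce a six-figure OS grid reference to its 10 km square."""
    letters, easting, northing = grid_ref.split()
    # Keep only the leading digit of each three-digit coordinate.
    return f"{letters}{easting[0]}{northing[0]}"

print(ten_km_square("SK 530 380"))  # SK53
```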

I really had only one viable destination: a small remnant of 'countryside' between Nottingham city and the suburb of Bramcote Hills. This is a ridge of Sherwood Formation sandstone (Permo-Triassic age), with remnants of the characteristic heathland flora of Sherwood Forest. I know the area, but not particularly well: it is too far for casual naturalising and not so hugely different in terms of plants and insects from places closer to home. For those in the immediate locality Bramcote Ridge is highly valued with an active friends group. The part in Broxtowe district is largely managed as two separate Nature Reserves: Alexandrina Plantation and Sandy Lane Open Space. I hadn't realised until yesterday, but the part of the ridge in Nottingham does not seem to have any particular designation.

Route plotted using Maperitive

The whole area is accessible from many entry points, but many of these are obscure little paths at the end of dead-end streets. It is more-or-less impossible to work out access from the OS Explorer map: fortunately the area is pretty well mapped elsewhere! Using OSM-based maps I plotted a route which would take me onto Bramcote Ridge and (largely) stay within the City boundary. Of course I added a couple of mapping desiderata to my route: a couple of postboxes and addresses. I also wanted to see if I could find any waymarks for the notoriously elusive Robin Hood Way, one of two long distance paths promoted by the county council.

Robin Hood Way marker : 2522d

My route took me along the Robin Hood Way from close to Wollaton Park, but I saw no signs of waymarking until I reached the top of Bramcote Ridge, just as the path left Nottingham. I did pass a surprisingly large number of side roads, which enabled me to add a large number of address interpolation ways. Very high quality Bing imagery means that it's usually possible to extrapolate and assign addresses for the whole street. Although the paths on to the ridge are not easy to find they are heavily used: and anyway most are in OSM.

Nottingham Boundary Marker : 2473b

Bramcote Ridge OOC OS

I did find a 1933 Nottingham City Boundary Post. The city boundaries were extended in 1933 to incorporate some surrounding villages which had already been suburbanised. Boundaries with other places were also adjusted. Several are shown on the provisional edition 1:25000 map of SK53, so I was on the lookout. These seem to have once been quite common: I've put a couple of others on OSM: they're an antidote to the often made claim that boundaries cannot be surveyed.

The only negative to report back to the RA is a complete dearth of waymarking, but this might be expected in a city. Nottingham, like many cities, did not have to maintain a definitive map of Public Rights of Way, and is only doing so now, because of the cut-off date of 2026 in the CRoW legislation. And, I only managed about a quarter of the 2 mile distance off residential roads.

Overall, I learnt nothing I didn't already know: the Robin Hood Way is badly waymarked (or rather its original waymarks have not been maintained); OS mid-scale mapping is not much use for finding off-road walking routes in cities; and OSM provides more useful cartography and GPS maps for this kind of walk. Urban local authorities might be reluctant to spend money on waymarking for several reasons (vandalism, fly-tipping and lack of money come to mind). An alternative would be to provide focussed mapping for walkers. They already do this for cyclists, and I know a good cheap source of information.