Monday, 18 January 2016

UK Open Data and Buildings in OpenStreetMap

I've finally (after 8 months) got around to looking at the OpenMap Local buildings. This new dataset was launched at the first OpenDataCamp, and I've had the SU 100 kilometre square data on the PC since then (it contains Southampton, where Ordnance Survey are based). I use Meridian 2 OS Open Data regularly and extensively, but these days don't make much use of the larger scale vector data.

Nottingham City Centre: OSM/OSGB Building Comparison
Comparison of Building polygons for Central Nottingham
OSM has more detail and does not merge discrete buildings.
Contains Ordnance Survey data (c) copyright and database right 2015, OSM data (c) OpenStreetMap contributors 2015, Lidar data from Environment Agency under OGL 3.0, (c) Crown Copyright and database right 2015. Image CC-BY-SA, the author.
I needed them for something else, which caused me to download the SK data. Coincidentally Christian Ledermann had asked on talk-gb about using this data to add buildings to OpenStreetMap for Newark-on-Trent. A little earlier the Environment Agency had released Lidar data for England, and this is also useful as input for mapping buildings.

OpenMap Local

Apart from the areas I originally needed, which were in SK41 (no buildings in OSM), I've also looked at areas which I know much better & compared some selected areas where we have good building coverage around Nottingham. The comparisons I made are shown visually, with my main observations summarised at the end. Note that comparisons have not been made on any systematic basis.

uon_univertiy_park
University Park, University of Nottingham
an area of predominantly large academic buildings.
OpenStreetMap and OS OpenMap are largely in agreement: the minor differences applying to newer buildings which post-date the Bing imagery.
Contains Ordnance Survey data (c) copyright and database right 2015, OSM data (c) OpenStreetMap contributors 2015, Lidar data from Environment Agency under OGL 3.0, (c) Crown Copyright and database right 2015. Aerial Imagery via Bing, (c) as in image. Image CC-BY-SA, the author.

uon_science_city_buildings
The Science City part of the University Park campus.
A new large lecture theatre block is not present in OpenMap data, and the outline of the building top centre (Tower Building) is over-simplified.
Contains Ordnance Survey data (c) copyright and database right 2015, OSM data (c) OpenStreetMap contributors 2015, Lidar data from Environment Agency under OGL 3.0, (c) Crown Copyright and database right 2015. Image CC-BY-SA, the author.


newark_buildings2
Central Newark. OpenMap vs Lidar.
Many instances of building merging & over-simplification are apparent here, notably with the outline of the parish church.
Contains Ordnance Survey data (c) copyright and database right 2015, Lidar data from Environment Agency under OGL 3.0, (c) Crown Copyright and database right 2015. Image CC-BY-SA, the author.

newark_buildings1
Newark-on-Trent, residential areas, showing inconsistency in size for similar houses, and merging of terraced housing.
Contains Ordnance Survey data (c) copyright and database right 2015, OSM data (c) OpenStreetMap contributors 2015, Lidar data from Environment Agency under OGL 3.0, (c) Crown Copyright and database right 2015. Image CC-BY-SA, the author.
I have not made systematic comparisons, but these are my main observations (in brackets the 1km grid square where I've noted any particular issue):
  • Best for larger buildings. The data seem much more reliable (actually matching building footprints fairly well) for larger buildings. Even for large detached houses I would regard the data as unreliable: on our road of 40 detached houses, at least 16 are represented as terraces (SK5439). Similar artefacts occur in other areas with detached houses: apparently caused when a garage is close to both houses. Smaller houses are inherently simplified: no better than drawing one in JOSM and then copying the outline in fact.
  • Building fusion. This is particularly clearly seen in the city centre image, where a whole block of buildings has been simplified to a single building (centre of image), but also occurs in suburban housing (see above).
  • Inconsistency in geometry simplification. This is most noticeable in the city centre (SK5739). For instance, compare the OSM and the OpenMap Local outlines for St Peter's Church (bottom right in map above). In OpenMap Local the church is just shown as a rectangle, whereas in practice it is more complex. Modern buildings on the Jubilee Campus of Nottingham University are generally shown with more detail.
  • Inconsistency in building size. In SK5439 there are a very large number of houses which were identical when built. However, in the OpenMap Local they are often of different sizes. (This is also probably true of OSM, if buildings have not been created by duplication).
  • Voids. Gaps between closely packed buildings in the city centre appear slightly arbitrary in both placing and whether such a void exists or not.
  • Some selection inconsistency with small size buildings. Only 2 garages are shown in an area of around 500 houses. With OSM the figure is nearer 200+. (SK5439)
  • Demolished buildings. Whilst I would not expect the data to show the building demolished in the past month, I would expect it to not show one demolished 2 years ago, and I would certainly expect it not to show one demolished in 1970 (although MasterMap shows this too). (SK5439)
  • Better locational accuracy. If using the full transform it may be useful to take advantage of the better locational accuracy of this data. In the main OSM buildings are rarely more than 3 m displaced from the OS OpenMap Local. (SK5439) In general the more recently mapped buildings in Nottingham city centre have better locational accuracy than this (SK5739).
Taken together, my use of this directly within OSM would be along the following lines :
  • Selective transfer of larger buildings (schools, offices, public buildings, factories, warehouses, larger shops) on a case-by-case basis from a shapefile to a JOSM editing layer, or to Potlatch 2. Some minor refinement will probably be needed (for instance a university building here has long narrow courtyards which act as light wells and which are not shown in OpenMap Local).
  • Only use it for houses and similar when shapes are very simple and everything has been double checked, at the very least, against aerial imagery. For simple shapes it's as quick to draw & copy in JOSM anyway. A similar principle holds for more complex building shapes on modern estates, where one building can be cloned.
  • Watch out for demolished buildings. This requires not just checking against Bing/MapBox imagery, but some local knowledge for sense checking.

Environment Agency Lidar Data

Another source of building data is the recently released Environment Agency Lidar data. This does not cover the whole country, and in many places may only be at 1 or 2 m resolution. It may also be quite old. However, because it does not suffer from parallax artefacts it can be used in conjunction with both aerial imagery (whether from Bing, MapBox or more local sources) and OS OpenData. I have provided examples from Nottingham, Newark, and Melton Mowbray of this data, combined with one or more of OSM buildings data, OS OpenMap or Bing aerial imagery.

Melton Mowbray. EA Lidar DSM (1m) overlaid on OSM.
The Lidar data was used to refine the OSM building outlines
which originally were traced from OS StreetView as block-sized polygons.
(see commentary)
Melton Mowbray illustrates many of the benefits of Lidar data. It is a fairly typical country town, with many of the buildings in the town centre ranging in age from 10 to 500 years old. Many extend back from the street in a series of outbuildings (e.g., stables) which have eventually been incorporated into the main building, but this process leaves lots of small courtyards, service yards, etc which are more or less impossible to discern on aerial imagery.
Butter Cross on Market Place, Melton Mowbray
Butter Cross in Market Place, Melton Mowbray
Despite the different styles & ages of the buildings, several have long ranges at the rear.
By doing a street-level ground survey one can identify which buildings are distinct on the street front. Lidar then helps to construct a building outline which is consistent with this. I surveyed the centre of Melton in September, and this was the first place where I used Lidar data to aid in the interpretation of aerial imagery. In this case I find it essential to have adequate street level pictures to be able to relate to the aerial imagery: most useful are the presence and distribution of chimneys, because they throw shadows and so are often visible even on poor quality imagery.

The Lidar data also allows one to do some other things: notably find building heights. I've done this for a 1980s estate on the edge of Maidenhead: particularly easy as the residential buildings fall into a small number of categories: bungalows, two-storey houses & maisonettes (purpose built flats in a house-like structure).

A 1980s housing estate with building heights mapped from English Environment Agency LIDAR Open Data. Buildings fall into 3 height categories: bungalows (green: approx 4m high), 2-storey houses of various kinds (blue: approx 6 m high), and maisonettes (condominiums) which are about 7 m high (red). Heights were calculated in m, so the values represent minimum heights of the highest part of the building, which is nearly always the gable line.
Output via Overpass Turbo, styled with MapCSS.
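For anyone wanting to try the height step, here is a minimal sketch of one way it might be done. It is not the processing used for the map above. It assumes the Lidar has already been cut down to small grids: a surface model (`dsm`), a bare-earth model (`dtm`) to give ground level, and a mask of cells inside the building outline. The function names, the whole-metre truncation (my reading of "minimum heights" in the caption) and the class thresholds are all assumptions chosen to match the approximate 4 m, 6 m and 7 m categories.

```python
import math


def building_height(dsm, dtm, footprint):
    """Whole-metre height of the highest part of one building.

    dsm, dtm  : 2D lists of elevations in metres (surface and bare earth).
    footprint : 2D list of booleans, True for cells inside the outline.
    All three grids are assumed to share the same shape and resolution.
    """
    heights = [
        dsm[r][c] - dtm[r][c]
        for r, row in enumerate(footprint)
        for c, inside in enumerate(row)
        if inside
    ]
    if not heights:
        raise ValueError("footprint covers no cells")
    # Truncate to whole metres, so the result is a minimum height.
    return math.floor(max(heights))


def height_class(height_m):
    """Assign a building type from its height (hypothetical thresholds)."""
    if height_m <= 4:
        return "bungalow"
    if height_m <= 6:
        return "two-storey house"
    return "maisonette"


# Tiny made-up example: a 2 x 2 grid with the building in the right-hand column.
dsm = [[50.0, 56.7], [50.1, 55.2]]
dtm = [[50.0, 50.0], [50.0, 50.0]]
footprint = [[False, True], [False, True]]
height = building_height(dsm, dtm, footprint)
print(height, height_class(height))
```

Taking the maximum rather than the mean is deliberate: it picks out the gable line rather than an average over the roof slope.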
There are many other useful blog posts about using this Lidar data, both specifically for OSM, but also generally. See posts by Chris Hill ("More Lidar Goodness" and "Building Heights") and Ed Loach for some of the specifics, and the write-up on the wiki. A nice post and map (v. slow in my browser) showing building heights in London on OpenMap Local may also be of interest. HousePrices has processed all the Lidar data from EA and Natural Resources Wales as a hillshaded slippy map which is useful to look at what is available. Slightly unfortunately the map is in OSGB projection (EPSG:27700) and is not shown with other slippy maps, which would make it a bit easier to locate oneself.

What kind of building data should be added to OSM?

From past experience, single building outlines traced from OS StreetView turn out to represent tens of buildings on the ground. Such simplified outlines just make the work of splitting the buildings properly quite a lot harder. This can be particularly bad in town/city centres.

Usually if adding detail of POIs and addresses it is important to have individual buildings mapped: this makes it much easier to correlate photos to roofline features such as chimneys, gables etc. A single very simple outline may be OK, because for more detailed mapping it should just be a question of deleting the original outline. However, the question must be asked, as to what purpose such an outline fulfils on OSM, when the source data can be readily combined with OSM data for downstream consumption.
Granby Street, Leicester (geograph 2296099)
Granby Street, Leicester.
The multiple buildings shown here are represented in OSM as single buildings for each block
(imported from OS StreetView Open Data).
CC-BY-SA   © Copyright Malc McDonald and licensed for reuse under this Creative Commons Licence.
I think the fundamental question about straight imports into OpenStreetMap should be "Will it make life easier or harder for subsequent mappers?".

If the work involved in refining a building outline takes longer than re-drawing the building then I doubt if it's worth importing the building at all. This is particularly true if the outline is actually of multiple buildings. This is why large building outlines are most valuable: they are generally pretty good compared with what an initial hand-traced outline might look like, and they lend themselves better to stepwise refinement. One group of buildings I find particularly tedious to do well are schools, which tend to be a sprawling mass of interconnected buildings. Starting with a decent polygon with orthogonalised angles makes adding such detail much easier. The current quarterly project for UK-based mappers might be the time to test this.

Of course it may be that adding buildings assists in some other mapping goal. I've already mentioned that details of buildings are very useful for addresses. However OpenMap Local lacks the detail in precisely the areas where it would be most useful (city & town centres). For suburban or inner-city housing similar polygons can be created as quickly in OSM editors (notably in JOSM, by duplicating existing buildings or using the Terracer (or even UberTerracer) plugins).

The other thing which many people want is rendered maps largely derived from OSM, but showing more buildings. In practice, because many mappers do not have the know-how, wherewithal or time to create such a rendered view, they tend to want to import buildings. Historically, OSM tools for importing data have often been much easier to use than ways to combine the same data with OSM data to render maps and make them accessible on the web. Perhaps we need to do more to help people in the latter task, which is now getting more complicated again with the move to vector tiles (at least outwith use of MapBox Studio), and TileMill's effective status of being a legacy application.

Summary 

Sadly, although the new building outlines are better than what preceded them, in most cases they don't offer a decent route for iterative refinement with OpenStreetMap.

This absence of a simple way to improve building outlines means that ideally people wishing to use this data would merge it with OSM data outside of OSM. I do recognise this is often too much work, or too big a learning curve for many, and consequently there will always be a desire to add buildings to OSM because many people are much more comfortable with consuming only OSM data for their purposes.

Existing tools for drawing buildings in OSM are pretty powerful & getting more powerful all the time. Many of us, and I include myself in this group, are unaware of the full extent of these utilities. See bdiscoe's diary post about mass adjustment of circular buildings (huts) for some insights.

Wednesday, 2 December 2015

How accurately have Townlands in Northern Ireland been mapped?

From time to time newly released Open Data provides a nice opportunity to check OpenStreetMap for its accuracy in all its forms (see Haklay (2008) for a breakdown of what this can mean).

Coastal Townlands, Cos. Derry & Antrim
Coastal Townlands, Counties Derry/Londonderry and Antrim.
Boundary lines see below. The deeper the colour of the area, the greater the discrepancy between the area of the OSM polygon and the OSNI one. The pale base colour represents a divergence of under 2%. Townlands on the coast and on the UK/Ireland border seem to be most likely to diverge in size. The small cluster centre right is caused by different ways of handling townlands which cross a Civil Parish boundary (OSM & the original source GSGS 3906 split these, the OSNI data does not).
We have known for a while that both the Ordnance Survey of Northern Ireland and the Ordnance Survey of Ireland were planning OpenData releases. When they came it was all in a rush. Now the hard work starts of checking license conditions for suitability for use in OSM and other places, as well as then working out what is really useful. However, because the townland boundaries of Northern Ireland are complete, it was an ideal opportunity to look at accuracy.

View along N side of Magilligan Peninsula towards Inishowen from Umbra

My reasons for doing this are not just pure interest. The usefulness of the Irish Vice County boundaries depends on their positional accuracy. Earlier my prediction was that such boundaries ought to be within 10 metres of their true location on the ground where they were based on townland boundaries, but this was largely based on experience with other OSM data rather than an objective statement. Thus investigating the accuracy using an independent data set provides an excellent way of testing this statement. The tests need to be done now, because (as we shall see) the nature of OSM is to fix issues spotted very quickly, and thus datasets become loosely coupled.

I adopted two approaches:
  1. A straight comparison of areas (or their ratios).
  2. Using a series of buffered boundaries from one source (OSNI) and seeing what proportion of the other source (OSM) was included in each buffer.
To choose which townlands to compare I followed a suggestion of Rory McCann and for each OSM townland selected the one which shared the most area in common from the OSNI data set. (I have also done it on matching names for a smaller set of data & get similar results). Note that I am comparing townland with townland, not boundary segment with boundary segment. This means that each boundary segment (other than coastal, lacustrine or riverine ones) will be included twice.

umbra_townland_cf
Buffering approach to investigating boundary accuracy.
Demonstrated with Umbra townland in County Derry/Londonderry.
This is predominantly coastal sand dunes, with a small river running along its S boundary.
Northern Ireland Townlands OSNI comparison
Northern Ireland using the same colouring.
At this scale very few boundary mismatches are apparent.
The buffering approach is based on that described by Haklay (2008). I used buffers of 5, 10, 15 and 20 m, and then clipped the initial OSM way by each in turn. On the scale of the whole country it is clear that most boundaries match closely. This is confirmed by checking what proportion of the boundaries fall into each buffer class: over 80% are within 5 m, over 90% within 10 m and nearly 95% within 20 m.
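The buffer test is easy to approximate without any GIS software, which may help in seeing what it measures. The sketch below is not the procedure I ran: instead of building buffer polygons and clipping, it samples the OSM line at short intervals and measures each sample's distance to the reference line, which gives much the same proportions. It assumes both lines are in a projected metric coordinate system; the function names and the 1 m sampling step are my own choices for illustration.

```python
import math


def point_segment_distance(p, a, b):
    """Shortest planar distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its end points.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def distance_to_line(p, line):
    """Distance from p to the nearest part of a polyline."""
    return min(point_segment_distance(p, a, b) for a, b in zip(line, line[1:]))


def sample_line(line, step):
    """Yield (midpoint, piece_length) for pieces no longer than `step`."""
    for (ax, ay), (bx, by) in zip(line, line[1:]):
        length = math.hypot(bx - ax, by - ay)
        n = max(1, math.ceil(length / step))
        for i in range(n):
            t = (i + 0.5) / n
            yield (ax + t * (bx - ax), ay + t * (by - ay)), length / n


def share_within_buffers(test_line, reference_line, widths=(5, 10, 15, 20), step=1.0):
    """Fraction of test_line's length lying within each distance of reference_line."""
    total = 0.0
    inside = {w: 0.0 for w in widths}
    for point, piece_len in sample_line(test_line, step):
        d = distance_to_line(point, reference_line)
        total += piece_len
        for w in widths:
            if d <= w:
                inside[w] += piece_len
    return {w: inside[w] / total for w in widths}


# Made-up example: the "OSM" line runs 3 m from the reference, then jumps to 12 m.
osni = [(0, 0), (100, 0)]
osm = [(0, 3), (50, 3), (50, 12), (100, 12)]
shares = share_within_buffers(osm, osni)
print(shares)
```

Note that the shares are cumulative, like the figures quoted above: anything within 5 m is also counted as within 10 m.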


Closer inspection (as with the Umbra) shows much of the discrepancy to be present along the coast. This is not surprising: coastlines on OSM were originally derived automatically, and even when refined by hand are unlikely to accord with Mean High Water (MHW). Certainly, for my purposes, it is merely important that the OSM coastlines do not stray above MHW.

NI Townlands, all boundaries within 5 m of OSNI
OSM townland Boundaries within 5m of OSNI data
The analysis described so far focusses on positional accuracy. Looking at areas highlights a range of other accuracy issues.

townlands_ni_cf9
Area comparison. Townlands are coloured according to absolute variance of ratio of areas from 1.
The redder they are the further the ratio is from 1.
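The measure behind the colouring is simple enough to write down. A minimal sketch, assuming each townland is a single ring of coordinates in a projected metric system (no holes or multi-part townlands) and that the OSM and OSNI polygons have already been paired up; the function names are mine.

```python
def polygon_area(ring):
    """Planar area of a simple polygon given as a list of (x, y) vertices."""
    twice_area = 0.0
    # Shoelace formula, pairing each vertex with the next (wrapping round).
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


def area_divergence(osm_ring, osni_ring):
    """Absolute deviation of the OSM/OSNI area ratio from 1."""
    return abs(polygon_area(osm_ring) / polygon_area(osni_ring) - 1.0)


# Made-up example: the OSM polygon is about 4% smaller than the OSNI one.
osm_townland = [(0, 0), (100, 0), (100, 100), (0, 100)]
osni_townland = [(0, 0), (100, 0), (100, 104), (0, 104)]
divergence = area_divergence(osm_townland, osni_townland)
print(round(divergence, 4))
```

A value like this one would sit above the 2% base colour but below the 5% level discussed next.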
Area discrepancies of over, say 5%, may be the result of any of the following:
  • Boundary discrepancy. Mainly caused by coastlines, or by difficulty of delineating some boundary feature (such as the course of the Umbra river above).
  • Erroneous interpretation of the boundary on old maps causing selection of the wrong feature. This transfers land from one townland to another, therefore these should cluster. 
  • Missing townlands. When a single townland has been created without noticing one or more others inside it (Town Parks townland at Ballymoney is an example). 
  • Different treatment of townlands bisected by a Civil Parish. See caption of first image above.
  • Incorrect tagging. Higher level administrative units having tags appropriate to a townland. I've noted two cases of this, one of which was Ballyphilip CP on the Ards peninsula in County Down.
  • Islands. Some offshore islands appear to be missing from the OSNI data (see The Skerries N of Portrush)
We've already caught a few examples in each of these classes through this analysis, and no doubt will find a few more. I have not yet investigated the very apparent discrepancy along the borders.

To conclude, townland boundaries show exactly the kind of positional accuracy we expected (or perhaps hoped). Perhaps 1% of the total data (90-100 townlands from about 9000) may need some form of correction. I'm biased, but this seems pretty good for a project principally relying on rectified photo-reduced maps from 1939! It's also worth remembering that, unlike road comparisons, there is no widely available sensor data (i.e. GPS tracks/points) to help boundary alignments.

When time permits I'll extend this to include OSI Open Data too. A big thanks to both organisations for releasing their Open Data. OSNI staff have been contributors to OSM for a while: they host Missing Maps lunchtime sessions in their offices.

Sunday, 15 November 2015

Urban Areas 4 : Derivation from OpenStreetMap using road density

Another variation on the theme from the last post: this time looking for some measure of road density.


Butler Co, PA: derived Urban Areas
Comparison of Urban Areas derived using "block method" and gridded road density.
Only grid squares with over 500 m of road included.
The area shown is around Butler, Butler Co, PA

The easiest way is to sum road lengths in individual cells. The cells have to be quite small (say 250 metres square) to achieve the resolution desired. I've excluded link roads and the motorway & trunk highway classes from this calculation.
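A minimal sketch of that calculation in plain Python, to make the idea concrete. It is an illustration rather than the code I ran, and assumes roads are already in a projected metric coordinate system. Each segment is split wherever it crosses a grid line so that length is credited to the right cell; the function names are my own, and the 250 m cell and 500 m threshold are the values mentioned in this post.

```python
import math
from collections import defaultdict


def is_counted(highway):
    """Skip link roads and the motorway and trunk classes."""
    return highway not in {"motorway", "trunk"} and not highway.endswith("_link")


def road_length_by_cell(roads, cell=250.0):
    """Sum road length in metres per square grid cell.

    roads : iterable of polylines, each a list of (x, y) in metres.
    Returns a dict mapping (column, row) to total length.
    """
    totals = defaultdict(float)
    for line in roads:
        for (ax, ay), (bx, by) in zip(line, line[1:]):
            length = math.hypot(bx - ax, by - ay)
            if length == 0.0:
                continue
            # Parameters along the segment at which it crosses a grid line.
            cuts = {0.0, 1.0}
            for a, b in ((ax, bx), (ay, by)):
                if a == b:
                    continue
                lo, hi = sorted((a, b))
                k = math.floor(lo / cell) + 1
                while k * cell < hi:
                    cuts.add((k * cell - a) / (b - a))
                    k += 1
            ts = sorted(cuts)
            for t0, t1 in zip(ts, ts[1:]):
                # The midpoint of each piece decides which cell it belongs to.
                tm = (t0 + t1) / 2.0
                mx, my = ax + tm * (bx - ax), ay + tm * (by - ay)
                key = (math.floor(mx / cell), math.floor(my / cell))
                totals[key] += (t1 - t0) * length
    return dict(totals)


def dense_cells(totals, min_length=500.0):
    """Cells with more than min_length metres of road."""
    return {key for key, length in totals.items() if length > min_length}


# Made-up example: three short roads, tagged with their highway class.
tagged = [
    ("residential", [(0, 100), (600, 100)]),
    ("residential", [(100, 0), (100, 200)]),
    ("tertiary", [(50, 50), (200, 50)]),
    ("motorway_link", [(0, 0), (240, 240)]),
]
roads = [coords for highway, coords in tagged if is_counted(highway)]
totals = road_length_by_cell(roads)
print(totals, dense_cells(totals))
```

Adjacent dense cells can then be merged to give the urban area polygons.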


Saturday, 14 November 2015

Urban Areas 3 : derivation from OSM using residential blocks

FromCoL 8234828991
View S from the Cathedral of Learning in Oakland, Pittsburgh,
showing some urban areas used as tests in this post.
The incised valley of the Monongahela in the background contained railways and steel works. The plateau beyond has residential suburbs of Pittsburgh. To the left foreground are the woods and ravines of Schenley Park, with a residential area beyond. Source: Zack Weinberg via Wikimedia Commons CC-BY-SA

One of the obvious features of the highway network for the USA on OpenStreetMap is that road density is much higher in built-up areas. I started looking at how to measure this, when I recalled a method for identifying city blocks introduced to me by a Brazilian user of OpenStreetMap data.

butler_co_urban_blocks
Residential Areas for Butler Co, Pennsylvania, identified with the block method
from OpenStreetMap data. Orange line outlines Butler County.
My idea was simple, a greater road density implies smaller areas for the polygons enclosed by a set of roads. By choosing some maximum polygon size, one should be able to pick out urban areas.

The method itself is also really quite simple:
  • Take the main road network for some area and make a union of it (which will be a MULTILINESTRING).
  • Polygonize this data, and decompose to individual polygons.
In Lucas' implementation the first step is done by municipal areas. I wanted to try the approach for a whole state without using administrative area data. I therefore once again turned to my trusty standby of using a gridded method.
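To show what the two steps amount to, here is a self-contained sketch in plain Python. It is not Lucas' implementation nor mine: in practice the union and polygonize steps would be done with GIS functions (for example ST_Union and ST_Polygonize in PostGIS). The sketch assumes the road segments are already noded, that is they only touch at shared end points, and that the network is a single connected piece; the function names and the maximum block area are made up for illustration.

```python
import math
from collections import defaultdict


def signed_area(ring):
    """Signed area of a ring: positive when vertices run counter-clockwise."""
    pairs = zip(ring, ring[1:] + ring[:1])
    return sum(x1 * y2 - x2 * y1 for (x1, y1), (x2, y2) in pairs) / 2.0


def polygonize(segments):
    """Return the bounded faces (blocks) enclosed by a noded set of segments.

    segments : iterable of ((x1, y1), (x2, y2)).
    Each face is a list of vertices in counter-clockwise order.
    """
    neighbours = defaultdict(set)
    for a, b in segments:
        if a != b:
            neighbours[a].add(b)
            neighbours[b].add(a)
    # Neighbours of each node, sorted counter-clockwise by bearing.
    ordered = {
        v: sorted(ns, key=lambda n, v=v: math.atan2(n[1] - v[1], n[0] - v[0]))
        for v, ns in neighbours.items()
    }
    visited = set()
    faces = []
    for u in ordered:
        for v in ordered[u]:
            ring = []
            a, b = u, v
            # Walk round the face lying to the left of each directed edge.
            while (a, b) not in visited:
                visited.add((a, b))
                ring.append(a)
                around = ordered[b]
                # Leave by the next edge clockwise from the one we arrived on.
                a, b = b, around[around.index(a) - 1]
            # Counter-clockwise rings are blocks; the outer face is clockwise.
            if ring and signed_area(ring) > 0:
                faces.append(ring)
    return faces


def urban_blocks(segments, max_area):
    """Keep only the blocks no larger than max_area (square metres)."""
    return [face for face in polygonize(segments) if signed_area(face) <= max_area]


# Made-up example: two 100 m x 100 m blocks next to one 1 km x 1 km block.
segments = [
    ((0, 0), (100, 0)), ((100, 0), (200, 0)),
    ((0, 100), (100, 100)), ((100, 100), (200, 100)),
    ((0, 0), (0, 100)), ((100, 0), (100, 100)), ((200, 0), (200, 100)),
    ((200, 0), (1200, 0)), ((1200, 0), (1200, 1000)),
    ((1200, 1000), (200, 1000)), ((200, 1000), (200, 100)),
]
blocks = urban_blocks(segments, max_area=50000.0)
print(len(polygonize(segments)), len(blocks))
```

Dead-end roads are harmless here: the walk goes out and back along them and they add nothing to the area.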

Thursday, 12 November 2015

Urban Areas 2 : Derivation from OpenStreetMap using Residential Roads

Street corner, Retiro, Buenos Aires
(Libertad/ Juncal)
CC-BY-SA, the author
Following on from my last post I have now been looking in more detail at how one might start using OpenStreetMap (OSM) to create a global dataset of Urban Areas. As OSM does not have any widely used notation for urban areas I have been looking at several ways in which other OSM data can be used to identify such areas prospectively. In this post I look at the use of residential roads (and I'm not the first to do so). Later posts will look at other techniques.

ar_ba_urban2
Buenos Aires and hinterland, showing comparison between urban polygons
derived from OSM (green) and the Natural Earth data (light brown).

I have chosen the following places as suitable test areas for these investigations:
  • East Midlands of England. Not only my home turf, but also a well-mapped area with extensive use of landuse tags, and in excess of 99% of all residential roads mapped. In addition Ordnance Survey Meridian 2 Open Data contains a layer corresponding to urban areas which provides an excellent control for checking results from this area.
  • Pakistan. Not only one of the most populous countries in the world, but one of the least well mapped in OpenStreetMap. Pakistan is a likely candidate for cities which are barely mapped. I would also expect other very populous Asian countries (notably China, India and Bangladesh) which are poorly mapped to be similar to Pakistan.
  • Nigeria. Similar criteria to Pakistan: the most populous country in Africa. The .pbf file for Nigeria is approximately 50% larger than that for Pakistan, but both are smaller than that for Lesotho with a population of 2 million compared to 180 million (Nigeria) and 200 million (Pakistan).
  • Côte d'Ivoire. Close to Nigeria, but a place which I know has an active OSM community. Quite a number of mapping activities. (Note to Geofabrik, it's not called the Ivory Coast any more).
  • Argentina. Latin American cities are often laid out in a grid, nowhere more so than in Argentina. The prevalence of the grid system, and my belief that the urban road system is largely complete, were reasons for choosing this as a Latin American example. My own experience of travelling in Argentina after SotM-14 suggests that, for the most part, urban road systems are mapped. One known gap, the newer western suburbs of Ushuaia, has recently been rectified by the kind provision of aerial imagery from the Argentine National mapping agency.
  • Pennsylvania. It was essential to include some US data because of the TIGER import problem: all rural roads being tagged residential. Since I spent part of my childhood in Pennsylvania it is also a place I know and which I have edited (sporadically) to improve the rural road network.
Briefly I expected the following: good urban areas for the East Midlands and Argentina (i.e., better than Natural Earth (NE)); middling to poor for the three developing nations (gaps relative to NE, but in some cases better precision); hopeless for Pennsylvania.

Sunday, 25 October 2015

Urban Areas: a meditation on why simple global geographical datasets are so poor

Puerto-Vallarta
Puerto Vallarta, an aerial view of an urban area missing many roads on OpenStreetMap.
The area in the middle distance away from the sea was particularly lacking.
Fortunately the centre of the hurricane didn't pass over this area.
Source: Wikimedia Commons, (c) CC-BY-SA


The other night, as Hurricane 'Patricia' bore down on the Pacific coast of Mexico, I had a twitter conversation with Bill Morris and others regarding how well mapped Puerto Vallarta was on OpenStreetMap. (BTW: I'm sure it's much better mapped now).
Of course, OSM is about fixing things, so I carried out the conversation in between adding around a hundred streets to the city. However the really interesting question was this one:

Whilst at breakfast I thought a little more about this. I decided it ought to be possible to do something fairly simple with data which already exists.