Wednesday, 24 April 2013

Segmentation of Retail Landuse: why do Germans only map shops?

Retail Landuse in Karlsruhe on OSM (scale 1:50k) :
both explicitly mapped and derived landuse polygons are shown, see below for methods.
The availability of lots of open data on various kinds of retail outlets has led me to map a lot of shops, restaurants, fast-food outlets and bars lately. I'm following in the steps of Paul the Archivist, who mapped the Mansfield Road area close to pub meetings we had in 2011. I've got a nice workflow: I map my target area for an hour on Sunday morning, usually trying to get all retail establishments in a quite small area. I also try to collect as many house numbers as I can. I discovered this new mapping protocol when my mother asked me to drive her to church a few weeks ago, and I found it very productive. It also means I've been actively surveying in a neglected inner-city area.

One effect is that I've been adding (and tidying up) local retail landuse in OpenStreetMap. And then, in my twitter timeline, I came across an article on Mapcite with the following image:

Geolytix Retail Classification and rent profiles for Nottingham
source Mapcite Blog copyright Geolytix.
This is produced by professional retail analysts. Perhaps if it didn't cover the Nottingham area it might not have grabbed my attention so immediately. However, it is worth pointing out that Nottingham is the heart of the 8th largest urban area in the United Kingdom, and is the 8th largest retail centre. It therefore has a good degree of complexity, but is not so large that it cannot be conveniently visualised. In a way it supports my belief that by looking at the particulars of mapping in one area, one can discover things which are more generally applicable.

Besides the immediate reaction of "How many of these areas do we have mapped out in OSM?", I was much more interested in the classification (dare I say it, ontology) of retail landuse shown in the upper part of the legend. Below, I give my personal interpretation of the Geolytix categories:
  • Town and City Centres (Major City, City, Large Town, Town, Small Town): I think these are self-evident, and will mainly depend on the size of the town/city, although there may be some refinements based on catchment area and range of shopping categories. In major conurbations such as London, I am not sure how one treats places like Kingston-upon-Thames, Ealing and Croydon, all of which are high up the lists of major retail locations in the UK.
  • Village Centre: villages with a range of shops are fast disappearing even in the most prosperous parts of Britain. I had difficulty finding a suitable photo (see below) from Geograph as many places I looked had images of closed shops, or showed a group of shops indistinguishable from the parade (see below). However, I would expect greater diversity, both in the range of shops and the type of retail premises. The example I've chosen is the prosperous Surrey village of Ripley (where Eric Clapton grew up).
  • Urban Centres (Major Urban, Urban): I'm not quite sure what is meant by this, but imagine it refers to strong concentrations of retail premises outside the city centre. By my interpretation this might include places like Altstetten, Langstrasse, both in Zurich; Radford Road in Nottingham.
  • Local Hub: Again I'm not sure what geographical unit this might refer to, and I can't even speculate.
  • Parades (Strong Parade, Parade, Weak Parade). To me a Shopping Parade is associated with London suburbia. A parade is a row of shops in the middle of suburban housing, often with flats above the shops. Historically, the parade replaced the earlier corner shop as housing changed from dense urban patterns to less dense suburban patterns. In the beginning the classic parade would have had a Post Office, a greengrocer, a butcher, a grocer, a newsagent, a chip shop, a barber or hairdresser, and perhaps a pharmacy or bakery. These days they will still have a hairdresser, but the post office, newsagent, butcher, grocer etc. will have been replaced by one or more convenience stores, and there will be several fast food outlets. I have no real notion as to whether other countries have similar retail groupings. Presumably strong and weak are mainly size indicators.
  • Retail Park: Warehouse style sheds with extensive parking, usually within an urban conurbation. Frequently on former brownfield land, such as old railway sidings. Numerous examples local to the East Midlands (Wyvern Park, Castle Marina, Netherfield, Lady Bay). Many in the United States, such as this one outside Hyannis, MA.
  • Regional Shopping Centre: I presume this refers to places such as Lakeside, Meadowhall, Braehead etc. in the UK; or Mall of America in Bloomington, MN.
  • Outlet Centre: examples Clarks Village, Street; Cheshire Oaks, Ellesmere Port; numerous examples near Mendrisio (CH).
  • Airport: self-evident, major airports in the UK have been shopping malls for a long time. It used to be that 10% of all books sold in Britain were bought at Gatwick Airport.
  • Rural: specialist farm shops, garden shops etc.
And some photo examples, all from Geograph (for credits click on the images):

SK5839 : Lady Bay Retail Park by Alan Murray-Rust
SK5644 : Shops on Arnold Road by Alan Murray-Rust
ST4836 : Clarks Village Outlet Shopping by Nigel Freeman
TQ0556 : High Street by Stuart Logan

There are no doubt other classes which might be more useful in OSM. For instance marketplaces (already marked by amenity=marketplace), bazaars, shopping galleries and arcades (e.g., Burlington Arcade, London; Exchange Arcade, Nottingham; and Galleria Vittorio Emanuele II in Milano), and traditional covered shopping centres (e.g., Victoria Centre, Nottingham; Letzipark, Zurich; Cityplaza at Taikoo Shing, Hong Kong; Galeria, Krakow). Note that in this case I'm using the British English Shopping Centre rather than the American English Mall, which is used in the wholly inadequate tag shop=mall.

I thought therefore it would be interesting to see if we could learn anything from OSM about categories of retail. I turned to Germany and extracted all landuse=retail areas, all shops and bars, restaurants, pubs and fast food outlets. I had hoped that retail landuse would be mapped as thoroughly as other things in Germany, but inspecting a few major cities (Karlsruhe, Munich, Hamburg) revealed that although shops have often been mapped in detail, retail areas, particularly in large cities, have not been mapped. My plan was to create a dataset suitable for pushing through RapidMiner to see if there were any interesting classifications generated by a decision tree. In the absence of a meaningful set of retail areas from the German extract this plan is in abeyance.

Instead I have been trying to use shop data to see if I can derive a reasonable set of retail landuse polygons for Germany myself. So far I have made reasonable progress:
  • I downloaded the latest Germany extract from Geofabrik.
  • Using osmconvert and osmfilter, I extracted landuse=retail, place=*, shop=* and amenity=restaurant/pub/bar/fast_food, and loaded these into a Postgres database using osm2pgsql.
  • I added additional geometry columns to all planet_osm tables and updated these with geometries in the ETRS89 projection as this enables sensible distance and size measurements.
  • For each shop I calculated a 100 m grid location within ETRS89.
  • The distinct set of grid locations was then used as the basis for retail polygons.
  • Adjacent grid squares were detected using PostGIS functions
  • Contiguous groups of grid squares were identified using the same graph traversal algorithm described in an earlier post.
  • Concave and convex hulls were calculated (or attempted to be calculated) based on the locations of shops within each grid-based polygon. (These should basically provide a slightly more realistic outline of a potential retail area than the original polygon).
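The gridding and grouping steps above can be sketched in a few lines of plain Python. This is only an illustration, not the PostGIS workflow actually used: the shop coordinates are made up, the names grid_cell and group_cells are hypothetical, adjacency is assumed to include diagonal neighbours, and a simple breadth-first traversal stands in for the graph traversal from the earlier post.

```python
from collections import deque

CELL = 100  # grid size in metres, matching the 100 m grid in the steps above


def grid_cell(x, y, size=CELL):
    """Snap a projected coordinate (in metres) to the index of its grid square."""
    return (int(x // size), int(y // size))


def group_cells(cells):
    """Group grid squares into contiguous clusters.

    Assumption: two squares are adjacent if they touch at an edge or a
    corner (8-neighbour adjacency). A stricter edge-only rule would give
    smaller clusters.
    """
    remaining = set(cells)
    groups = []
    while remaining:
        start = remaining.pop()
        group = {start}
        queue = deque([start])
        while queue:  # breadth-first traversal of neighbouring squares
            cx, cy = queue.popleft()
            for dx in (-1, 0, 1):
                for dy in (-1, 0, 1):
                    neighbour = (cx + dx, cy + dy)
                    if neighbour in remaining:
                        remaining.remove(neighbour)
                        group.add(neighbour)
                        queue.append(neighbour)
        groups.append(group)
    return groups


# Hypothetical shop positions in projected metres (not real data).
shops = [
    (4210010.0, 2930020.0),
    (4210150.0, 2930080.0),
    (4210260.0, 2930130.0),
    (4215000.0, 2935000.0),
]
cells = {grid_cell(x, y) for x, y in shops}
clusters = group_cells(cells)
print(sorted(len(c) for c in clusters))
```

With these made-up points the first three shops fall in touching squares and form one cluster, while the fourth is isolated. Each cluster of squares is a candidate retail polygon.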
The top image shows Karlsruhe; below I show Hamburg, Berlin and Munich. In each I show existing retail polygons (those in which I found shops in the bounding box, and those in which I found no shops), the polygons generated from ETRS89 100 m grid squares, and an attempted improvement of these using convex hulls in Postgres. In each city, most retail landuse has not been mapped explicitly.

This is preliminary work, so I'm not sure why I did not find any shops within the polygon for Viktualienmarkt in Munich, and I have not used food and drink outlets, which will affect the size of retail polygons. I also had problems generating better concave hulls in PostGIS (the function gave errors on a small number of polygons).
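For readers who want to experiment with the hull step outside the database, here is a minimal sketch of a convex hull in plain Python, using Andrew's monotone chain method. It is not the PostGIS function used above, and the function names and sample points are hypothetical. Note that a group with fewer than three distinct shops has no proper hull; the sketch simply returns the points unchanged in that case.

```python
def cross(o, a, b):
    """Z-component of the cross product of vectors o->a and o->b."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Return hull vertices in counter-clockwise order (monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts  # degenerate case: a point or a line segment

    lower = []
    for p in pts:
        # Drop the last vertex while it makes a clockwise or straight turn.
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)

    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)

    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]


# Hypothetical shop positions (metres) within one grid-based cluster.
hull = convex_hull([(0, 0), (100, 0), (100, 100), (0, 100), (50, 50)])
print(hull)
```

The interior point (50, 50) is dropped and only the four corners remain. A concave hull would follow the shop locations more closely, but a convex hull is defined for any set of three or more non-collinear points.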

Retail Landuse in Munich (scale 1:50k)
Retail Landuse in Hamburg (scale 1:75k)
Retail Landuse in Berlin (scale 1:125k)

Munich clearly illustrates the limitation of an approach based solely on shop locations, as distinct retail areas, such as the city centre and Schwabing, run into each other. It is clear that local knowledge in mapping can discriminate between different areas far better than this: I would therefore advocate more intensive mapping of retail areas in Germany! To support such mapping, a shapefile or OSM format XML is available from the author by email.

Lastly, my original goal of finding automated segmentation methods for retail classification from OSM data is still not achieved. Perhaps I'll be able to report on this soon.


  1. Hi Jerry,

    In London the following hierarchy is used:

    International Centre (e.g. the West End)
    Metropolitan Centre (e.g. Croydon, Ealing)
    Major Centre (e.g. Peckham, Camden)
    District Centre (e.g. Crystal Palace, Twickenham)

    You can see a map of these in chapter 2, page 64 of the London Plan:

    The fuller definitions and a full list is in the annex 2 from the same page.

    It's a lot more blunt than the hierarchy you mention, and completely ignores the massive Westfield shopping centres and various retail parks. But it works quite well as a hierarchy for place=city/town/village/hamlet.

    Maybe you could run your analysis in central London, where successive mapping parties have tried to map retail land uses and shops.

  2. Hi Tom,

    Thanks for the input. I'm sure that hierarchy would map fairly comfortably onto the City/Town centre perspective used by Geolytix. It certainly answers my question about how the areas might be handled in London.

    However, I suspect that a planner's view might be different from that of someone interested in store location.

    Central London is too small and too distinct to try and find segmentation algorithms, and it's the rural and suburban areas where I think these might be most interesting.

    I think I need a random sample of around 1000 retail areas to have a chance of getting reasonable decision trees. In the UK we have 63,141 shops mapped; Germany has 210,988 (i.e., 250% more on a like-for-like (population) basis). We do have many more landuse=retail polygons mapped in the UK (around 10,000 compared with 6,208 in Germany).

  3. Maybe just a small comment on why so few areas are mapped in Germany: judging by the usual discussions on the talk-de mailing list, there is (for any mapping) a strong focus on the "ground truth" rule in the German community.
    Putting areas or other information into the database that cannot be observed on the ground but has to be extracted from official (or sometimes not-so-official) documents is often frowned upon.
    A lot of that information is highly specific, not verifiable, not maintainable (without the help of the original information provider) and often even subjective on the provider's part. (In your case, there might be more than one "retail analyst" with different zone definitions.) In addition, there might be copyright problems if it is not provided, but merely obtained.

    More examples (which have been discussed) are the delivery zone of fast-food delivery services, borders of cadastral districts, informal place names as areas, drainage area of rivers, names of hiking routes (= specific combination and order of ways) etc.