Thursday 11 December 2014

City Stripping: building historical road layouts from today's data

Buenos Aires - Plano de Basch (1895)
Street map of Buenos Aires 1895 (plano de Basch)
Source: Wikimedia Commons

For at least one year I have been thinking about the best way to create vector streetmaps for different historical periods for the same city. The basic principle is simple: take the existing street layout and remove what is new! Essentially, this presumes that any current vector data will be more precise than digitising vectors from old maps.

I have even given the process a name: "City Stripping", but in practice my attempts so far have been less successful than I hoped. I think that I kept too much existing OpenStreetMap data in my first attempts, because in practice I found it easier to re-trace the old data (as here for Tartu/Dorpat).

State of the Map 2014 in Buenos Aires provided another opportunity to try this out. Fortunately a small number of historical maps were available on Wikimedia Commons. I chose two: one from 1870 and the Plano de Basch of 1895.

The city government have an extensive collection of old aerial photos on their Buenos Aires map site, so it is also possible to explore other periods. In future these old map data may become available in ways which allow their use with OpenStreetMap and OpenHistoricalMap.

Workflow Overview

The process I used for these two maps seems to work well, so here I attempt to document it fully. The main steps are:
  • Rectify historical map images. This is best done using one of two implementations of Map Warper: the one hosted by Tim Waters (not the similar one used internally by NYPL) or the one embedded in Wikimedia (use the latter for Wikimedia images as this simplifies sharing; you may need to add the map template to make this easier). It is also possible to warp (georeference) a downloaded map using QGIS or another GIS tool.
  • Create a JOSM imagery reference for the rectified map. (MapWarper should provide a suitable URL)
  • Trace historical bounds of city. Using JOSM (or possibly QGIS) with the poly plugin loaded, trace round the built-up limits of the city for the given historical map.
  • Create a .poly file. Save the polygon created in .poly filter file format.
  • Grab an extract of the current data for the city (e.g., from Geofabrik's download site). Usually it is worth extracting a suitable bounding box from the extract before further processing.
  • Filter the data for only highways. Using osmfilter (or osmosis with --tag-filter) create a file which only contains highway data. A more sophisticated filter will remove certain highway tag values (e.g., cycleway, footway, path).
  • Remove unnecessary tags and ways. Optionally remove (historically) irrelevant tags from the filtered data (e.g., oneways, lit, parking restrictions, cyclelanes etc).
  • Clip the filtered data. Now clip the filtered data by the historically latest of the .poly files (I used osmconvert, but osmosis will do too).
  • Review the clipped data in JOSM to remove streets and other features not shown in the historical map.
  • Add start_date and end_date tags. (See below for discussion of what it is appropriate to choose here).
  • Load into OpenHistoricalMap.
Data from a more recent historical period can then be used as input for the same process for earlier periods. Hence the name: the city is progressively stripped back to its historical core.

It's important to remember that, as with OpenStreetMap, many aspects are open to iterative refinement (such as the accuracy of the map warping stage), but the overall workload is considerably reduced if everything is carefully planned in advance. Given that the workflow is easily replicated, it is probably best to plan on repeating the process a couple of times.

To illustrate this workflow I provide details for each step using the Buenos Aires 1870 and 1895 maps with lots of additional comments. I hope these will help people avoid some pitfalls. I've added a selection of historical photos from the period 1860 to 1899 to give a more realistic flavour of what the bare-bones vector data is representing, and to break up the text a bit.

If you are not interested in the details, you may wish to skip to the Conclusions.

Detailed breakdown of the Workflow for "City Stripping"

Map Rectification

The primary purpose of map rectification in this workflow is to create a simple crude outline polygon of the mapped area of a city from the historical map. It is therefore not absolutely necessary for the rectification to be highly accurate in the first instance. However, I would recommend using as many control points as possible. It is my experience that as soon as one views vector data and the raster historical map together all sorts of questions arise.

Normally I use OSM as the basis for rectification (mainly to avoid potential derived-data issues with using aerial images). This may be an issue in the countryside, where there may not be enough features to provide a good range of control points (the general solution, as with the Irish Townlands project, is to map more detail on OSM!).

For control points I normally choose obvious street junctions (for instance, where streets meet at an acute angle), prominent historical buildings, and bridges. Ideally they should be evenly distributed across the map area to be warped.

1870 Buenos Aires map in Map Warper with 4 control points.
The example above is rectified with only 4 control points. Alignment is not perfect and can be improved by adding more control points.

Detail of above map showing minor alignment errors after rectification.
The Buenos Aires rectilinear grid meant that control points had to be chosen with greater care than I'm used to. It is really important that one does not select an intersection one block out. Once the basic rectification has been created and checked it becomes much easier to add additional points, not least because errors show up more obviously.

Similar area from the 1895 map using 30 control points

Adding Imagery References to OSM editors

Map Warper has a link for JOSM WMS files but I think this currently does not work. Instead select the Export tab and select the Google/OSM Tiles scheme url. This will look something like "{z}/{x}/{y}.png/"

This works fine in iD, and should work in Potlatch 2, but needs a minor change for use in JOSM: the "{z}" needs to be replaced by "{zoom}". This should be added in the TMS section of the Imagery preferences for JOSM, in which case it should look something like this: "tms[19]:{zoom}/{x}/{y}.png".
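The substitution is mechanical, so it could be scripted if several rectified maps are involved. What follows is a minimal Python sketch of that idea. The function name and the example.org address are placeholders, not the real Map Warper URL, and the maximum zoom of 19 is simply taken from the example string above.

```python
def to_josm_tms(tile_url: str, max_zoom: int = 19) -> str:
    """Turn a Google/OSM style tile URL into a JOSM TMS imagery string."""
    if "{z}" not in tile_url:
        raise ValueError("expected a {z} placeholder in the tile URL")
    # JOSM expects {zoom} where iD and Potlatch 2 accept {z}.
    return f"tms[{max_zoom}]:" + tile_url.replace("{z}", "{zoom}")


# Hypothetical URL: substitute the one shown on the Map Warper Export tab.
example = to_josm_tms("https://example.org/tiles/{z}/{x}/{y}.png")
print(example)
```

The resulting string is what would be pasted into the TMS section of the JOSM Imagery preferences.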

Creating .poly Files

Poly files are used by OSM tools to clip an OSM file with a polygon. Their format is described on the wiki. In the past I used a QGIS plug-in to create these files, but I haven't used this since QGIS 1.8 and don't know if it is available with current versions of QGIS. I therefore opted to create it in JOSM.

Enable Poly Plugin for JOSM

Firstly, the Poly Plugin (by Zverik) needs to be installed. On my version of JOSM this required that JOSM was restarted after I had added it to my plugins.

A polygon drawn around the 1895 built-up area of Buenos Aires in JOSM
Note rectified map layer and absence of tags
Draw the polygon in a new layer in the normal way. There is no need to add any tags to the polygon. Check that it is a polygon, and simply save it using File|Save As... choosing .poly file format.
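For anyone who would rather script this step, the .poly format is simple enough to write directly. The Python sketch below assumes a single outer ring given as (longitude, latitude) pairs. The function name, the polygon name and the coordinates are all invented for illustration, and a real boundary traced from the map would have many more points.

```python
def to_poly(name, ring):
    """Format one outer ring of (lon, lat) pairs as .poly text."""
    if len(ring) < 3:
        raise ValueError("a polygon needs at least three points")
    points = list(ring)
    if points[0] != points[-1]:
        points.append(points[0])  # close the ring explicitly
    lines = [name, "1"]  # polygon name, then the name of the first ring
    lines += [f"   {lon:.7f}   {lat:.7f}" for lon, lat in points]
    lines += ["END", "END"]  # first END closes the ring, second the file
    return "\n".join(lines) + "\n"


# Invented coordinates roughly around central Buenos Aires.
rough_ring = [(-58.42, -34.63), (-58.36, -34.63), (-58.36, -34.58), (-58.42, -34.58)]
poly_text = to_poly("buenos_aires_1895", rough_ring)
print(poly_text)
```

Writing poly_text to a file with a .poly extension should give something the clipping tools can read, although saving from JOSM remains the easier route when the polygon is traced by hand.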

Choose an Extract

Plaza de Mayo ca. 1896 (Archivo Witcomb)
Plaza de Mayo c.1896
Source: Archivo General de la Nación Argentina via Wikimedia Commons.
As this process is meant to be applied to a single town or city it should be possible to start with the current OSM data for the area. However, the main provider of OSM extracts, Geofabrik, might not have a suitably small extract, in which case a country or other larger region will need to be chosen. For Buenos Aires I had to use an Argentina file. I initially clipped this with a bounding box to a 1 degree square including Buenos Aires (using osmconvert) to provide my starting file.
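As a rough illustration of the bounding box step, the Python sketch below computes a 1 degree square and formats it as osmconvert's -b= option, which takes west, south, east and north in that order. The centre point is only an approximate location for Buenos Aires, centring the square on it is an assumption, and the file names are placeholders.

```python
def one_degree_bbox(lon, lat):
    """Return (west, south, east, north) for a 1 degree square centred on a point."""
    return (lon - 0.5, lat - 0.5, lon + 0.5, lat + 0.5)


def bbox_argument(bbox):
    """Format the box as osmconvert's -b= option."""
    return "-b=" + ",".join(f"{value:.4f}" for value in bbox)


# Approximate centre of Buenos Aires; file names are placeholders.
bbox = one_degree_bbox(-58.4, -34.6)
command = [
    "osmconvert",
    "argentina-latest.osm.pbf",
    bbox_argument(bbox),
    "--out-o5m",
    "-o=buenos_aires_bbox.o5m",
]
print(" ".join(command))
```

The printed line can be pasted into a terminal, or the list passed to subprocess.run if osmconvert is installed.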

The BBBike service may be a good alternative source. It offers pre-generated extracts for some 200 major cities worldwide, or a user-defined bounding box can be specified on the website.

Filtering the data.

Hotel de Inmigrantes Retiro AGN
Hotel de Inmigrantes , Buenos Aires, Retiro
Located near the current Bus Station.
Source: Archivo General de la Nación Argentina via Wikimedia Commons.
There are numerous ways of filtering the data. I find osmfilter fairly easy to use and it's easy to tweak the filters to reduce the amount of manual tidying-up required after the automated filter step. For osmfilter it is necessary to store the data in either .o5m or .osm (XML) formats. The former requires less space. If using osmconvert to prepare the initial extract then the results can be stored in this format for further processing. If using osmosis to do the extraction then osmconvert is needed to convert the clipped file from .pbf format.

I use osmfilter with a parameter file (see here on the wiki), and the following rules work reasonably well. I'm sure they can be refined. Anyway, they may need tweaking to reflect local mapping conventions.

highway=pedestrian && area=yes


This keeps all elements tagged with highway, and then removes those which are unlikely to be needed as historical data (footways were one of the things which hindered the process in Tartu). Note that only pedestrian areas are discarded: many pedestrian streets are likely to be historically significant and therefore should be retained. Lastly, various tags which are often placed on highways or individual nodes of a highway are removed too.
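The decisions described here can be written out in a few lines of Python. This is only an illustration of the logic, not osmfilter syntax, and the specific tag lists are assumptions based on the examples mentioned in this post. They would need adjusting to local mapping conventions.

```python
DROP_HIGHWAY_VALUES = {"footway", "cycleway", "path"}  # assumed list
DROP_TAG_KEYS = {"oneway", "lit", "cycleway"}          # assumed list
DROP_TAG_PREFIXES = ("parking:",)                      # assumed prefix


def filter_tags(tags):
    """Return cleaned tags for an element to keep, or None to discard it."""
    highway = tags.get("highway")
    if highway is None or highway in DROP_HIGHWAY_VALUES:
        return None
    if highway == "pedestrian" and tags.get("area") == "yes":
        return None  # pedestrian areas go, pedestrian streets stay
    return {
        key: value
        for key, value in tags.items()
        if key not in DROP_TAG_KEYS and not key.startswith(DROP_TAG_PREFIXES)
    }


street = {"highway": "residential", "name": "Perú", "oneway": "yes",
          "parking:lane:both": "parallel"}
print(filter_tags(street))
```

Keeping the rules in one place like this makes it easy to see what a change to the lists would discard before rerunning the real filter.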

Shows extent of modern Buenos Aires compared with 1895 and 1870
Original OSM Extract with historical highways for 1895 and 1870 overlaid.
Gives a good impression of the relative size of the data sets.
This step is quick, so it is worthwhile spending time tweaking the rules. You will save much time at later stages in the process.

Clean-up tags on the Highway data

Dique 2 de Puerto Madero en construcción (AGN)
Dock No. 2 Puerto Madero under construction 1891
The development of the dock area is one of the more obvious changes from 1870 to 1895.
Source: Archivo General de la Nación Argentina via Wikimedia Commons.

Before proceeding further it is important to review the data.

JOSM is perfect for this, although osmfilter has some useful features too.

Simply selecting everything in JOSM will give an idea of which tags have come through the filtering which are not needed. Either clean these up in JOSM or refine the filter used in the previous step.

A number of highways will need to be removed. These are likely to include modern flyovers, and many dual-carriageways may need one of the two carriageways removing. For instance all the bus service ways along Avenida 9 de Julio were superfluous.

Many highway segments can be merged because the differences in tagging which caused the highway to be chopped into lots of pieces have gone. This was very significant in Buenos Aires because usually a highway had one element per block to accommodate tags for parking rules, cycleways and oneway. Once again, reducing the number of elements at this stage simplifies manipulation downstream.

The idea is really to do these manipulations once, and then all changes cascade down the years as the city is stripped back.
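The merging of per-block segments can be pictured with a small Python sketch. It is not how JOSM's Combine Way tool works internally, just the underlying idea: once the tags on two ways are identical and one ends where the other begins, they can be joined. The data structure (a list of node ids plus a tag dictionary) and the node ids are invented, and the sketch deliberately ignores ways drawn in opposite directions.

```python
def merge_ways(ways):
    """Greedily join ways that share tags and meet end to start.

    `ways` is a list of (node_ids, tags) pairs.
    """
    merged = [(list(nodes), dict(tags)) for nodes, tags in ways]
    while True:
        pair = next(
            ((i, j)
             for i, (a_nodes, a_tags) in enumerate(merged)
             for j, (b_nodes, b_tags) in enumerate(merged)
             if i != j and a_tags == b_tags and a_nodes[-1] == b_nodes[0]),
            None,
        )
        if pair is None:
            return merged  # nothing left to join
        i, j = pair
        # Append way j to way i, dropping the duplicated shared node.
        merged[i] = (merged[i][0] + merged[j][0][1:], merged[i][1])
        del merged[j]


peru = {"highway": "residential", "name": "Perú"}
mayo = {"highway": "residential", "name": "Avenida de Mayo"}
blocks = [([1, 2], peru), ([2, 3], peru), ([3, 4], peru), ([3, 9], mayo)]
print(merge_ways(blocks))
```

In this toy example the three one-block pieces of Perú collapse into a single way, while the cross street stays separate because its tags differ.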

Clipping the data.

A very straightforward process. Either osmosis or osmconvert can be used. I prefer the latter, and save the results in .o5m format.
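For reference, a clip with osmconvert can be expressed as below. The sketch only builds the command line, using the -B= option for the polygon file and --out-o5m for the output format. All three file names are placeholders.

```python
def clip_command(source, poly_file, output):
    """Build an osmconvert call that clips `source` to the polygon in `poly_file`."""
    return ["osmconvert", source, f"-B={poly_file}", "--out-o5m", f"-o={output}"]


command = clip_command("highways.o5m", "buenos_aires_1895.poly", "highways_1895.o5m")
print(" ".join(command))
# To run it for real (requires osmconvert on the PATH):
#   import subprocess; subprocess.run(command, check=True)
```

Clipping the 1870 snapshot would then reuse the 1895 output as its source, in keeping with the idea of stripping the city back period by period.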

Review the data.

Now we have an OSM file which should represent the street layout of our city at a given point in history. Pull it into JOSM and compare it with the previously rectified map layer.

In the main this is about removing roads which didn't exist in the given historical period and adding roads which have since disappeared. One of the benefits of using Buenos Aires is that the gridded layout makes such comparisons very easy.

Add History tags

Perú y Avenida de Mayo
Avenida de Mayo junction with Perú. (location on OHM)
Note details of road surface and presence of sidewalks (br-en: pavements).
Source: Archivo General de la Nación Argentina via Wikimedia Commons.
Now select all tagged elements, and add start_date and end_date tags to each. Given that in this process we are creating multiple snapshot views of the city, one can choose the range in a fairly arbitrary way, providing the dates don't overlap for any two snapshots. The obvious strategies are:
  • Use the year for which the rectified map applied (so for my examples of 1870 and 1895, this would mean using 1870-01-01 to 1870-12-31 and 1895-01-01 to 1895-12-31 as the date ranges).
  • Use a longer period, say, the decade ending with the map date. This better reflects the likelihood that the map was not completely up to date, and was the option I chose for Buenos Aires.
  • Make the date ranges contiguous from one snapshot to the next, so the Basch map data would have a start_date of 1871-01-01 and an end_date of 1895-12-31.
See the latter part of this post for a more detailed discussion of these tags.
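The three strategies above can be compared side by side with a short Python sketch. The function and strategy names are made up for this illustration. Two assumptions are built in: a "decade" is taken as ten calendar years inclusive of the map year, and under the contiguous strategy the earliest snapshot, which has no predecessor, falls back to the decade rule.

```python
def snapshot_ranges(years, strategy):
    """Return start_date/end_date tags for each snapshot year."""
    ordered = sorted(years)
    ranges = {}
    for index, year in enumerate(ordered):
        if strategy == "year":
            start = year
        elif strategy == "decade":
            start = year - 9  # ten years inclusive, e.g. 1886 to 1895
        elif strategy == "contiguous":
            # Start the day after the previous snapshot ends; the earliest
            # snapshot has no predecessor, so use the decade rule (assumption).
            start = ordered[index - 1] + 1 if index > 0 else year - 9
        else:
            raise ValueError(f"unknown strategy: {strategy}")
        ranges[year] = {"start_date": f"{start}-01-01", "end_date": f"{year}-12-31"}
    return ranges


for name in ("year", "decade", "contiguous"):
    print(name, snapshot_ranges([1870, 1895], name))
```

Whichever rule is used, the ranges for two snapshots never overlap as long as the map years are more than a decade apart, which is the property that matters here.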

Load into OHM

We are now almost ready to load the data into OHM.
  • Firstly, because the data was originally already present in OSM it has metatags for user, version etc., and it already has identifiers assigned. We don't want these OSM metadata in OHM, so in JOSM select the layer and create a new copied layer. Check that OSM elements in this new layer appear as new elements (use View|Advanced Info Ctrl-I). (In practice it might be better to do this at one of the earlier stages). Save the copied layer, noting that a careful choice of convention for file names is useful.
  • Secondly, we really don't want to reload this data back into OSM! To make sure that JOSM is pointing to OHM it is necessary to change the default API parameters using the Preferences menu. Deselect the default option and enter the OHM API URL in the dialog box. I'm fairly paranoid so I double check that I'm not accidentally uploading to OSM before I start the OHM load. (I also make sure I have the reverter plugin loaded and know how to use it).
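To make the first of these points concrete, here is a minimal Python sketch of what "appearing as new elements" means in an .osm XML file: the OSM metadata attributes are removed and every node and way gets a fresh negative id. It illustrates the idea only and is not what JOSM does internally. It assumes the file contains just nodes and ways, and that every node referenced by a way is present. The sample data is invented.

```python
import xml.etree.ElementTree as ET

METADATA = ("version", "timestamp", "changeset", "uid", "user")


def as_new_elements(osm_xml):
    """Strip OSM metadata and renumber nodes and ways with negative ids."""
    root = ET.fromstring(osm_xml)
    new_ids = {}
    counter = 0
    for element in root:
        if element.tag not in ("node", "way"):
            continue
        counter -= 1
        new_ids[(element.tag, element.get("id"))] = str(counter)
        element.set("id", str(counter))
        for key in METADATA:
            element.attrib.pop(key, None)
    # Point way members at the renumbered nodes.
    for nd in root.iter("nd"):
        nd.set("ref", new_ids[("node", nd.get("ref"))])
    return ET.tostring(root, encoding="unicode")


sample = """<osm version="0.6">
  <node id="101" version="3" user="someone" uid="1" changeset="9" lat="-34.6" lon="-58.4"/>
  <node id="102" version="1" user="someone" uid="1" changeset="9" lat="-34.6" lon="-58.39"/>
  <way id="500" version="2" user="someone" uid="1" changeset="9">
    <nd ref="101"/><nd ref="102"/>
    <tag k="highway" v="residential"/>
  </way>
</osm>"""
print(as_new_elements(sample))
```

Copying to a new layer in JOSM achieves the same effect without any scripting, so this is mainly useful for checking a saved file before upload.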
Having double checked these last two steps, hit the upload button. Hopefully you will get something like this:


Basílica del Socorro (Archivo Witcomb)
Basílica del Socorro. 1880 (corner of Suipacha/Juncal)
An example of the type of historical photo needed to add details to OHM.
Source: Archivo General de la Nación Argentina via Wikimedia Commons.

Conclusions

Writing up the detailed process revealed to me why my first attempts failed. Although each step is, in itself, straightforward, the sheer number of steps makes the whole process rather elaborate.

Probably the critical step is the filtering of original OSM data. Getting osmfilter to reduce the amount of manual correction is essential. The whole idea is to reduce time spent inputting basic data in order to provide a basis for incorporating other information.

The process also illustrates the important utility of being able to use software tools already developed around OpenStreetMap.

Other Observations

  • MapWarper could use other OSM derived layers. Warping maps uses the standard OSM CartoCSS/Mapnik layer. Often this has too much detail which is not particularly useful for creating control points. Renderings with fewer colours, perhaps aimed for use with thematic overlays, may be better for this process. Currently other alternatives are not included in the settings of MapWarper.
  • Similar approaches have been used independently. Although I envisaged using this process shortly after being at the Tartu workshop, this was the first time I really worked out a decent workflow. I know that Professor Richard Rodger's group at Edinburgh University have been using a very similar concept within their social history project MESH.
  • Make Notes. Even when using this process as a fairly routine workflow one notices lots of things about the city layout which raise questions about how it developed. For instance in Buenos Aires, neither Avenida 9 de Julio nor the two Diagonals existed in 1895. No doubt if I lived there I'd know why, but as it stands this is something which I need to go and research (actually the former is not present in aerial photos from 1940). Many such questions come up: it's best to keep track of them. Such questions will often be meat-and-drink for many of the uses of OpenHistoricalMap data (see my post on historical road layouts for some detailed examples).
  • Making historical maps is a research process. Unlike classical OSM where map making is about surveying, OHM data involves research. The research itself is at least as important as the map making: ultimately historical maps are used to tell stories. Linking the map, the research, and a range of archival materials means storytelling. Susanna has already started discussing these aspects in the context of the Nordic Wikimaps project; and, for instance, the People's Collection Wales framework is meant to support such an approach.


Creating this data is really just the first step. In a future post(s?) I will write about how the two data snapshots of the city can be visualised and used.
