The Google Summer of Code Doc Summit and OSM

For five days in October the Google Summer of Code Doc Summit, organized together with FLOSS Manuals, will bring together four documentation teams from open source projects, guest speakers, and free documentation ‘free agents’ to discuss everything and anything concerning the free documentation of free software. The event will feature a two-day unconference and a three-day Book Sprint. During the Book Sprint each project will produce a book ready for distribution in print and electronic formats.
The event is an ambitious project. Unconferences about free software documentation are scarce, and never before has a Book Sprint been attempted with four projects working simultaneously, each on its own book. It’s going to be an extremely interesting and challenging event.
Documentation has often been a very low priority for free software projects, and it frequently suffers from common flaws, including:
  • no documentation existing at all
  • assumptions about the user’s knowledge are set too high
  • poor navigation
  • unexplained jargon
  • there is no visual component
  • the documentation is proprietary or ‘closed’
  • the format is unreadable
  • no translation workflow
  • operational steps are missing, unexplained, written ‘from memory’ or state how the software ‘should’ operate
  • the documentation is out of date, not easily re-usable or not easily modifiable.
The Google Summer of Code Doc Summit will discuss these problems, attempt to address them, and look towards positive models for documentation production. We hope to shine light on the importance of the free software documentation ‘sector’ in the ecology of free software. Free (libre) documentation is not simply an aid for learning how to use free software; it is a road into education and adoption in industry, a tool for demonstrating to clients how free software will meet their needs and expectations, and an important promotional tool for the advancement of free software. A healthy free documentation sector is both socially and economically empowering. We believe the efforts and ideals behind free documentation of free software should be valued on the same level as free software itself, and that is exactly what we plan to do at this Summit.
The Google Summer of Code Doc Summit is more than a think tank and an opportunity to discuss real-world issues. Four projects, OpenMRS, KDE, Sahana, and OpenStreetMap, will have a chance to directly strengthen their documentation efforts. We look forward to working together with each of the selected teams and individuals to help them produce their own book by the end of the five-day summit.
It’s going to be a great event.

Render with Mapnik

I explained a little about the process by which you can overlay raster data on Bing Maps, by geo-referencing the source data (if necessary), projecting into the appropriate spherical Mercator projection, then cutting the resulting image into 256px x 256px tiles named according to the quadkey tile numbering system.
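
As a rough illustration of the tile-naming step, here is a minimal Python sketch that finds the spherical Mercator tile containing a latitude/longitude and converts the tile's X/Y position into a quadkey. The function names are my own (hypothetical), and the formulas follow the publicly documented Bing Maps tile system as I understand it, so treat this as a sketch rather than a reference implementation.

```python
import math

# Spherical Mercator is only defined up to roughly +/-85.05 degrees latitude.
MAX_LATITUDE = 85.05112878


def lat_lon_to_tile(lat_deg, lon_deg, zoom):
    """Return (tile_x, tile_y) of the tile containing a WGS84 point."""
    lat = max(min(lat_deg, MAX_LATITUDE), -MAX_LATITUDE)
    tiles_per_side = 2 ** zoom
    # Normalised map coordinates in the range 0..1 (origin at top-left).
    x = (lon_deg + 180.0) / 360.0
    sin_lat = math.sin(math.radians(lat))
    y = 0.5 - math.log((1 + sin_lat) / (1 - sin_lat)) / (4 * math.pi)
    tile_x = min(int(x * tiles_per_side), tiles_per_side - 1)
    tile_y = min(int(y * tiles_per_side), tiles_per_side - 1)
    return tile_x, tile_y


def tile_to_quadkey(tile_x, tile_y, zoom):
    """Interleave the bits of tile_x and tile_y into a base-4 quadkey string."""
    digits = []
    for level in range(zoom, 0, -1):
        mask = 1 << (level - 1)
        digit = 0
        if tile_x & mask:
            digit += 1
        if tile_y & mask:
            digit += 2
        digits.append(str(digit))
    return "".join(digits)


if __name__ == "__main__":
    tx, ty = lat_lon_to_tile(0.0, 0.0, 1)
    print(tx, ty, tile_to_quadkey(tx, ty, 1))
```

Each quadkey digit identifies one quadrant at one zoom level, so the length of the quadkey equals the zoom level of the tile.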

The application I used in the last example to do this was Microsoft Mapcruncher. Mapcruncher has got a lot of benefits:

  • It’s free
  • It’s a Windows application that is easy to install and has no special dependencies
  • It’s got a nice GUI that’s very easy to use
  • It will perform geo-referencing/warping/tile-cutting for you as part of a single process

Mapcruncher is great if you have a relatively small raster image that you want to integrate into Bing Maps / Google Maps as part of a one-off process. However, there are many occasions when you might need a little more than that. As part of a current project, for example, I need to create a tileset that covers an area of approximately 60km x 50km (not that small), based on data held in SQL Server 2008 (not raster), that will be updated approximately every month (not one-off).

So I set out to evaluate alternatives, and the first I considered was Mapnik.

Installing Mapnik

Mapnik is an open-source mapping toolkit. It’s what OpenStreetMap uses to render the tiles for its base map imagery. They’ve got over 220 GB of XML data in the planet.osm file, covering the whole world, with thousands of updates every day, so if Mapnik can render that I’m pretty certain it should be able to cope with my modest requirements.

Mapnik tiles in Open Street Map

Installing and configuring Mapnik, unfortunately, turned out to be quite a challenge, and it took me a significant amount of time to get a working installation. In fact, had I realised quite how many steps would be involved, I’d have taken more care to document them. As it is, I’m going to try to note down in this post what I did while the steps are still relatively fresh in my mind.

What about using pre-compiled Windows binaries?

Mapnik, like many open source applications, is primarily targeted at a UNIX stack. That means that a large part of the documentation will refer to commands like sudo apt-get, which will look fairly alien to many Windows users. There are also a lot of dependencies on other packages (Python, libboost, libpng, etc.) which you may not be familiar with.

Now, fortunately, some kind people have taken the time to prepare pre-compiled Windows binaries of these various tools, and even packaged them together in a convenient download format.

For example, the OSGeo4W package contains Windows binary executables for Mapnik, along with the GDAL library (used for importing and converting various spatial data formats), QGIS (an open source desktop GIS application), and many other useful open source spatial goodies. The problem is that, with all those separate dependencies to keep track of, the package itself can become out of date quite quickly. The latest build of OSGeo4W, for example, is still based on Python 2.5.2-1. The current build of the 2.x branch of Python is 2.7.1, and even that represents the end-of-life release for the 2.x branch. The latest version of Python is actually 3.2. I didn’t really want to start a project on software that had already been deprecated.

Likewise, the excellent FWTools project, which bundles together many of the same packages as OSGeo4W, still comes with Python 2.3.4 and version 1.7 of the GDAL library. Crucially, for me, GDAL introduced support for SQL Server as a spatial data source only in version 1.8, so using FWTools wasn’t an option either.

What’s more, Mapnik itself seems to have forked into two versions: the current stable release is 0.7.1, but there are many comments about breaking changes in the new 2.x development version. The only precompiled Windows binaries I could find were of the 0.7 version and, again, I didn’t want to invest a lot of time setting up a project based on software that was about to go out of date.

So, precompiled binaries were, at this stage, a no-go.

Build-It-Yourself

I decided I was going to have to build my own installation, and here was my wish list:

  • Python 3.2
  • GDAL 1.8
  • Mapnik 2.x

Python was (thankfully) easy to install – there’s an installer package available from http://python.org/ftp/python/3.2/python-3.2.msi

GDAL was also not too bad – there are x86 and x64 packages (bundled with mapserver) that you can download from http://www.gisinternals.com/sdk/

Now onto Mapnik. And it is here, in retrospect, that I wish I’d started taking notes. You can download the source for Mapnik from here. However, before compiling Mapnik itself, you need to download and/or compile its required dependencies. These are: proj4, boost, zlib, freetype, icu, libxml2, libpng, libjpeg, and libtiff.

Now, you could download the source for each of these and build them separately but fortunately, once again, some kind soul has done much of this work for us. If you go to the gnuwin32 project on sourceforge, you’ll find links to download most of these libraries. For those packages not included in gnuwin32, ICU is available from here, and you can get the latest zlib from here.

Once you’ve got all the dependencies sorted out, it’s on to the configuration changes. If you follow the article at http://trac.mapnik.org/wiki/Python3k you’ll see a number of steps required to rebuild the Mapnik Python bindings to target Python 3.x. After some fiddling about with these and a bit of guesswork, I eventually managed to get everything built.

Testing it Out

Mapnik comes with a test script so, gingerly, I tried running it. Lots of errors – couldn’t find xxx etc. I realised this was because I hadn’t set the environment variables and paths correctly so, after sorting this out, I had another go. This time looked a lot more promising:

image

And, lo and behold, here was the (beautiful) example image generated:

demo_high

Over-confident in my newfound ability, I then tried altering the example rundemo.py script to point at a SQL Server datasource. Mapnik supports OGR datasources, so I first created a virtual layer that connected to my SQL Server instance. I’m not sure whether OGR can deal with SQL Server’s native binary geography/geometry format, so I use STAsText() to get the WKT representation and then specify WKT encoding in the GeometryField. For the purposes of testing, I decided to select a set of data from the OS VectorMap District settlement area data:

[php]

MSSQL:server=.\SQLEXPRESS;database=OSVectorMap;trusted_connection=yes
SELECT geom27700.STAsText() AS geomWKT FROM TG11_Settlement_Area
[/php]
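
The listing above shows only the connection string and the query of the virtual layer file. As a hedged sketch of what the complete file might look like, the following Python builds an OGR VRT document around those two values. The element names (OGRVRTDataSource, OGRVRTLayer, SrcDataSource, SrcSQL, GeometryField) come from the OGR virtual format as I understand it; the layer name AASQLlayer is taken from the Mapnik snippet later in this post, the backslash in the instance name is my assumption, and build_vrt is a hypothetical helper.

```python
import xml.etree.ElementTree as ET


def build_vrt(layer_name, connection, sql, geometry_field):
    """Return the text of a minimal OGR virtual layer definition."""
    root = ET.Element("OGRVRTDataSource")
    layer = ET.SubElement(root, "OGRVRTLayer", name=layer_name)
    ET.SubElement(layer, "SrcDataSource").text = connection
    ET.SubElement(layer, "SrcSQL").text = sql
    # Tell OGR that the geometry column holds WKT text, not native binary.
    ET.SubElement(layer, "GeometryField", encoding="WKT", field=geometry_field)
    return ET.tostring(root, encoding="unicode")


vrt_xml = build_vrt(
    layer_name="AASQLlayer",
    # Assumed instance name: the local SQL Server Express instance.
    connection=r"MSSQL:server=.\SQLEXPRESS;database=OSVectorMap;trusted_connection=yes",
    sql="SELECT geom27700.STAsText() AS geomWKT FROM TG11_Settlement_Area",
    geometry_field="geomWKT",
)
print(vrt_xml)
```

Saving the printed text as MSSQL.ovf would give a file with the name and layer used in the rest of this post.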

Testing the virtual layer with ogrinfo seemed to suggest that everything was working ok:

image

So then I modified the Python script to add the new layer. Note that OS VectorMap data is defined using the OSGB British National Grid coordinate system (EPSG:27700), and Mapnik expects the parameters for the srs property to use PROJ4 syntax, which you can get from http://spatialreference.org/ref/epsg/27700/proj4/:

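[php]
vectormap_lyr = mapnik.Layer('OS Vectormap')
# OSGB British National Grid (EPSG:27700), expressed in PROJ4 syntax
vectormap_lyr.srs = "+proj=tmerc +lat_0=49 +lon_0=-2 +k=0.9996012717 +x_0=400000 +y_0=-100000 +ellps=airy +datum=OSGB36 +units=m +no_defs"

# Read the layer through the OGR virtual layer file that points at SQL Server
vectormap_lyr.datasource = mapnik.Ogr(file='MSSQL.ovf', layer="AASQLlayer")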
[/php]

Unfortunately, here I hit another problem:

image

And it seems here my luck has run out, because I simply can’t work out how to get round this one. The error message is simply “Failed to open datasource”, yet ogrinfo confirms that the datasource is fine, and other GDAL/OGR components can read from it, so I don’t know if it’s because I built mapnik wrong or forgot to change/include a particular setting.

I did just notice that there’s a Google Summer of Code Project to improve Mapnik installation on Windows, and I’m really hoping it’s successful because the results generated by Mapnik are beautiful.

As a workaround, I used ogr2ogr to export my data from SQL Server into a shapefile called MSSQL_export.shp, and then used that as a datasource in Mapnik by changing the Python datasource to:

[php]
vectormap_lyr.datasource = mapnik.Shapefile(file='MSSQL_export')
[/php]
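
Since the data changes roughly every month, the export step of this workaround could be scripted. The sketch below only assembles an ogr2ogr command line; it assumes that the export reads from the same MSSQL.ovf virtual layer file and AASQLlayer layer used earlier (the post does not say exactly how the export was run), and build_export_command and run_export are hypothetical helper names. The -f and -a_srs options are standard ogr2ogr flags for the output format and the assigned output coordinate system.

```python
import subprocess


def build_export_command(vrt_path, layer_name, shapefile_path):
    """Assemble the ogr2ogr arguments for a VRT-to-shapefile export."""
    return [
        "ogr2ogr",
        "-f", "ESRI Shapefile",   # output format
        "-a_srs", "EPSG:27700",   # tag the output as British National Grid
        shapefile_path,           # destination comes first...
        vrt_path,                 # ...then the source datasource
        layer_name,               # and the layer to copy
    ]


def run_export(command):
    """Run the export; raises CalledProcessError if ogr2ogr fails."""
    subprocess.run(command, check=True)


command = build_export_command("MSSQL.ovf", "AASQLlayer", "MSSQL_export.shp")
print(" ".join(command))
```

Calling run_export(command) on a machine with GDAL on the path would perform the export; the sketch leaves that call out so that it can be read and tested without a SQL Server instance.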

Finally, after a bit of XML styling, I was able to get the following image (click for full size):

image

I’m actually really pleased with the image quality, but until I can get Mapnik to retrieve data directly from SQL Server there’s not much point proceeding with the tilecutting process – I can’t really justify an additional step of exporting from SQL Server to shapefile just to get Mapnik to load it.

If anyone has had any success getting Mapnik and SQL Server to play nicely together, please let me know!

OpenStreetMap (Ramm, Topf and Chilton)

OpenStreetMap: Using and Enhancing the Free Map of the World
by Frederik Ramm, Jochen Topf and Steve Chilton
UIT Cambridge, 2010. Paperback, 352 pp.
ISBN 978-1-906860-11-0

Last year saw the publication in English of two books about OpenStreetMap. This one, Frederik Ramm and Jochen Topf’s OpenStreetMap, saw three German editions before being translated into this English edition, which Steve Chilton assisted with.

This is a comprehensive manual on using OpenStreetMap and its data, covering everything from contributing user data to editing, to using and hacking OSM data on websites and in applications. In other words, it covers everything — though not necessarily in thorough detail, with lots of references to OSM wiki pages for more information.

Now I’ve always found the OSM wiki to be a bit overwhelming; I think that this book does a better job of getting people up to speed on using OSM than trying to navigate the wiki pages (which is how I got up to speed, and wished for something clearer). Those who spend a lot of time on OSM will do well to have this on their shelf.

I think OSM needs more contributors, at least in Canada, where edits I left unfinished months ago are unchanged when I get back to them. So I read this book with an eye as to whether it would help beginners contribute. The first two parts of the book do a very good job of introducing the mapping process — collecting tracks, editing map data — to beginners, or at least that’s my impression. I even learned a couple of new things, and I’m a little less trepidatious about using JOSM (all my edits to date have been with Potlatch).
 
But people who are only interested in uploading GPS tracks and editing the map, rather than using OSM data in mashups and applications, won’t need to read past page 160.

Things move fast in the tech world, and the book has already been overtaken in one regard: most of the examples use Potlatch 1, which has been replaced by Potlatch 2 as the default web editor; I had to work to remember how to use the old editor. Serves me right for taking so long to get to this review.