OGR2OGR Importing Spatial Data to SQL Server

A few months back, I posted an article explaining how to import spatial data into SQL Server 2008 from any format supported by the OGR library (including ESRI shapefiles, GML, and TIGER data), using OGR2OGR. That article was written using OGR2OGR from v1.7 of the GDAL 1.7 library, which doesn’t support SQL Server 2008 directly, so I instead used OGR2OGR to create a CSV file containing spatial data in Well-Known Text format and then parsed that data in SQL Server using the STGeomFromText() method.

The good news is that things have become a bit easier since then, and version 1.8 of the GDAL library now has a MSSQLSpatial driver that can interface directly with geometry and geography data in SQL Server 2008.

The bad news is that most of the places that offer pre-compiled GDAL binaries for Windows have yet to update to the new version. FWTools, for example, still comes packaged only with v1.7. Likewise, the osgeo download site at http://download.osgeo.org/gdal/win32/ also lists GDAL versions up to v1.7.

So, if you want to get hold of the latest GDAL to import directly into SQL Server you’ll have to build it yourself from source, which can be downloaded from http://download.osgeo.org/gdal/gdal180.zip

Fortunately, the source has been very considerately packaged, and includes solution files that will build GDAL out-of-the-box in VS2005, 2008, and 2010. Simply load the .sln file, click build, and wait a few minutes:

Building GDAL 1.8 in VS2010

Then, if you look in the output directory (warmerdabldbin, by default) you should see a lovely collection of utilities for working with spatial data – GDAL (for working with raster data), and OGR (for its vector sibling).

Here’s the output of calling ogr2ogr –formats, which retrieves the list of supported vector spatial formats – note the MSSQLSpatial format supported for both read/write:

image

Example usage to load a shapefile to SQL Server as follows:

ogr2ogr -overwrite -f MSSQLSpatial "MSSQL:server=.\MSSQLSERVER2008;database=spatial;trusted_connection=yes" "TG20.shp"

 

Bing Maps: How to Overlay Weather and Traffic Conditions

How to overlay traffic and weather information on Bing Maps.

Weather

You can get weather information from the U.S. National Oceanic and Atmospheric Administration (NOAA). The NOAA expose a number of WMS layers showing, for example, cloud coverage, real-time radar data, wind speed and sea levels across the U.S. and territories. (Sadly, I’m not aware of an equivalent data source for the rest of the world). You can find information on the various layers available from http://nowcoast.noaa.gov/help/mapservices.shtml?name=mapservices

The data is exposed as WMS layers, so start by following my previous posts explaining how to access and overlay WMS layers on the AJAX v7 or Silverlight control. Replace the URLTemplate in these examples with the URL of the NOAA WMS service, as follows:

string urlTemplate = “http://nowcoast.noaa.gov/wms/com.esri.wms.Esrimap/obs?service=wms&
version=1.1.1&request=GetMap&format=png&BBOX={0}&SRS=EPSG:4326&width=256&
height=256&transparent=true&Layers=RAS_RIDGE_NEXRAD”;

This example retrieves the RAS_RIDGE_NEXRAD layer, which is a RADAR mosaic for CONUS, Puerto Rico, Hawaii, Alaska, and Guam. When overlaid on Bing Maps, it looks like this (illustrating the weather currently affecting the Mississippi river area):

image

If you want to add several different weather layers from the NOAA and control them separately you can make several separate requests to the WMS service, changing the URL template each time to request the appropriate layer. Otherwise, you can make a single request that merges several types of information in one layer, by passing a comma-separated list in the LAYERS parameter. E.g. to retrieve a single layer that displays both the land surface temperature (OBS_MET_TEMP) and the sea surface temperature point observations (OBS_MAR_SSTF) in a single layer, you can use the following URL template:

string urlTemplate = “http://nowcoast.noaa.gov/wms/com.esri.wms.Esrimap/obs?service=wms&version=1.1.1&request=GetMap&format=png&BBOX={0}&SRS=EPSG:4326&width=256&height=256&transparent=true&Layers=OBS_MET_TEMP,OBS_MAR_SSTF”;

Traffic

Bing Maps v6.x contained an inbuilt option to display traffic using the VEMap.LoadTraffic method. This method does not exist in v7 or in the Silverlight control, but you can still access the same tileset as used by the v6.x control. The URL at which these tiles are located is:

http://t0.tiles.virtualearth.net/tiles/t{quadkey}.png

Note that, this time, these are tiles that have already been cut into the Bing Maps quadkey system, so you don’t need to add the intermediate WMS handler step as with the weather example above. Instead, you can directly add a tilesource pointing to the traffic tile data as follows:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html>
  <head>
   <title></title>
   <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
   <script type="text/javascript" src="http://ecn.dev.virtualearth.net/mapcontrol/mapcontrol.ashx?v=7.0"></script>
   <script type="text/javascript">
     var map = null;
     var tilelayer = null;
     function GetMap()
     {
      // Initialize the map
      map = new Microsoft.Maps.Map(document.getElementById("mapDiv"),{credentials:"Your Bing Maps Key",
 center:new Microsoft.Maps.Location(47.9,-122), zoom:9, mapTypeId:"r"});

      // Create the tile layer source
      var tileSource = new Microsoft.Maps.TileSource({ uriConstructor: 'http://t0.tiles.virtualearth.net/tiles/t{quadkey}.png' });

      // Construct the layer using the tile source
      tilelayer= new Microsoft.Maps.TileLayer({ mercator: tileSource, opacity: 1.0 });

      // Push the tile layer to the map
      map.entities.push(tilelayer);
     }
   </script>
  </head>
  <body onload="GetMap();">
   <div id='mapDiv' style="position:relative; width:640px; height:480px;"></div>
  </body>
</html>

image

Sadly, once again, this data is for the US only. Also note that there are a few clauses in the Bing Maps Terms of Use specifically governing the use of traffic data.

PHP and UTF-8 BOM

I recently wrote some PHP for the first time in ages, and noticed some of my pages were appearing on one development machine, in some browsers, preceded by the characters . These characters didn’t show up when editing the pages, and they didn’t show up at all when served from a different server or when viewed in some other browsers.

Initially, I thought that it was something to do with not having configured the correct character set in the response header (which is generally the main cause of garbled characters appearing in webpages), but, checking the response header it seemed ok – I was outputting UTF-8 as desired:

header('Content-type: text/html; charset=UTF-8') ;

And browsers viewing the page were correctly auto-detecting the character encoding as UTF-8:

 


 
image

Then I checked the configuration of the server, which was also set up with Unicode support correctly. And then I checked the encoding of the PHP scripts themselves, which were all encoded using Unicode UTF-8 – (Windows Codepage 65001). So far, everything seemed consistent, so where were those garbled characters coming from?

UTF-8 with or without signature – your choice. (Or not).

The reason, as I found out, was that one of my development environments (Visual Studio – from which I’d made the most recent edits to the affected pages) was configured to save UTF-8 encoded files with signature. Here’s the options for Unicode character encoding in Visual Studio, showing UTF-8 both with and without signature (notice that they’re both the same codepage – 65001):

image

There seems to be very little convention or standardisation as to the use of this “signature”. I hadn’t really come across this problem before because I generally use Eclipse for PHP development. The encoding options there are shown below:

image

Notice that, although there are several flavours of UTF-16 available in Eclipse, there is only version of UTF-8, which is equivalent to Visual Studio’s without signature.

Then again, here are the options in Windows Notepad (yes, I use that sometimes as well). As in Eclipse, there is only one choice of UTF-8, but this time the sole option available  provides the opposite behaviour – always saving UTF-8 with signature:

BOM BOM!

The optional “signature” in question is the Byte-Order Marker, or BOM. A byte-order marker is required for multibyte encoded data, including UTF-16, to indicate big-endianness or little-endianness – the order in which bytes are arranged. All of the save dialogs above give you the choice for specifying the byte order for Unicode UTF-16, since in a multibyte format the byte order matters. However, for UTF-8, which uses only a single byte for each character (that’s what the “8” stands for – 8 bits = 1 byte) a BOM is not required and doesn’t really make sense.

Even though UTF-8 always uses the same byte-order, a UTF-8 encoded file can begin with the bytes EF BB BF, which merely signifies that it is in UTF-8 format. It’s not really a BOM, hence why Visual Studio calls it a “signature”. The problem is that some clients don’t expect UTF-8 to have a BOM and, as it turns out, the PHP engine is one of them. At least,some builds of the PHP engine. One of my PHP servers, running on a linux machine, interpreted the UTF-8 file with signature fine, whereas another, running under Windows, tried to display the leading bytes as content on the page, which is how you end up with .

The combination of different default encoding behaviours across different editors combined with different server/browser behaviours when interpreting UTF-8 files with BOM means that this problem can be a little tricky to diagnose.

This is reported as a PHP bug at http://bugs.php.net/bug.php?id=22108, but the workarounds are actually quite straightforward (once you know what the problem is!):

  • If you’re using Visual Studio, make sure you save your PHP files as UTF-8 withoutsignature. If you’re using Eclipse, this is the default anyway.
  • Compile your PHP with the –enable-zend-multibyte option, which will correctly parse the BOM at the start of the file
  • If you don’t need unicode at all, you could use ISO-8566-1, or another non-UTF-8  encoding

How To Customize Mailing Labels

Adding a Custom Field to a Mailing Label

The first task was to modify the default fields displayed on the mailing label (name, address, postcode etc.) to include a custom field. This organisation delivers a lot of its local mail by hand rather than through the royal mail, so the custom field was used to assign addresses into a delivery walk, and this was to be printed on the label so it could be given to the appropriate deliverer.

Fortunately, CiviCRM uses a token replacement system, and every user-created custom field is assigned a unique token. The only problem in including the deliverer custom field on the mailing label  was finding out what the token corresponding to the appropriate field was.
 


 
I’m sure there must be a better way, but the quick solution that worked for me was to bring up the “Print PDF Letter for Contacts” action from a search result

On the next screen, click the “Insert Tokens” link at the top right to be presented with a modal dialog box listing all the names of all the available tokens in your system. Clicking on the name of a token will insert it into the PDF letter, allowing you to see the token behind it. In the screenshot below, for example, the Geo Code 1 field associated with each individual contact can be inserted into any mailing (or mailing label) using the {contact.geo_code_1} token.

Having identified the appropriate token for a custom field (in my case, it was{contact.custom_4} ), you can then add this to the mailing label template by going toAdminister >> Configure >> Global Settings >> Address Settings and editing the Mailing Label Format to include the relevant token, as below:

{contact.addressee}
{contact.street_address}
{contact.supplemental_address_1}
{contact.supplemental_address_2}
{contact.city}{, }{contact.state_province}{ }{contact.postal_code}
{contact.country}
{contact.custom_4}

Sorting the Mailing List

Having ensured that the custom field was printed onto each label, the next step was to sort the list of labels based on the value of that field. As explained before, the mailings were to be hand-delivered, and it would save a lot of time sorting through the envelopes if they could be printed ready-sorted into batches for the appropriate deliverer, as specified by the custom field. A quick search of the internet reveals that sorted mailings are also required in other circumstances as well, such as sorting by ZIP code when submitting batch mailings to the U.S. Postal Service. However, although reported as a major bug in CiviCRM, the ability to sort mailing lists in anything other than alphabetical surname order of addressee is currently not possible. So I had to hack something together.

A snoop around revealed that the bulk of the work in preparing the mailing labels occurred in the script located at \sites\all\modules\civicrm\CRM\Contact\Form\Task\Label.php. Within the CRM_Contact_Form_Task_Label class in this file, I created a new comparison function which would sort an array of the items to be included in the mailing list based on the custom_4 field, as follows:

function cmpCustomField($a, $b) {
  if ($a['custom_4'] == $b['custom_4']) {
    return 0;
  }
  return ($a['custom_4'] < $b['custom_4']) ? -1 : 1;
}

Through a bit of experimentation, I determined the best place to sort the records was after they had been run through the “Merge same address”/“Merge same household” scripts, but before being formatted for display. So I inserted a call to the PHP uasort function (sorts an associative array while keeping indexes) that used my comparison function at around line 330, as follows:

...
if ( isset( $fv['merge_same_household'] ) ) {
  $rows = $this->mergeSameHousehold( $rows );
  $individualFormat = true;
}

// INSERT CUSTOM SORT HERE
uasort ( $rows , array($this, 'cmpCustomField'));

// format the addresses according to CIVICRM_ADDRESS_FORMAT (CRM-1327)
require_once 'CRM/Utils/Address.php';
foreach ($rows as $id => $row) {
...

The mailing list labels were now produced in nicely ascending order based on the custom field.

Changing the Mailing Label Paper Template

CiviCRM supports a handful of common avery label formats out-of-the-box, but this organisation had recently purchased a bulk load of non-standard labels (A4, 24 labels per page, 8×3) which did not fit any of the supplied templates. They therefore couldn’t be used.

To create a new paper template, changes to two files must be made (both called “Label.php”, but in different directories).

Firstly, in the \sites\all\modules\civicrm\CRM\Utils\PDF\Label.php look for the $averyLabels array and add a new element to the array with the margins and other dimensions of your label sheet. Here’s the settings I used for my new paper (NX and NY are the number of columns and rows of labels on the sheet, width and height determined by dividing the overall dimensions of the page by the number of labels, and the margins determined mostly by guesswork):


$averyLabels = array (
  ...
  'Custom_3x8' => array(
    'name' => 'NGP_3x8',
    'paper-size' => 'A4',
    'metric' => 'mm',
    'lMargin' => 5,
    'tMargin' => 10,
    'NX' => 3,
    'NY' => 8,
    'SpaceX' => 0,
    'SpaceY' => 0,
    'width' => 70,
    'height' => 37.125,
    'font-size' => 9
  )
);

Having defined the settings for the Custom paper template, to actually get these settings to be selectable when producing mailing labels, you also need to edit the file at\sites\all\modules\civicrm\CRM\Contact\Form\Task\Label.php

Near the top is a buildQuickForm() function that contains an array of templates to populate a select list. Add the name of your new paper template to the end of the array, exactly as you defined it in the $averyLabels array above:

$label = array(
  "5160" => "5160",
  "5161" => "5161",
  "5162" => "5162",
  ...,
  "Custom_3x8" => "Custom_3x8"
);

Save the file, and your new template will magically become available for use:

A few tweaks…

Finally, there were just a few tweaks to be made. The new paper template should be always be selected by default for mailing labels (they’d bought reams of the stuff, so it was going to be used for all labels for quite a while yet), and also the “merge contacts with the same address” should always be selected. Adding these defaults was a simple matter of amending the $defaults array declared in the setDefaultValues() function near the top of the \sites\all\modules\civicrm\CRM\Contact\Form\Task\Label.php file.

function setDefaultValues()
{
  $defaults = array();
  $defaults['do_not_mail'] = 1;

  // Modified by AA
  $defaults['merge_same_address'] = 1;
  $defaults['label_id'] = "Custom_3x8";

  return $defaults;
}

I’m no CiviCRM expert, and I’ve certainly no idea if the approach I took to solving any of these problems represents “best practice” in any way, but it seems to have got the job done and met the immediate needs of my client, so hopefully it might be helpful to somebody else too!

Render with Mapnik

I explained a little about the process by which you can overlay raster data on Bing Maps, by geo-referencing the source data (if necessary), projecting into the appropriate spherical Mercator projection, then cutting the resulting image into 256px x 256px tiles named according to the quadkey tile numbering system.

The application I used in the last example to do this was Microsoft Mapcruncher. Mapcruncher has got a lot of benefits:

  • It’s free
  • It’s a windows application with no special dependencies and easy to install
  • It’s got a nice GUI that’s very easy to use
  • It will perform geo-referencing/warping/tile-cutting for you as part of a single process

Mapcruncher is great if you have a relatively small raster image that you want to integrate into Bing Maps / Google Maps as part of a one-off process. However, there are many occasions when you might need a little more than that. As part of a current project, for example, I need to create a tileset that covers an area of approximately 60km x 50km (not that small), based on data held in SQL Server 2008 (not raster), that will be updated approximately every month (not one-off).

So I set out to evaluate alternatives, and the first I considered was mapnik.

Installing Mapnik

Mapnik is an open-source mapping toolkit. It’s what openstreetmap uses to render the tiles used in their base map imagery. They’ve got over 220Gb of XML data in the planet.osm file, covering the whole world, with thousands of updates every day, so if mapnik can render that I’m pretty certain it should be able to cope with my modest requirements.

Mapnik tiles in Open Street Map

Installing and configuring mapnik, unfortunately, turned out to be quite a challenge, and has taken me a significant amount of time to get a working installation. In fact, had I realised quite how many steps would be involved, I’d probably have taken more care to document them carefully. As it is, I’m going to try to note down in this post what I did while they’re still relatively fresh in my mind.

What about using pre-compiled Windows binaries?

Mapnik, like many open source applications, is primarily targeted at a UNIX stack. That means that a large part of the documentation will refer to commands like sudo apt-get, which will look fairly alien to many Windows users. There’s also a lot of dependencies on other packages – python, libboost, libpng, etc. which you may not be familiar with.

Now, fortunately, some kind people have taken the time to prepare pre-compiled Windows binaries of these various tools, and even packaged them together in convenient download format.

For example, the OSGeo4W package contains windows binary executables for Mapnik, along with the GDAL library (used for importing, converting of various spatial data formats), QGIS (open source desktop GIS application), and many other useful open source spatial goodies. The problem is that, with trying to keep track of all those separate dependencies, the package itself can become out-of-date quite quickly. The latest build of OSGeo4W, for example, is still based on python 2.5.2-1. The current build of the 2.x branch of python is 2.7.1, and even that represents the end-of-life release for the 2.x branch. The latest version of python is actually 3.2. I didn’t really want to start a project on software that had already been deprecated.

Likewise, the excellent FWTools project, which also bundles together many of the same packages as OSGeo4W still comes bundled with Python 2.3.4 and version 1.7 of the GDAL library. Crucially, for me, GDAL introduced support for SQL Server as a spatial data source only in version 1.8, so using FWTools wasn’t an option either.

What’s more, Mapnik itself seems to have forked into two versions – the current stable release being 0.7.1, but with many comments being made about breaking changes in the new 2.x development version. The only precompiled windows binaries I could find were of the 0.7 version and, again, I didn’t want to invest a lot of time setting up a project based on software that was about to go out of date.

So, precompiled binaries was, at this stage, a no-go.

Build-It-Yourself

I decided I was going to have to build my own installation, and here was my wish list:

  • Python 3.2
  • GDAL 1.8
  • Mapnik 2.x

Python was (thankfully) easy to install – there’s an installer package available from http://python.org/ftp/python/3.2/python-3.2.msi

GDAL was also not too bad – there are x86 and x64 packages (bundled with mapserver) that you can download from http://www.gisinternals.com/sdk/

Now onto Mapnik. And it is here, with retrospect, that I wish I’d stated taking notes. You can download the source for mapnik from here. However, before compiling mapnik itself, you need to download and/or compile its required dependencies. These are: proj4, boost, zlib, freetype, icu, libxml2, libpng, libjpeg, and libtiff.

Now, you could download the source for each of these and build them separately but fortunately, once again, some kind soul has done much of this work for us. If you go to the gnuwin32 project on sourceforge, you’ll find links to download most of these libraries. For those packages not included in gbuwin32, ICU is available from here, and you can get the latest zlib from here.

Once you’ve got all the dependencies sorted out, it’s onto the configuration changes. If you follow the article at http://trac.mapnik.org/wiki/Python3k you’ll see a number of steps required to rebuild the python bindings with Mapnik to target Python 3.x. After some fiddling about with these and a bit of guesswork, I managed eventually to get everything built.

Testing it Out

Mapnik comes with a test script so, gingerly, I tried running it. Lots of errors – couldn’t find xxx etc. I realised this was because I hadn’t set the environment variables and paths correctly so, after sorting this out, I had another go. This time looked a lot more promising:

image

And, lo and behold, here was the (beautiful) example image generated:

demo_high

Over-confident of my new found ability, I then tried altering the example rundemo.py script to point at a SQL Server datasource. Mapnik supports OGR datasources, so I first created a virtual layer that connected to my SQL Server. For the purposes of testing, I decided to select a set of data from the OS VectorMap District settlement area data (note that I’m not sure if OGR can deal with SQL Server’s native binary geography/geometry format, so I use STAsText() to get the WKT representation and then specify WKT encoding in the GeometryField):


    MSSQL:server=.SQLEXPRESS;database=OSVectorMap;trusted_connection=yes
    SELECT geom27700.STAsText() AS geomWKT FROM TG11_Settlement_Area</pre>

Testing the virtual layer with ogrinfo seemed to suggest that everything was working ok:

image

So then I modifed the python script to add the new layer. Note that OS Vectormap data is defined using the OSGB British National Grid coordinate system (EPSG:27700), and mapnik expects the parameters for the srs property to use PROJ4 syntax, which you can get from http://spatialreference.org/ref/epsg/27700/proj4/:

vectormap_lyr = mapnik.Layer('OS Vectormap')
vectormap_lyr.srs = "+proj=tmerc +lat_0=49 +lon_0=-2 +k=0.9996012717 +x_0=400000 +y_0=-100000 +ellps=airy +datum=OSGB36 +units=m +no_defs"

vectormap_lyr.datasource = mapnik.Ogr(file='MSSQL.ovf',layer="AASQLlayer")</pre>

Unfortunately, here I hit another problem:

image

And it seems here my luck has run out, because I simply can’t work out how to get round this one. The error message is simply “Failed to open datasource”, yet ogrinfo confirms that the datasource is fine, and other GDAL/OGR components can read from it, so I don’t know if it’s because I built mapnik wrong or forgot to change/include a particular setting.

I did just notice that there’s a Google Summer of Code Project to improve Mapnik installation on Windows, and I’m really hoping it’s successful because the results generated by Mapnik are beautiful.

As a workaround, I actually used OGR2OGR to export my data from SQL Server into a shapefile called MSSQL_export.shp, and then used that as a datasource in Mapnik by changing the python datasource to:

<pre>vectormap_lyr.datasource = mapnik.Shapefile(file='MSSQL_export')</pre>

Finally, after a bit of XML styling, I was able to get the following image (click for full size):

image

I’m actually really pleased with the image quality, but until I can get Mapnik to retrieve data directly from SQL Server there’s not much point proceeding with the tilecutting process – I can’t really justify an additional step of exporting from SQL Server to shapefile just to get Mapnik to load it.

If anyone has had any success of getting Mapnik and SQL Server to play nicely together, please let me know!