How to make maps with Fusion Tables

Inspired by the Guardian Data Blog, I decided to explore Fusion Tables and Google Maps with Australian data. To start with, I selected a set of Socio-Economic Indexes for Areas, created from 2006 Census data, and postal area boundaries from the Australian Bureau of Statistics (ABS). Today I would like to share a few observations about creating maps with data from Fusion Tables.

Fusion Tables is not yet a fully featured thematic mapping, analysis and publishing application. However, with a little bit of effort, anyone can create informative maps which are visually attractive and fast to deploy. The best thing about Fusion Tables is that you don’t need to manage any complex infrastructure yourself and that the application is free (although with some limitations on data storage volumes, currently capped at 250MB per account).

Spatial Data

Since Fusion Tables supports spatial data only in KML format, you have to convert your dataset before uploading it to the server or, alternatively, find a publicly available table that has already been uploaded by someone else.

Google provides a tool to translate SHP data into KML format and import it directly into Fusion Tables, but it didn’t work for me with complex data. There are some free alternatives available (QGIS, for example, is really easy to use), but loading data in any format other than KML into Fusion Tables will always be a multi-step process.
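To give a feel for what the conversion has to produce, here is a minimal sketch of writing a single polygon as KML using only the Python standard library. It is not the Google tool or the QGIS workflow mentioned above; the function name `polygon_to_kml` and the sample coordinates are assumptions made up for illustration.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"


def polygon_to_kml(name, ring):
    """Return a KML document string holding one polygon Placemark.

    `ring` is a list of (longitude, latitude) pairs for the outer boundary.
    """
    ring = list(ring)
    # KML rings must be closed: repeat the first vertex at the end if needed.
    if ring[0] != ring[-1]:
        ring.append(ring[0])

    # Serialise with KML as the default namespace (no "ns0:" prefixes).
    ET.register_namespace("", KML_NS)
    kml = ET.Element(f"{{{KML_NS}}}kml")
    doc = ET.SubElement(kml, f"{{{KML_NS}}}Document")
    placemark = ET.SubElement(doc, f"{{{KML_NS}}}Placemark")
    ET.SubElement(placemark, f"{{{KML_NS}}}name").text = name
    polygon = ET.SubElement(placemark, f"{{{KML_NS}}}Polygon")
    outer = ET.SubElement(polygon, f"{{{KML_NS}}}outerBoundaryIs")
    linear_ring = ET.SubElement(outer, f"{{{KML_NS}}}LinearRing")
    coords = ET.SubElement(linear_ring, f"{{{KML_NS}}}coordinates")
    # KML expects "longitude,latitude" tuples separated by whitespace.
    coords.text = " ".join(f"{lon},{lat}" for lon, lat in ring)
    return ET.tostring(kml, encoding="unicode")


# Placeholder triangle, not a real postal area boundary.
example_kml = polygon_to_kml(
    "example area", [(151.0, -34.0), (151.5, -34.0), (151.5, -33.5)]
)
print(example_kml)
```

Real boundary files contain multi-part polygons and holes, which this sketch does not handle; that is where a dedicated converter earns its keep.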

If you decide to upload your own data, please note a couple of annoying limitations of Fusion Tables. Firstly, complex polygon structures are not supported (for example, I could not upload postcode number 0822 in Northern Territory at full resolution, yet it works perfectly with Google Maps). Secondly, some larger polygons, and/or polygons with many parts, get generalised automatically as you load them into Fusion Tables. Postal area 7255 in Tasmania is one example (compare the results below: the same KML file as imported to Fusion Tables, on the left, and as displayed directly in Google Maps; note the green outlines on all islands, even the smallest):


Table search functionality in Fusion Tables is rather crude, so it may not be easy to locate what you are looking for. On top of that, the concept of metadata is non-existent in Fusion Tables, so it is hard to know whether the data you find is appropriate for your purposes.

Numeric data

Uploading tabular numeric information in CSV format is very straightforward, but if you disallow the “Export” option up front, you will not be able to edit the data in Fusion Tables. My suggestion is to import the data as “Private” (the default option) and allow “Export”, then add new columns with formulas (if required), and disallow export only when you are ready to publish the data (if at all).

Table Operations

You can easily create a map based on data from numeric tables if those tables contain a “spatial reference” column, for example, postcode numbers (provided you can find an equivalent spatial data set in Fusion Tables). To combine numeric and spatial data tables you have to use the “Merge” function. My suggestion is to use the “smaller table” as a starting point. For example, to create a thematic map with postcodes for the Sydney area only, select the relevant numeric table first and then merge it with a table containing postal areas for the entire NSW. Only relevant boundaries will be included in the merged table (ie. the subset of NSW postcodes). If you do the operation in the reverse order, the merged table will contain all postcodes for NSW, but only a handful will have data that can be used in creating a thematic map.
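The effect of merge order can be illustrated with a small sketch in plain Python. It assumes that “Merge” behaves like a left join on the shared column, which is how the results described above look; the `merge` helper, the index values and the geometry strings are all made up for illustration.

```python
def merge(base_rows, other_rows, key):
    """Left-join sketch: keep every row of base_rows and attach the
    matching columns from other_rows when the key is found there."""
    lookup = {row[key]: row for row in other_rows}
    merged = []
    for row in base_rows:
        match = lookup.get(row[key], {})
        # Columns from the base table win if both tables share a name.
        merged.append({**match, **row})
    return merged


# Made-up sample rows: index values and geometry strings are placeholders.
sydney_stats = [
    {"postcode": "2000", "index": 1050},
    {"postcode": "2010", "index": 1075},
]
nsw_boundaries = [
    {"postcode": "2000", "geometry": "<Polygon>...</Polygon>"},
    {"postcode": "2010", "geometry": "<Polygon>...</Polygon>"},
    {"postcode": "2480", "geometry": "<Polygon>...</Polygon>"},
    {"postcode": "2650", "geometry": "<Polygon>...</Polygon>"},
]

small_first = merge(sydney_stats, nsw_boundaries, "postcode")
large_first = merge(nsw_boundaries, sydney_stats, "postcode")

print(len(small_first))                             # rows when starting from the numeric table
print(len(large_first))                             # rows when starting from the boundaries
print(sum("index" in row for row in large_first))   # rows that actually carry data
```

Starting from the numeric table yields only the rows you want to map, while starting from the boundaries keeps every NSW polygon, most of them without any value to style.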

When you “Create View” (ie. copy a table, your own or another user’s, to your account) or “Merge” tables with a spatial geometry column, you will lose map formatting parameters (eg. colour settings for polygon fills, etc.). This is very unfortunate, especially when you need to retain the colour scheme from the original table.

Styling Map

Handling “No data” fields is not easy in Fusion Tables. The problem is that polygons with “no value” in the table default to red fill when rendered on the map (as in the example below – there was no data for 2006 postcode in the merged numeric table). A workaround is to include some value in the table for the missing record (eg. traditional -9999) if you can. Then you can specify map settings to colour only that value, for example, as white and/or fully transparent.


Fully transparent overlays (eg. if fill opacity is set to 0%) are not clickable. This is a very handy feature for handling polygons with missing data in the numeric table (ie. no information window will be displayed when the polygon is clicked). However, when your objective is to present only the outlines of the polygons on the map but still display information about those polygons when they are clicked, you have to set the fill opacity to a value greater than 0.
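The -9999 workaround can be prepared before upload. The sketch below is one way to do it in plain Python, not anything Fusion Tables provides: `fill_no_data` replaces missing values with the sentinel, and `fill_style` mimics the styling rule of giving that value a white, fully transparent fill. The function names, bucket ranges and colours are assumptions for illustration.

```python
NO_DATA = -9999  # sentinel suggested in the text for missing records


def fill_no_data(rows, column):
    """Return copies of rows with missing or empty values replaced by NO_DATA."""
    filled = []
    for row in rows:
        value = row.get(column)
        if value is None or value == "":
            value = NO_DATA
        filled.append({**row, column: value})
    return filled


def fill_style(value, buckets):
    """Pick (colour, opacity) for a value from (lower, upper, colour) buckets.

    Opacity 0.0 means fully transparent; the sentinel and any value outside
    the buckets get a white, fully transparent fill.
    """
    if value != NO_DATA:
        for lower, upper, colour in buckets:
            if lower <= value < upper:
                return (colour, 0.5)
    return ("#FFFFFF", 0.0)


# Hypothetical rows and colour buckets.
rows = fill_no_data(
    [{"area": "A", "index": 980}, {"area": "B", "index": ""}, {"area": "C"}],
    "index",
)
buckets = [(0, 1000, "#FEE0D2"), (1000, 2000, "#DE2D26")]
for row in rows:
    print(row["area"], row["index"], fill_style(row["index"], buckets))
```

Keeping the sentinel well outside the real value range means a single styling rule can catch it without affecting genuine data.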

Tutorials

If you are eager to start playing with Fusion Tables, Google has produced an easy-to-follow tutorial on how to create thematic maps (note: if you are working with your own data, choose the “Map” option and not “Intensity Map” in the relevant step).

Beautiful & interactive maps faster with the Fusion Tables API

Google Fusion Tables is a modern data management and publishing web application that makes it easy to host, manage, collaborate on, visualize, and publish data tables online. Since we first launched Fusion Tables almost two years ago, we’ve seen tremendous interest and usage from dozens of areas, from journalists to scientists to open-data entrepreneurs, and have been excited to see the innovative applications that our users have been able to rapidly build and publish.

We’ve been working hard to enrich what Fusion Tables offers for customization and control of visual presentation. This past fall we added the ability to style the colors and icons of mapped data with a few clicks in the Fusion Tables web app. This spring we made it easy to use HTML and customize what users see in the info window that appears after a click on the map. We’ve enjoyed seeing the impressive visualizations you have created. Some, like the Guardian’s map of deprivation in the UK, were created strictly within the web app, while apps like the Bay Citizen’s Bike Accident tracker and the Texas Tribune’s Census 2010 interactive map take advantage of the Fusion Tables SQL API to do even more customization.


Of course, it’s not always convenient to do everything through a web interface, and today we’re delighted to invite trusted testers to try out the new Fusion Tables Styling and Info Window API. Now developers will be able to set a table’s map colors and info windows with code.

Even better, this new Styling and Info Window API will be part of the Google APIs Console. The Google APIs Console helps you manage projects and teams, provision access quotas, and view analytics and metrics on your API usage. It also offers sample code that supports the OAuth 2.0 client key management flow you need to build secure apps for your users.

So if you’ve been looking for a way to programmatically create highly-customizable map visualizations from data tables, check out our new APIs and let us know what you think! To become a trusted tester, please apply to join the Google Group and tell us a little bit about how you use the Fusion Tables API.

Australian Postcodes User Guide

There is a significant level of interest in postcodes as a convenient reference to locations because of the perceived ease of linking them to information about individuals and businesses alike. Over the years postcodes have been put to a wide range of uses in analysing and publishing social trends and population statistics as well as in defining sales, service, franchise or dealership areas. Unfortunately, a misunderstanding of what a postcode really is, resulting from a widely held belief in its value as a uniform referencing system, can cause a lot of trouble for unwary users. This article is a guide for all potential users of postcode boundary data.

Postcode Basics

Firstly, some facts about postcodes, from Australia Post site:

  • Postcodes were introduced in 1967 to facilitate the efficient processing and delivery of mail to customers.
  • Postcodes are only allocated to localities officially gazetted by State land agencies (usually, a postcode covers an area comprising more than one locality).
  • The decision as to whether a new postcode or an existing postcode is to be allocated to a locality is based on operational efficiency.
  • Because the adoption of new or changed postcodes by customers is slow, changes are only made where significant reasons for change are established. A postcode change will only be considered if such a change leads to either enhanced service to Australia Post customers or operational efficiency to the organisation. Any such change will involve consultation with the local council/shire and residents.

Please note, the above holds true most of the time… but there are exceptions. It is also important to note that there are 3 types of postcodes: delivery areas, post office boxes and large volume receivers. Only delivery areas have a meaningful reference to locations “on the ground”.

Sources of Postcode Information

Australia Post publishes a list of all postcodes from its database as a comma-delimited text file. The list is updated every month and can be downloaded for free from the Australia Post website.

The Australian Bureau of Statistics publishes, at 5-year intervals, a set of Postal Area boundaries that are compiled using outlines of Census Collection Districts. They approximate official Australia Post postcode coverage areas at the time of publishing. These boundaries are available for free download in a range of popular GIS data formats. The next update of the data will be released in December 2010.

A number of private companies also produce and regularly update their own versions of postcode boundaries, which are available for purchase. The two major suppliers are MapData Sciences (currently ESRI Australia) and Pitney Bowes (formerly MapInfo Australia). There are also a number of smaller operators that may be a source of free or inexpensive information on postcodes, such as aus-emaps.com, which generalises and converts ABS postcode boundaries to KML format for use with Google Maps and Google Earth and supplies large format static maps in PDF format for printing. Other small suppliers with a variety of postcode related products and maps include: mapmakers.com.au, findmap.com.au, cartodraft.com.au and ausmaps.com.

Common Problems with Postcodes

1. Changing Postcodes
Postcodes change over time due to the evolving operational requirements of Australia Post. Changes include additions of new postcode numbers and deletions of old ones from the list, as well as adjustments to the composition of postcodes by adding or removing localities. This is especially the case with new, dynamically growing areas as well as some rural locations, and is less of an issue for established metropolitan areas.

It means that postcodes are not a stable spatial reference. It is ok to use them as a snapshot of a particular point in time, but what often happens is that the attribution to “what area constituted that postcode X years ago” is lost from the supporting documentation and important facts can be misinterpreted by future users of the information.

This is a real problem for researchers of social trends – those who insist on using postcodes as the main location reference. As well, it may cause some legal headaches if postcodes are referenced in contracts for supply of services or franchise areas, etc. Postcodes were never meant to be used in this fashion!

2. Changing definitions of localities
On top of changes that are undertaken from time to time by Australia Post, there are also changes to boundaries defining localities, which are implemented by State and local authorities. What was locality X in 2007 may now be split into locality X and Y. As a result, it is very difficult to maintain a timely and consistent reference of postcode numbers to “what is actually on the ground”.

3. Imperfect procedures of referencing postcodes to localities
Where possible, Australia Post references postcodes to officially gazetted localities but localities are determined by State land agencies and boundaries are recommended by local councils. This process is not coordinated from end to end and sometimes it gets out of sync. Take for example postcode 3478 in Victoria. Australia Post lists Medlyn as a locality included in this postcode (June 2010 edition) yet this locality is not on Victoria’s register of gazetted locations. Referencing postcode numbers to localities is not a science and there can be inconsistencies.
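An inconsistency like the Medlyn case can be detected by comparing the two lists. The following is a minimal sketch, assuming you have already loaded the Australia Post locality list and a State register of gazetted localities into Python; the function name and the placeholder locality are hypothetical, and only the 3478/Medlyn pair comes from the example above.

```python
def ungazetted_localities(postcode_localities, gazetted):
    """Map each postcode to its listed localities missing from the register.

    Names are compared case-insensitively, since source files often differ
    in capitalisation.
    """
    gazetted_upper = {name.upper() for name in gazetted}
    problems = {}
    for postcode, localities in postcode_localities.items():
        missing = sorted(
            name for name in localities if name.upper() not in gazetted_upper
        )
        if missing:
            problems[postcode] = missing
    return problems


# "Placeholder Locality" is invented; a real check would use the full lists.
australia_post = {"3478": ["Medlyn", "Placeholder Locality"]}
state_register = {"PLACEHOLDER LOCALITY"}

print(ungazetted_localities(australia_post, state_register))
```

A check like this only flags name mismatches; it says nothing about whether the boundaries of matching localities agree.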

Recommendations

If you must use postcodes, please consider the limitations outlined earlier as well as the following recommendations to avoid potential problems:

If you intend to match postcodes to official ABS statistics:

  • Your only choice is the ABS version of postcodes, as it will ensure consistency of definitions (that is, postcode X in the data table will correspond to postcode X depicted as an outline on the map). It is particularly relevant for Census of Population and Housing data.
  • If you need to combine those statistics with your own data (eg. client records), geocode individual addresses and then reference them to specific postcode boundaries (eg. using GIS software with “intersect” function capabilities) rather than relying only on the postcode component of the address to match the records to boundaries. It is the only way to ensure a particular address/location is part of that specific postcode area.
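The “intersect” step boils down to a point-in-polygon test on the geocoded coordinates. Here is a minimal sketch using the standard ray-casting technique in plain Python; it stands in for what GIS software does and is not any particular product's implementation. The function names and the two square “boundaries” are made up, and real postal areas would also need multi-part polygons and holes handled.

```python
def point_in_polygon(lon, lat, ring):
    """Ray casting: count how many polygon edges a ray heading east crosses.

    `ring` is a list of (longitude, latitude) vertices; an odd number of
    crossings means the point is inside.
    """
    inside = False
    j = len(ring) - 1
    for i in range(len(ring)):
        xi, yi = ring[i]
        xj, yj = ring[j]
        # Only edges that straddle the point's latitude can be crossed.
        if (yi > lat) != (yj > lat):
            x_cross = xi + (lat - yi) * (xj - xi) / (yj - yi)
            if lon < x_cross:
                inside = not inside
        j = i
    return inside


def assign_postcode(lon, lat, boundaries):
    """Return the first boundary containing the point, or None."""
    for postcode, ring in boundaries.items():
        if point_in_polygon(lon, lat, ring):
            return postcode
    return None


# Two adjacent unit squares standing in for postal area outlines.
boundaries = {
    "area A": [(0, 0), (1, 0), (1, 1), (0, 1)],
    "area B": [(1, 0), (2, 0), (2, 1), (1, 1)],
}
print(assign_postcode(0.5, 0.5, boundaries))
print(assign_postcode(1.5, 0.5, boundaries))
print(assign_postcode(3.0, 3.0, boundaries))
```

Comparing the result with the postcode written in the address record is a quick way to find records where the two disagree.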

If you intend to use postcode outlines to define custom areas:

  • Again, the ABS version of postcodes is the most cost effective option as it is a free dataset.
  • Define your custom areas once and put effort into maintaining that dataset over time. You can adjust the composition of custom areas if required (eg. add/subtract postcodes or even adjust boundaries, but only if topological consistency can be maintained, that is, if changes to the boundary of one polygon can be reflected in the adjoining polygons).
  • It is important to acknowledge that this dataset becomes de facto your own version and that compatibility with “source” postcodes and/or statistics published on a postal area basis may be lost over time.
  • Always reference the version of postcodes used in any legal documents to avoid future ambiguity as to what constituted “that” postcode at “this” particular point in time.
  • As in the previous case, if you need to reference those postcode outlines to your own data, run geocoding and then reference individual records to specific boundaries; do not rely on postcode details in the address record alone to match data with boundaries.

If you are relying on postcode boundaries from commercial operators:

  • There is really no point in aiming to always have “the latest” version of boundaries representing postcodes. After all, these are not compatible with ABS statistics (unless the company can assure they reprocess those stats “somehow” to a new representation of boundaries) and besides, what is the benefit of constantly having to reprocess your own data to accurately reference it to the ever changing representation of postcode boundaries? The only exception would be if the company supplies some other unique data that is available exclusively with their proprietary version of boundaries.
  • Although companies claim to have “the latest”, these data are rarely updated on a continuous basis (ie. every month) but rather at 3 or 6 monthly intervals, so you are still getting a “dated” product.
  • Don’t assume you will be able to reference your address records to “the latest boundaries” using only postcode number unless your address details and postcode boundaries refer to the same time period. In most cases they don’t and you cannot avoid geocoding and then running GIS “intersect” processing of data to ensure reliability of information.

In conclusion, although postcodes appear to be well recognised spatial units for referencing locations, the complexity associated with accurate delineation of postal boundaries greatly diminishes their usefulness. If you can, avoid using postcodes! If you can’t, be aware of all the limitations, especially when drawing conclusions with far reaching consequences.