Deleting Blank Tiles


Creating raster tilesets almost invariably leads to the creation of some blank tiles – covering those areas of space where no features were present in the underlying dataset. Depending on the image format you use for your tiles, and the method you used to create them, those “blank” tiles may be pure white, or some other solid colour, or they may have an alpha channel set to be fully transparent.

Here’s an example of a directory of tiles I just created. In this particular z/x directory, more than half the tiles are blank. Windows Explorer shows them as black, but that’s because it doesn’t process the alpha channel correctly. They are actually all 256px x 256px PNG images, filled with ARGB (0, 0, 0, 0):

[Screenshot: Windows Explorer listing of a z/x tile directory, in which more than half the tiles are blank]

What to do with these tiles? Well, there are two schools of thought:

  • The first is that they should be retained. They are, after all, valid image files that can be retrieved and overlaid on the map. Although they aren’t visually perceptible, the very presence of the file demonstrates that the dataset was tested at this location and confirms that no features exist there. This is important metadata about the dataset in its own right, and confirms that the tile generation process ran to completion. The blank images themselves are small, so storage is not generally an issue.
  • The opposing school of thought is that they should be deleted. It makes no sense to keep multiple copies of exactly the same blank tile. If a request is received for a tile that is not present in the dataset, the assumption can be made that it contains no data, and a single, generic blank tile can be returned in all such instances – there is no benefit in returning the specific blank tile associated with that tile request. This not only reduces disk space on the tile server itself, but means the client need only cache a single blank tile that can be re-used in all cases where no data is present.

I can see arguments in favour of both sides. But, for my current project, disk and cache space is at a premium, so I decided I wanted to delete any blank tiles from my dataset. To determine which files were blank, I initially thought of testing the filesize of the image. However, even though I knew that every tile was of a fixed dimension (256px x 256px), an empty tile can still vary in filesize according to the compression algorithm used. Then I thought I could loop through each pixel in the image and use GetPixel() to retrieve the data to see whether the entire image was the same colour, but it turns out that GetPixel() is slooooowwwww….

The best solution I’ve found is to use Bitmap.LockBits to get direct access to the pixel byte data of the image, and then read and compare the byte values directly via an unsafe pointer. In my case, my image tiles are 32bit PNG files, which use 4 bytes per pixel (BGRA), and my “blank” tiles are completely transparent (alpha channel = 0). Therefore, in my case I used the following function, which returns true if all the pixels in the image are completely transparent, or false otherwise:

public static Boolean IsEmpty(string imageFileName)
{
  using (Bitmap b = ReadFileAsImage(imageFileName)) // helper that loads the image file into a Bitmap
  {
    // Lock the bitmap as 32bpp ARGB so we can safely assume 4 bytes per pixel
    System.Drawing.Imaging.BitmapData bmData = b.LockBits(new Rectangle(0, 0, b.Width, b.Height), System.Drawing.Imaging.ImageLockMode.ReadOnly, System.Drawing.Imaging.PixelFormat.Format32bppArgb);
    try
    {
      unsafe
      {
        int PixelSize = 4; // 4 bytes per pixel, stored in BGRA order
        for (int y = 0; y < b.Height; y++)
        {
          byte* p = (byte*)bmData.Scan0 + (y * bmData.Stride);
          for (int x = 0; x < b.Width; x++)
          {
            byte blue = p[x * PixelSize];      // Blue value. Just in case needed later
            byte green = p[x * PixelSize + 1]; // Green. Ditto
            byte red = p[x * PixelSize + 2];   // Red. Ditto
            byte alpha = p[x * PixelSize + 3];
            if (alpha > 0) return false;       // Any non-transparent pixel means the tile isn't blank
          }
        }
      }
    }
    finally
    {
      b.UnlockBits(bmData);
    }
  }

  return true;
}


It needs to be compiled with the /unsafe compiler option (which is what makes this approach a little dangerous!). Then, I just walked through the directory structure of my tile images, passing each file to this function and deleting those for which IsEmpty() returned true, along the lines of the sketch below.
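For reference, here’s a minimal sketch of what that clean-up pass might look like (DeleteBlankTiles is just an illustrative name of my own; it assumes the tiles are .png files stored somewhere beneath a single root folder, requires using System.IO;, and, since it permanently deletes files, is best tried on a copy of your data first):

public static void DeleteBlankTiles(string tileRoot)
{
  // Recurse through every PNG tile beneath the root folder...
  foreach (string file in Directory.GetFiles(tileRoot, "*.png", SearchOption.AllDirectories))
  {
    // ...and delete any tile that is completely transparent
    if (IsEmpty(file)) File.Delete(file);
  }
}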


The DEM Shaded Overlays


The default Bing Maps road style uses a “hillshade” effect to give an impression of underlying terrain. It’s a relatively subtle, but surprisingly powerful technique to enhance the appearance of map layers, as demonstrated by comparing the following two tiles:

[Image: map tile without hillshading]
[Image: map tile with hillshading]

In this post, I’ll describe how to create your own hillshade overlay from digital elevation model (DEM) data, using the GDAL toolset.

By creating the overlay as a set of semi-transparent tiles, rather than pre-rendering the effect into the tiles as shown above, you can place them on top of any Bing Maps/Google Maps et al. tile layer to represent the underlying terrain.

The process I’ve followed is based on the work of others, most notably PerryGeo, and you can find other guides on the internet that achieve the same effect. However, I found some of the existing guides on the subject to be either out of date or to require knowledge of Linux bash commands etc., so I hope that some of you will find this new step-by-step guide helpful.

1.) Acquire a DEM terrain model

To start with, you’re going to need some source data about the underlying terrain of the earth from which to calculate your hillshade. There are lots of places to acquire this data from; perhaps the easiest to use (assuming you’ve got Google Earth installed) is to open the KMZ file available from http://www.ambiotek.com/topoview. This uses Google Earth as a graphical interface to v4.1 of the elevation dataset gathered by the Shuttle Radar Topography Mission (SRTM), from which you can click to download individual DEM tiles covering 5° x 5° blocks.

Alternatively, you can access these files directly from the KCL server (my former university, incidentally) at http://srtm.geog.kcl.ac.uk/portal/srtm41/srtm_data_geotiff/

The data is provided in GeoTIFF format. You can load one of these tiles in any graphics program that can read TIFF files, but it won’t look very interesting yet. The elevation values are stored as 16-bit integers, with the georeferencing held in additional metadata, and normal graphics programs don’t know how to scale or interpret either of these for display, so you’ll probably just get an image like this (this is srtm_36_02.tif):

Black parts show the presence of data in the underlying file, which we’ll subsequently process using GDAL tools to create shaded images.

2.) Reproject to Spherical Mercator

Most DEM data sources, including the SRTM data I linked to above, are provided in the Plate Carrée (equirectangular) projection – i.e. WGS84 longitude coordinates are mapped directly to the x axis of the image, while latitude is mapped directly to the y axis. Before we create tiles from this data suitable for overlay on Bing Maps, Google Maps, et al., we therefore need to transform it into the Spherical Mercator projection. You can do this using gdalwarp, as follows:

gdalwarp -dstnodata 0 -tr 305.7481 305.7481 -multi -co "TILED=YES" -t_srs EPSG:3857 srtm_36_02.tif srtm_36_02_warped.tif

The full list of parameters accepted by gdalwarp is given in the GDAL documentation, but the options I set are as follows:

  • dstnodata states what value to use to represent nodata values (the equivalent of null in a SQL database, for example). I’ve set a value of 0 (i.e. black).
  • tr gives the target resolution in the x and y dimensions. The SRTM data I’m using was recorded at 90m resolution, so you might think that this should be set to 90 90. However, I’m going to be using this data for display on Bing Maps at different zoom levels, which will necessarily involve resampling the image. Therefore, you should set this value to the resolution (in metres/pixel) of the maximum zoom level on which you plan to overlay your data. (Remember that the maximum zoom level will have the smallest resolution.) You can obtain this value from my Bing Maps Ready Reckoner, or calculate it directly as shown in the note after this list. In the case above, I’m planning on overlaying my data on Zoom Level 9 and above, so I set a value of 305.7481 (in both dimensions). If I’d wanted to go to Zoom Level 10, I would have decreased this to 152.87 instead.
  • multi allows parallel processing
  • co “TILED=YES” is a format-specific option that states that the output TIFF file should be tiled rather than stripped (see http://www.fileformat.info/format/tiff/egff.htm for an explanation of the difference)
  • t_srs gives the destination spatial reference system into which the image should be reprojected. In this case, EPSG:3857, as used by Bing Maps, Google Maps etc.
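Incidentally, if you don’t have the ready reckoner to hand, the resolution figures quoted above can be derived from the standard Web Mercator ground resolution formula for 256px tiles (measured at the equator):

resolution (metres/pixel) = 2 × π × 6378137 / (256 × 2^zoomlevel) ≈ 156543.034 / 2^zoomlevel

So Zoom Level 9 gives 156543.034 / 512 ≈ 305.748 metres/pixel, and Zoom Level 10 gives 156543.034 / 1024 ≈ 152.874 metres/pixel, matching the values used above.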

The resulting image, srtm_36_02_warped.tif, will still be a GeoTIFF file, but will now be reprojected into the Spherical Mercator projection; the height and width of the output image will depend on the target resolution you specified in the tr parameter.

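If you want to sanity-check the warp before moving on, gdalinfo (part of the same GDAL toolset) will report the coordinate system and pixel size of the output file. The exact wording of the output varies between GDAL versions, but you should see a Mercator-based PROJCS and a pixel size matching the tr value you specified:

gdalinfo srtm_36_02_warped.tif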

3.) Convert from DEM to Hillshade

The warped GeoTIFF file has height data encoded in it, but we want to translate that information into a visible shaded effect, and for this we can use gdaldem.

The GDAL tools actually provide several other interesting functions for working with DEM data, including the ability to derive contour lines (gdal_contour) and colour relief maps (gdaldem color-relief). Maybe I’ll write about these another time, but for this example we want gdaldem’s hillshade mode. You can shade the warped image created in the previous step as follows:

gdaldem hillshade srtm_36_02_warped.tif srtm_36_02_warped_hillshade.tif -z 2 -co "TFW=YES"

This time, I’m only supplying two additional parameters:

  • z is a vertical exaggeration factor applied to the elevation values before the hillshade is generated, which accentuates the hills and increases the contrast of the image. I provided a value of 2 just to enhance the effect a bit, but you might decide you don’t need this.
  • co “TFW=YES” specifies that the output image should be created with an accompanying “world file”. This is a simple ASCII text file that provides additional information about the geographic extents of the created image, which we’ll need in a later step to line the hillshade image up with the Bing Maps tiling system. You can look up more information about world files on Wikipedia.

There are additional parameters that allow you to specify the direction and the angle of the light source from which the simulated shadows will be created.
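For example, the following variation (just an illustration, not part of my original workflow) keeps gdaldem’s default light direction of 315° but lowers the light from the default altitude of 45° down to 30°, which lengthens the simulated shadows:

gdaldem hillshade srtm_36_02_warped.tif srtm_36_02_warped_hillshade.tif -z 2 -az 315 -alt 30 -co "TFW=YES"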

Either way, the result will be another TIFF file, in which the background is black and the elevation data from the DEM has been converted into shades of grey.


At this stage, you could stop if you wanted to, and simply create a tile layer straight from the hillshaded image.


That would make the landscape of North Wales look a bit like the Moon, I think…

To make the data slightly more usable, we need to carry on with a few more tweaks.

4.) Making a Semi-Transparent Overlay

Currently, our hillshade image is opaque, with the shadows cast by terrain represented by variations in the brightness of the colour used. To make this into a re-usable overlay that can be placed on top of other layers, we need to make the image semi-transparent, with shadows cast by terrain represented by variations in opacity instead.

There are several ways of modifying the image data to achieve this effect. You could do it in Photoshop or another graphics program, for example, or using the graphics libraries in C# or PHP. Since I’m currently trying to learn Python, and GDAL is quite closely linked with Python, I’ll try to do it using the Python Imaging Library instead.

The following Python script makes a number of tweaks to the image above. Firstly, it converts it to a pure greyscale image (while the image above looks greyscale, it’s actually using a colour palette). It then inverts the image, turning it into a negative image. The reason for the inversion is that we then copy the (single) channel of the greyscale image into the opacity channel of a new RGBA image – areas that were very light in the source want to have very low opacity in the transparent image, and vice-versa, so the channel needs to be inverted.

Finally, we scan through the data to find instances of pixels that are pure black (RGBA value 0, 0, 0, 255) – this was the nodata value we set in step 2 – and replace them with pure transparent pixels (0, 0, 0, 0). The alpha channel of any other pixel is also lightened slightly – I chose a value of 74 somewhat arbitrarily because I thought the resulting image looked good – you can choose whatever value you want, or none at all.
from PIL import Image as PImage
from PIL import ImageOps

# Load the source file
src = PImage.open("srtm_36_02_warped_hillshade.tif")

# Convert to single channel
grey = ImageOps.grayscale(src)

# Make negative image
neg = ImageOps.invert(grey)

# Split channels
bands = neg.split()

# Create a new (black) image
black = PImage.new('RGBA', src.size)

# Copy inverted source into alpha channel of black image
black.putalpha(bands[0])

# Return a pixel access object that can be used to read and modify pixels
pixdata = black.load()

# Loop through image data
for y in xrange(black.size[1]):
    for x in xrange(black.size[0]):
        # Replace black pixels with pure transparent
        if pixdata[x, y] == (0, 0, 0, 255):
            pixdata[x, y] = (0, 0, 0, 0)
        # Lighten other pixels slightly (don't let the alpha value go below zero)
        else:
            a = pixdata[x, y]
            pixdata[x, y] = a[:-1] + (max(a[-1] - 74, 0),)

# Save as PNG
black.save("srtm_36_02_warped_hillshade_alpha.png", "png")
(Much of the logic in this script came from here). The resulting image will be a PNG file, in which darker shadows are represented by increasingly opaque black parts, while lighter shadows are more transparent:


SQL Azure – Moving your data


Moving your data to the (public) cloud necessarily involves relinquishing some control over the setup and maintenance of the environment in which your data is hosted. Cloud-based hosting services such as Microsoft Azure are effectively just scalable shared hosting providers. Since parts of the server configuration are shared with other customers and, to make the service scalable, all instances are based on a standard template, there are many system settings that your cloud provider won’t allow you to change on an individual basis.

For me, this is generally great. I’m not a DBA or SysAdmin and I have no interest in maintaining an OS, tweaking server configuration settings, installing updates, or applying hotfixes. The thought of delegating to Microsoft the tasks that keep my server well-oiled and up-to-date is very appealing.

However, this also has its own down-sides. One advantage of maintaining my own server is that, even though it might not be up-to-date or have the latest service packs applied, I know nobody else has tweaked it either. That means that, unless I’ve accidentally cocked something else up or sneezed on the delete key or something, a database-driven application that connects to my own hosted database should stay working day after day. When an upgrade is available I can choose when to apply it, and test to ensure that my applications work correctly following the upgrade according to my own plan.

Not so with SQL Azure.

Two examples of breaking changes I’ve recently experienced with SQL Azure, both seemingly as a result of changes rolled out since the July Service Release:

Firstly, if you use SQL Server Management Studio to connect to and manage your SQL Azure databases, you need to upgrade SSMS to at least version 10.50.1777.0 in order to connect to an upgraded SQL Azure datacentre. The same change also broke any applications that rely on SQL Server Management Objects (including, for example, the SQL Azure Migration Wizard, resulting in the error described here). The solution to both these issues is thankfully relatively simple once diagnosed – run Windows Update and install the optional SQL Server 2008 R2 SP1 service pack.

A more subtle change is that the behaviour of the SQL Azure database engine itself has changed, making it more comparable to the forthcoming “Denali” release of SQL Server than to SQL Server 2008 R2. Whereas, normally, upgrading SQL Server wouldn’t be a breaking change for most code (unless, of course, you were relying on a deprecated feature that was removed), the increase in spatial precision from 27 bits to 48 bits in SQL Denali means that you can actually get different results from the same spatial query. Consider the following simple query:

DECLARE @line1 geometry = 'LINESTRING(0 11, 430 310)';
DECLARE @line2 geometry = 'LINESTRING(0 500, 650 0)';

SELECT @line1.STIntersection(@line2).ToString();

Previously, if you’d run this query in SQL Azure you’d have got the same result as in SQL Server 2008/R2, which is POINT (333.88420666910952 243.16599486991572).

But then, overnight, SQL Azure is upgraded and running the same query now gives you this instead: POINT (333.88420666911088 243.16599486991646), which is consistent with the result from SQL Denali CTP3.

Not much of a difference, you might think… but think about what this means for any spatial queries that rely on exact comparison between points. How about this example using the same two geometry instances:

SELECT @line1.STIntersection(@line2).STIntersects(@line1);

SQL Azure query run in July 2011: 0. Same SQL Azure query run in August 2011: 1. Considering STIntersects() returns a Boolean, you can’t really get much more different than 1 and 0….

So, a cautionary tale: although SQL Azure hosting may hand the responsibility for actually performing any DB upgrades over to Microsoft, the task of testing and ensuring that your code doesn’t break from version to version is perhaps greater than ever, since there is no way to roll back or delay the upgrade to your little slice of the cloud.