Casting SqlGeometry with Microsoft SQL Server

I came across a curious error earlier today when attempting to use a SqlDataReader to read a column of geometry data from a SQL Server table:

System.InvalidCastException: Unable to cast object of type 'Microsoft.SqlServer.Types.SqlGeometry' to type 'Microsoft.SqlServer.Types.SqlGeometry'

SqlGeometry to SqlGeometry… you'd think that would be a pretty easy cast, wouldn't you? It turns out that this problem is caused by a conflict between the spatial library used by SQL Server Denali and the one used by SQL Server 2008/R2: whether you hit the error depends on which version of Microsoft.SqlServer.Types.dll you reference, and on how you access geometry or geography columns from the DataReader:

[csharp]
// Requires: using System.IO; using Microsoft.SqlServer.Types;
while (dataReader.Read())
{
    // These work in SQL Server 2008/R2, but error in Denali
    SqlGeometry g1 = (SqlGeometry)dataReader.GetValue(0);
    SqlGeometry g2 = (SqlGeometry)dataReader["GeomCol"];

    // This works in Denali, but not in SQL Server 2008/R2
    SqlGeometry g3 = SqlGeometry.Deserialize(dataReader.GetSqlBytes(0));

    // This works in SQL Server 2008/R2 and Denali
    SqlGeometry g4 = new SqlGeometry();
    g4.Read(new BinaryReader(dataReader.GetSqlBytes(0).Stream));
}
[/csharp]

After a bit of digging around, it appears that accessing a geometry/geography field from a SqlDataReader via GetValue() or the square-bracket indexer [] is hard-coded to load the 10.0 (SQL Server 2008) version of the Microsoft.SqlServer.Types library.

If you’ve got side-by-side installations of both SQL Server 2008/R2 and Denali (as I have), and try to reference the 11.0 (Denali) version of Microsoft.SqlServer.Types, you’ll therefore get an assembly mismatch when both versions of the library are loaded, which causes the slightly unhelpful error listed at the top of this post. Even if you’ve only got Denali installed, your code may still try to reference a (non-existent) 2008/R2 version of the Microsoft.SqlServer.Types.dll library, so you’ll get a different error instead:

Could not load file or assembly 'Microsoft.SqlServer.Types, Version=10.0.0.0, Culture=neutral, PublicKeyToken=89845dcd8080cc91' or one of its dependencies. The system cannot find the file specified.
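
Incidentally, if you want to confirm exactly which version(s) of the assembly have been loaded into your process, a minimal diagnostic sketch along these lines will list them:

[csharp]
// Diagnostic sketch (requires using System;): list every loaded copy of
// Microsoft.SqlServer.Types. If both 10.0.0.0 and 11.0.0.0 appear, you have
// the assembly mismatch described above.
foreach (var asm in AppDomain.CurrentDomain.GetAssemblies())
{
    var name = asm.GetName();
    if (name.Name == "Microsoft.SqlServer.Types")
    {
        Console.WriteLine(name.FullName);
    }
}
[/csharp]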

The simplest way to resolve these errors is to change the way you reference any geography/geometry (and, I imagine, hierarchyid) columns from your DataReader, as in the code example above. Alternatively, you can set up an assembly redirection in the application configuration file as explained here (about halfway down), which will allow you to correctly target the Denali version.
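
For reference, such a binding redirect might look something like this in your app.config (a sketch only; check the version numbers against your own installation):

[xml]
<configuration>
  <runtime>
    <assemblyBinding xmlns="urn:schemas-microsoft-com:asm.v1">
      <dependentAssembly>
        <!-- Redirect requests for the 10.0 (2008/R2) assembly to the 11.0 (Denali) version -->
        <assemblyIdentity name="Microsoft.SqlServer.Types"
                          publicKeyToken="89845dcd8080cc91"
                          culture="neutral" />
        <bindingRedirect oldVersion="10.0.0.0" newVersion="11.0.0.0" />
      </dependentAssembly>
    </assemblyBinding>
  </runtime>
</configuration>
[/xml]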

As per the What's new in SQL Server Denali whitepaper, "… side-by-side installations of SQL Server Code-Named "Denali" CTP1 and CTP3 are not supported with existing SQL Server 2008 installations …", so perhaps I only have myself to blame for this. Interestingly, though, the person who raised this MS Connect issue says that they have experienced exactly the same problem on a clean install of Denali CTP3. The response from Microsoft suggests that this may be due to older versions of the library being packaged with Visual Studio 2010, and also confirms that the problem will not be resolved prior to RTM of SQL Server Denali.

Strangely, I encountered another version conflict of Microsoft.SqlServer.Types a few months ago. My CTP3 Management Studio Spatial Results tab does not plot curved geometries (selecting a CircularString, or the result of BufferWithCurves() etc., just produces a blank pane). I had originally assumed that, since this was only a CTP release, the feature simply hadn't been added yet. It turns out that curved geometries are supported in the SSMS CTP3 Spatial Results tab, but that installing SQL Server 2008 and Denali side-by-side can break the feature. I guess the reason is similar – SSMS is somehow attempting to load the SQL Server 2008/R2 version of Microsoft.SqlServer.Types, which, of course, doesn't support curved geometries.
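
If you want to test whether your own Spatial Results tab is affected, a quick repro query like either of the following (any curved geometry will do) should plot something rather than a blank pane on an unaffected Denali install:

[sql]
-- A curved segment, and a buffer built from curves: both should render
-- in the Spatial Results tab of a healthy Denali CTP3 SSMS.
SELECT geometry::STGeomFromText('CIRCULARSTRING(0 0, 1 1, 2 0)', 0);
SELECT geometry::Point(0, 0, 0).BufferWithCurves(10);
[/sql]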

Fusion Tables is ready to blow

Google is making consistent but slow progress with Fusion Tables, gradually enabling functionality that is turning the application into a comprehensive data visualisation and sharing package. The idea behind Fusion Tables is simple: allow people to upload data in a tabular format, then present that data with graphs, or geocode/match it to spatial data and display it on Google Maps as thematic overlays or location points. Undoubtedly, the integration of tables, maps, and graphs is Google's response to the emerging trend for "data marts" and "data journalism".

Fusion Tables has the potential to evolve into a formidable competitor to PostGIS, ArcSDE, Oracle Spatial, or SQL Server for basic GIS applications. Although the recent addition of dynamic styling capabilities takes Fusion Tables closer to that goal, the application still has a long way to go to reach that point. Unfortunately, the implementation of Fusion Tables is in what has lately become typical Google fashion – unattractive and rather complex to follow – so most likely only the "hard core" developer community will take advantage of it. The limit of 250MB of data per account is not helping either. There is no catalogue of available data (although basic text search is enabled) and no metadata for public tables, so it will not facilitate sharing.

Nevertheless, you can already make nice, very responsive maps with Fusion Tables, as in this example from the Guardian's Data Blog.

Handling Multiple Spatial Columns

Some spatial data formats can store only a single logical set of features. Any given ESRI shapefile, for example, can store only POINTs, LINESTRINGs, or POLYGONs – not a mixture of geometry types. What's more, all of those geometries must be defined using the same spatial reference system, as specified in the associated .PRJ file.

Some other formats are more flexible. A SQL Server table, for example, can have many separate geometry or geography columns, and a single column can contain a mixture of geometry types, and even geometries defined in different spatial reference systems. I frequently use multiple spatial columns as a way of storing pre-calculated geometries relating to each feature at different levels of approximation, or in different coordinate systems. For example, suppose that I have a table of features that were originally supplied in EPSG:27700 (OS National Grid of Great Britain). In addition to keeping the original geometry, I re-project those features into EPSG:4326 (WGS84) for display on Bing Maps. I also frequently simplify those features (using Reduce()) to create simpler approximations for display at far zoom levels. I keep each of these modified versions of the geometry in a separate column of the same table, named according to the datatype, SRID, and any modifications that have been performed, like this:

[sql]
CREATE TABLE Features (  -- hypothetical table name
  FeatureID int,
  geom27700 geometry,
  geog4326 geography,
  geog4326_reduce1000 geography
);
[/sql]
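
Incidentally, while the re-projection from EPSG:27700 to EPSG:4326 has to happen outside SQL Server (SQL Server won't transform between coordinate systems itself – that's one of the things I use FME for), the reduced column can be derived in-database once geog4326 is populated. A sketch, assuming the hypothetical Features table above:

[sql]
-- Derive the simplified column from the re-projected geography;
-- the 1000m tolerance matches the column's naming convention.
UPDATE Features
SET geog4326_reduce1000 = geog4326.Reduce(1000);
[/sql]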

Earlier today, while trying out Safe FME against a table with this structure, I came across a situation that held me up for a few hours, so I thought I'd write about it here.

If you create a Feature Reader from a SQL Server table that has only a single geometry or geography column (as, I suppose, is probably the norm), you don't need to specify the name of the column explicitly, as Safe FME is intelligent enough to work it out for you. Looking in the log, you'll see a line that says:

MS SQL Server (Spatial) Reader: Spatial column not specified for Feature Type 'dbo.Mastermap'.
Using Column 'GEOMETRY' with type 'geometry'


If you've only ever used tables with one spatial column, you've probably not even noticed this as it shoots past in the log screen – you just drag your MS SQL Feature Reader onto the workspace and use it just as you would an ESRI shapefile reader, or any other single data source. FME loads the values in the spatial column as features, and the values of all other columns of the table as attributes of those features.

But what if, like me, your source table has more than one spatial column? How do you specify which geometry/geography column should be used in the MSSQL reader to populate the feature type? Well, if you go to the parameters tab of the feature type you’ll see a couple of options:

  • "Specify Geography (geodetic) Column" – Yes if you're using the geography datatype, or No if you're using a column of the geometry datatype.
  • "Geometry Column Name" / "Geography Column Name" – the name of the column to retrieve.

In my dataset, the source column was of the geometry datatype and named Geom27700, so I set the parameters in my reader feature type as follows:

[Screenshot: reader feature type parameters, with "Specify Geography (geodetic) Column" set to No and "Geometry Column Name" set to Geom27700]

I then carried on with building my (rather simple) workflow, as follows – check for valid data, reproject to WGS84, orient polygons using the left-hand rule, and then output to a writer:

[Screenshot: FME workspace – reader, data validation, reprojection to WGS84, polygon orientation, writer]

Rather than insert the records into a new dataset, the writer in this case was designed to update the same table from which I'd read the features, but to populate the geog4326 column (remember that the features had been read from the geom27700 column).

So, I set the Writer Mode to UPDATE, selected FEATURE_ID (the primary key of the table) as the key on which features would be matched, and set the other parameters accordingly.
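
Logically, what I was after was for the writer to issue something along these lines for each feature (a sketch of the intent only; the SQL that FME actually generates will differ, and the parameter names here are invented):

[sql]
-- Intended per-feature effect of the UPDATE-mode writer:
-- match on the primary key and write the re-projected geography.
UPDATE Features
SET geog4326 = @reprojected_geog  -- placeholder: the WGS84 geography from the workflow
WHERE FeatureID = @feature_id;    -- placeholder: the matched key value
[/sql]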

Then, I clicked execute.

10 hours later, and FME was reporting that the transformation had been a success, with 48 million rows loaded into the destination table. But when I came to look at the table in SQL Server Management Studio, the geog4326 column that should have been populated by my writer was empty. Blank. NULL. Zilch. Zip.
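
A quick check along these lines (against the hypothetical Features table from earlier) confirms the scale of the problem:

[sql]
-- COUNT(col) ignores NULLs, so comparing the two counts shows
-- how many rows actually received a geography value.
SELECT COUNT(*) AS TotalRows,
       COUNT(geog4326) AS PopulatedRows
FROM Features;
[/sql]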

So, what had gone wrong? I went back and looked through the FME log – everything seemed sound. I tried searching the internet and found some old posts questioning whether Safe FME supported multiple spatial columns per feature (like this one), so I wondered if this was a limitation of the software, but nothing concrete came up. So then I contacted Safe FME's "Doctors" over Twitter. They asked me to send them a simple repro script, so I ditched my original package and started again. I built up a simple script, this time just looking at a test table with 100 rows, and… it worked.

Much head-scratching later, I found the reason: when you specify the Geometry Column Name or Geography Column Name in a Feature Reader, the entered column name is CASE-SENSITIVE, and must exactly match the definition of that column in the database. Frustratingly, there is no drop-down list of columns from which to choose this value – it must be typed manually. What's more, if you enter the correct column name but in the wrong case, no error is thrown. The package will run quite happily, in fact, loading all the attribute values from the table, but not the spatial data from the geometry/geography column itself. So your transformation will still load '000s of rows from the reader, just without the (generally rather crucial) spatial feature itself.

For some reason, in this particular case, I'd named the column from which I was reading data GEOM27700 rather than geom27700. This is why, in my earlier example, the geog4326 column was NULL. It's not that my UPDATE hadn't executed; the column had in fact been updated, but with NULL values that had been carried right the way through the FME transformation, because the reader (configured to look for Geom27700) had never successfully retrieved any values from the column, which was actually called GEOM27700.
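
The exact declared casing of a column is easy to check on the SQL Server side; a query along these lines (again against the hypothetical Features table) lists the declared names of any spatial columns:

[sql]
-- List spatial columns with their exact, case-preserved declared names.
SELECT c.name AS ColumnName,
       TYPE_NAME(c.user_type_id) AS DataType
FROM sys.columns AS c
WHERE c.object_id = OBJECT_ID(N'dbo.Features')
  AND TYPE_NAME(c.user_type_id) IN (N'geometry', N'geography');
[/sql]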

I found this behaviour slightly odd, since my SQL Server collation is not case-sensitive, and will quite happily accept either GEOM27700 or geom27700 in a SELECT statement. What's also odd is that the behaviour is not mirrored in the Safe FME MS SQL writer. In the writer, the case of the spatial column name is still significant, but if you attempt to specify a column with the wrong case, an error is thrown straight away and you'll be unable to run the transformation, rather than the "let you carry on but don't read anything" approach of the reader.

So, a slightly frustrating experience, and 10hrs of processing spent updating a table with a load of NULLs – but never mind. Fortunately I'm still only learning Safe FME and this was just an experiment, so there are no angry clients ringing up asking where their data is (this time).

And, to give credit where it's due: having found the problem myself, I contacted the FME doctors again over Twitter and they immediately replied, acknowledging the problem and informing me that an update was in the works. I can think of a few software companies who could learn a thing or two from customer service like that…

by Alastair “Bing” Aitchison