SQL Azure – Moving your data

Moving your data to the (public) cloud necessarily involves relinquishing some control over the setup and maintenance of the environment in which your data is hosted. Cloud-based hosting services such as Microsoft Azure are effectively just scalable shared hosting providers. Since parts of the server configuration are shared with other customers and (to make the service scalable) all instances are based on a standard template, there are many system settings that your cloud provider won’t allow you to change on an individual basis.

For me, this is generally great. I’m not a DBA or SysAdmin and I have no interest in maintaining an OS, tweaking server configuration settings, installing updates, or patching hotfixes. The thought of delegating to Microsoft the tasks that keep my server well-oiled and up-to-date is very appealing.

However, this also has its downsides. One advantage of maintaining my own server is that, even though it might not be up-to-date or have the latest service packs applied, I know nobody else has tweaked it either. That means that, unless I’ve accidentally cocked something up or sneezed on the delete key, a database-driven application that connects to my own hosted database should keep working day after day. When an upgrade is available I can choose when to apply it, and test on my own schedule to ensure that my applications work correctly following the upgrade.

Not so with SQL Azure.

Here are two examples of breaking changes I’ve recently experienced with SQL Azure, both seemingly the result of changes rolled out since the July Service Release:

Firstly, if you use SQL Server Management Studio to connect to and manage your SQL Azure databases, you need to upgrade SSMS to at least version 10.50.1777.0 in order to connect to an upgraded SQL Azure datacentre. The same change also broke any applications that rely on SQL Server Management Objects (including, for example, the SQL Azure Migration Wizard, resulting in the error described here). The solution to both of these issues is thankfully relatively simple once diagnosed – run Windows Update and install the optional SQL Server 2008 R2 SP1 service pack.

A more subtle change is that the behaviour of the SQL Azure database engine itself has changed, making it comparable to the forthcoming on-premises ‘Denali’ release of SQL Server rather than to SQL Server 2008 R2. Normally, upgrading SQL Server wouldn’t be a breaking change for most code (unless, of course, you were relying on a deprecated feature that was removed), but the increase in spatial precision from 27 bits to 48 bits in SQL Denali means that you can actually get different results from the same spatial query. Consider the following simple query:

-- Two intersecting line segments
DECLARE @line1 geometry = 'LINESTRING(0 11, 430 310)';
DECLARE @line2 geometry = 'LINESTRING(0 500, 650 0)';

-- Find the point at which they cross
SELECT @line1.STIntersection(@line2).ToString();

Previously, if you’d run this query in SQL Azure you’d have got the same result as in SQL Server 2008/R2, which is POINT (333.88420666910952 243.16599486991572).

But then, overnight, SQL Azure was upgraded, and running the same query now gives you this instead: POINT (333.88420666911088 243.16599486991646), which is consistent with the result from SQL Denali CTP3.

Not much of a difference, you might think… but think about what this means for any spatial queries that rely on exact comparison between points. How about this example using the same two geometry instances:

SELECT @line1.STIntersection(@line2).STIntersects(@line1);

SQL Azure query run in July 2011: 0. The same SQL Azure query run in August 2011: 1. Considering that STIntersects() returns a Boolean, you can’t really get much more different than 1 and 0…
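If your queries depend on comparisons like this, one defensive approach is to test within a small distance tolerance rather than relying on an exact intersection test against a computed point. Here’s a minimal sketch of the idea – the @tolerance value is an arbitrary figure for illustration, not a recommended constant:

-- Same two line segments as before
DECLARE @line1 geometry = 'LINESTRING(0 11, 430 310)';
DECLARE @line2 geometry = 'LINESTRING(0 500, 650 0)';

-- The computed intersection point, whose exact coordinates vary
-- with the precision of the engine version
DECLARE @pt geometry = @line1.STIntersection(@line2);

-- Treat the point as lying on the line if it is within @tolerance of it,
-- instead of demanding an exact STIntersects() match
DECLARE @tolerance float = 0.001;
SELECT CASE WHEN @pt.STDistance(@line1) <= @tolerance THEN 1 ELSE 0 END
       AS intersects_within_tolerance;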

So, a cautionary tale: although moving to SQL Azure might hand responsibility for actually performing any database upgrades over to Microsoft, the task of testing and ensuring that your code is up-to-date and doesn’t break from version to version is perhaps greater than ever, since there is no way to roll back or delay the upgrade to your little slice of the cloud.
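One small mitigation: you can at least detect which engine build your database is currently running, for example with the standard @@VERSION function – a minimal sketch:

-- Report the engine build currently serving this database.
-- On SQL Azure this reflects whatever the datacentre has been upgraded to;
-- you cannot pin or roll back this version yourself.
SELECT @@VERSION;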

Google’s cloud technology comes to Google Earth & Maps Enterprise

Our vision for Google Earth and Google Maps has always been to create a digital mirror of the world where any user, anywhere, and from any device or platform can access current, authoritative, accurate, and rich information about the world around them. In order to provide fast and timely maps to our users, we’ve developed powerful geo infrastructure that lets us process and serve petabytes of imagery and basemap data to hundreds of millions of users.

We frequently hear from governments and businesses – some of whom use our existing Enterprise Earth & Maps products today – that they would like greater access to some of the infrastructure we’ve built, in order to more easily store their geospatial data in the cloud and more easily build and publish maps for their users.

Today we announced Google Earth Builder, which continues in the spirit of products such as Google App Engine and Google Exacycle by providing more access to Google’s core infrastructure.

Google Earth Builder is an enterprise mapping platform powered by Google’s cloud technology. We’ve built Google Earth Builder on the idea that any organization with its own mapping data – be it terabytes of imagery or just a few basemap layers – should be able to upload and manage that data in the cloud, using Google’s scalable infrastructure to process it and serve it securely to users through the familiar Google Earth and Maps interfaces.

Our goal for Google Earth Builder is to enable enterprises that work with geospatial data and create online maps to perform these tasks in the cloud. Over time we anticipate providing access to more and more of our geo infrastructure through Google Earth Builder, so businesses have more options for how to process, publish, and analyze their geospatial data. We’re excited to launch Google Earth Builder in Q3; in the meantime, if you’re interested in learning more, please get in touch.

See you at PyCon 2011

As many of you may know, Python is one of the official languages here at Google. Guido van Rossum, the creator of Python, is a Googler too – so naturally we’re thrilled to be supporting PyCon 2011 USA, the largest annual gathering for the community using and developing the open-source Python programming language. The PyCon conference days run March 11th to 13th, preceded by two tutorial days on March 9th and 10th. For those of you with coding in mind, the sprints run afterwards from March 14th to 17th. All in all, that’s nine days of Python nirvana!

In addition to having many Googlers in attendance, some of us will be presenting as well.

• On Wednesday, March 9th at 2 PM, I will be leading a Google App Engine tutorial with fellow teammate Ikai Lan. Tutorials have become so popular at PyCon that they’ve now been expanded into a two-day affair!

• On Friday the 11th, the very first day of sessions, App Engine engineer Brett Slatkin will kick things off at 10:25 AM with his talk “Creating Complex Data Pipelines in the Cloud”, covering the new App Engine Pipeline API.

• After lunch on Friday, I’ll take my Google hat off momentarily to discuss Python 3 in my talk subtitled “The Next Generation is Here Already” at 1:35 PM. It is mostly a repeat of the well-received talk I gave last year but with updates. The main point is to introduce folks to the next version of the language and discuss how its backwards-incompatibility will affect users, when users should port their apps to Python 3, what the differences from Python 2 are, etc. My job is to calm and soothe, dispelling any FUD (fear, uncertainty, doubt) about Python 3.

• On Saturday morning at 9:25 AM, Python creator, BDFL, and App Engine engineer Guido van Rossum will hold his annual fireside-chat Q&A session for all conference attendees.

• Later Saturday morning at 11:05 AM, I’m looking forward to speaking about “Running Django Apps on Google App Engine.” This is exciting for me not only because it’s a relatively new topic, but because it represents a major change for Django developers: being able to write Django apps that run on NoSQL or non-relational databases, when it’s been RDBMSs only all this time. Furthermore, with Django-nonrel, you can move Django projects/apps between traditional hosting and App Engine, helping to break the “vendor lock-in” issue that many have had concerns about when hosting apps in the cloud. A good part of my talk does focus on porting apps from App Engine to Django, however.

• Right after my talk, at 11:45 AM comes another famous Googler, author of Python in a Nutshell, co-editor of the Python Cookbook, and a long-time member of the Python community, Alex Martelli. Alex’s invited talk on “API Design anti-patterns” will be insightful and cerebral, sure to cause many future hallway discussions.

• Late Saturday afternoon at 4:15 PM, Google engineer Augie Fackler will deliver his talk entitled, “HTTP in Python: which library for what task?” There are many libraries that do HTTP. Which ones should you use and when? What are the benefits and tradeoffs?

• Finally, several members of the Google App Engine team, App Engine forum gurus, and experienced App Engine users are attending PyCon this year. I’m hoping to set up an Open Space session on one of the conference evenings where we can meet other users, chat about best practices, and do some informal Q&A, letting people ask anything they want (except “When will you support newer versions of Python?”). :-)

You can find the entire PyCon schedule online. It’s interactive if you log in, allowing you to bookmark sessions you’re interested in attending. This will be PyCon’s biggest year yet, so hopefully you can join us in Atlanta next week! Keep an eye on the PyCon blog for the latest news, and be sure to follow the Twitter hashtag (#pycon).

We invite you to join Google team members at all our talks, and stop by our booth to meet our technical staff as they demo select developer tools and APIs. We’ll have handouts there, and we also encourage you to try a short coding puzzle for a prize!