Data.gov Adds Geoviewer

Today, Data.gov added a new capability to its growing arsenal of tools for working with the data the website makes accessible. The so-called Data.gov GEO Viewer has some interesting capabilities:

  • Data is loaded into the viewer in real time through web URLs – the viewer downloads data directly from the authoritative source. An ArcGIS Server Geoprocessing service uncompresses the data if needed (.zip, .gz, .tar), transforms it to JSON, and streams this back to the Flex viewer.
  • The GEO Viewer loads data in Web Mercator (if the data or service supports it). Otherwise the GEO Viewer changes its basemap projection to Geographic Coordinate System and loads the data.
  • The viewer supports the following data types:
    • Map Services: OGC WMS, ArcGIS
    • Feeds: GeoRSS
    • Files: KML/KMZ, Shapefile
  • The GEO Viewer allows for mashing up multiple datasets, map services, and feeds in one view. It supports basic navigation using the keyboard (without the need to use the “shift+alt+F7+drag the mouse+release alt and mouse button at the same time”-like features…).
  • Set a basic color for the added data layer, set transparency for the layers, and use a swipe/see-through feature.
  • Basic identify operation on the added data.
  • Switching the basemap.
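To make the projection bullet a bit more concrete, here is a minimal sketch of the standard spherical Web Mercator forward projection, plus the fallback decision described above. The function names and the boolean flag are my own illustration, not anything taken from the GEO Viewer itself.

```python
import math

# Radius of the sphere used by Web Mercator (the WGS84 semi-major axis), in meters.
EARTH_RADIUS_M = 6378137.0
# Approximate latitude limit of the square Web Mercator map; latitudes are clamped to it.
MAX_LATITUDE_DEG = 85.051129


def to_web_mercator(lon_deg, lat_deg):
    """Project a longitude/latitude pair (degrees) to Web Mercator x/y (meters)."""
    lat = max(-MAX_LATITUDE_DEG, min(MAX_LATITUDE_DEG, lat_deg))
    x = EARTH_RADIUS_M * math.radians(lon_deg)
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + math.radians(lat) / 2))
    return x, y


def choose_basemap_projection(layer_supports_web_mercator):
    """Hypothetical helper: keep Web Mercator when possible, else fall back to geographic."""
    return "web_mercator" if layer_supports_web_mercator else "geographic"
```

The clamp is there because the projection stretches towards infinity at the poles, which is one reason a viewer may prefer plain geographic coordinates for data that cannot be served in Web Mercator.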

There are some limitations with this viewer, most of which are due to the fact that it downloads data from the source every time someone wants to see it:

  • File size limit of 10MB – Shapefiles and KML files can have large compression ratios. While the file registered in Data.gov may be a KMZ file of under 10MB, it can easily expand into a 100MB KML that is then streamed as JSON features to the client. This simply takes time.
  • The information about the files is not enough to make an upfront assessment of whether the file is viewable or not. Almost every file in Data.gov is a .zip file. The GEO Viewer cannot determine whether it’s dealing with an Esri Shapefile, OGC KML, Arc/Info Export (e00, remember these?), Microsoft Excel, CSV, or whatever other format(s) until after it downloads the file. Neither the raw data catalog nor the geodata catalog includes this information in its metadata. As a result, users are sometimes only notified that the file type is not supported after the viewer is launched.
  • Some registered content is not readily usable by an application (James Fee found one of these…). Several registrations link to web pages or web applications rather than to the actual data. In this case, however, the content is also available as an ArcGIS Server Map Service (although that’s not in the registration in Data.gov).
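The first two limitations can be illustrated with a small sketch: once a .zip has been downloaded, its directory lists the member names and their declared uncompressed sizes, so the format and the expanded size can be checked without extracting anything. This is only my own illustration using Python's standard library. The extension table, the helper names, and the choice to apply the 10MB threshold to the expanded size are all assumptions, not how the GEO Viewer's geoprocessing service actually works.

```python
import io
import zipfile

# Assumed threshold: the 10MB limit applied here to the expanded size.
MAX_UNCOMPRESSED_BYTES = 10 * 1024 * 1024

# Hypothetical mapping from file extension to a format label.
KNOWN_EXTENSIONS = {
    ".shp": "Shapefile",
    ".kml": "KML",
    ".e00": "Arc/Info Export",
    ".xls": "Microsoft Excel",
    ".csv": "CSV",
}
# Formats from the table above that the viewer's file support covers.
SUPPORTED_FORMATS = {"Shapefile", "KML"}


def inspect_zip(data):
    """Return (formats found, total declared uncompressed bytes) for a zip held in memory."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        infos = archive.infolist()
    total = sum(info.file_size for info in infos)
    formats = set()
    for info in infos:
        name = info.filename.lower()
        for extension, label in KNOWN_EXTENSIONS.items():
            if name.endswith(extension):
                formats.add(label)
    return formats, total


def is_viewable(data):
    """True if the archive holds a supported format and expands to at most the threshold."""
    formats, total = inspect_zip(data)
    return bool(formats & SUPPORTED_FORMATS) and total <= MAX_UNCOMPRESSED_BYTES
```

Note that this still requires the download to have happened, which is exactly the point of the second bullet: without a format field in the catalog metadata there is nothing to check beforehand.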

Can I Have One NSDI with Some Confusion on the Side Please?

In this age of publish first, then filter, and instant gratification, it is easy to lose sight of some of the real questions. The merging of Data.gov and Geodata.gov (yes, that is the plan) raises some questions that have gotten lost in the excitement of the last week.

Here are a couple of observations on the subject that could be made by anyone who has been following the two sites over the past year(s):

  1. Geodata.gov harvests most of its content from over 300 other catalogs (visit the Geodata.gov Statistics tab and view the information on Partner Collections). Data.gov does not have this capability. These catalogs represent federal, state, and local government, academia, NGO, and commercial providers of geospatial resources (visit the same tab on Geodata.gov and view the information on Publisher Affiliations). Data.gov on the other hand focuses on content from the Executive Branch of the Federal Government. Where would the remaining content of Geodata.gov go? http://www.otherdata.gov?
  2. Geodata.gov focuses on FGDC+ISO metadata with the industry looking at migrating to the new North American Profile of ISO 191xx metadata. Data.gov has developed its own metadata specification and vocabulary that is quite different from this. Just look at a details page on Data.gov to confirm this. What is the position on this subject of FGDC and other federal agencies who have created standards-based metadata for many years?
  3. Geodata.gov has focused on the GIS analysts and first responders (check the original Statement of Work, I’m sure it’s online somewhere). Data.gov seems to focus on a different audience (although honestly it’s not entirely clear to me if that audience consists of developers or the general public. It’s a bit of both).
  4. Geodata.gov has supported a number of user communities in two ways:
    • by allowing them to create community pages with resources beyond structured metadata that are of interest to those communities. The content in these pages is managed by the communities themselves. How should Data.gov support these communities of interest?
    • by supporting community-oriented collections that group metadata from multiple source catalogs. Examples are RAMONA (the states’ GIS inventory), the Oceans and Coast Working Group (interested in all content in the US coastal zone), and Data.gov (actually, this is also configured as a collection in geodata.gov). These collections are exposed on the Geodata.gov Search tab and in the CS-W and REST interfaces to the catalog. Where would these collections end up after a merger of Geodata.gov and Data.gov?
    • Geodata.gov has created a Marketplace where those who are looking for data and those who have plans to acquire data can discover each other and collaborate. A dating service of a different kind. While not specifically targeted at the masses, isn’t one of the key principles of NSDI to collaborate to reduce redundant investments?
  5. Geodata.gov has created a search widget, implemented by several agencies such as the State of Delaware, that enables searching geodata.gov directly from the agency’s own website and thus gives access to state and other geospatial resources covering the area of the state. This widget can mean significant cost savings for agencies as they don’t have to create their own clearinghouses. Will Data.gov provide such a role as well?
  6. Through FGDC CAP grants, several tools were built that work against the Geodata.gov REST or CS-W interfaces. I mentioned some of these capabilities and the links to these tools in my recent blog post. Merging Geodata.gov and data.gov would ideally not break these investments.
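
For readers who have not seen what working against a CS-W interface looks like, here is a rough sketch of how a client might build a GetRecords request using the key-value parameters of the OGC CSW 2.0.2 specification. The endpoint URL is a placeholder I made up, since the actual Geodata.gov service address is not given here, and the function name is hypothetical as well.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: a placeholder, not the real catalog address.
CSW_ENDPOINT = "http://catalog.example.gov/csw"


def build_getrecords_url(search_text, max_records=10, start_position=1,
                         endpoint=CSW_ENDPOINT):
    """Build a CSW 2.0.2 GetRecords URL that does a full-text search via a CQL filter."""
    # Double any single quotes so the search text cannot break out of the CQL literal.
    safe_text = search_text.replace("'", "''")
    params = {
        "service": "CSW",
        "version": "2.0.2",
        "request": "GetRecords",
        "typeNames": "csw:Record",
        "resultType": "results",
        "elementSetName": "summary",
        "constraintLanguage": "CQL_TEXT",
        "constraint_language_version": "1.1.0",
        "constraint": f"AnyText like '%{safe_text}%'",
        "maxRecords": max_records,
        "startPosition": start_position,
    }
    return f"{endpoint}?{urlencode(params)}"
```

Any tool written this way depends only on the standard request parameters and the endpoint address, which is why keeping the interface alive after a merger matters more than keeping any particular website.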

It would be nice to see the passion expressed over the last week repeated, this time in a discussion of some of these and other questions that affect the geospatial community at large.

Accessing the Data.gov catalog through an open interface

In its first year, Data.gov has grown from 47 datasets to over 270,000 datasets. These datasets aren’t actually hosted at Data.gov. The government agencies making these datasets available host the files (or web services) and share them with the community through data.gov. But how did these datasets become discoverable at Data.gov?

Actually, the datasets are registered with Geodata.gov, a national catalog of geospatial resources that has been around for some 7 years and that “serves as a public gateway for improving access to geospatial information and data under the Geospatial One-Stop E-Government initiative”.

Geodata.gov provides access to almost 400,000 geospatial resources from over 300 partner collections from federal, state, and local government, as well as academia and commercial providers. Rather than having to sift through as many web sites, users can go to Geodata.gov and perform searches there. Creators of the geospatial resources can register this content with Geodata.gov if they choose to do so.