Single-language labels in Google Maps

If you use Google Maps in English, you might notice we’ve expanded our coverage of translated labels. Previously, map labels would display both transliterated and local names for many places. Using a single language can help users by making the map easier to read. For example, the label for Moscow on our English maps used to display as “Moskva (Москва)”. This was great for learning how places are named in other languages but also resulted in twice as many labels on the map.

Maps with dual language labels (left) and new single-language labels (right)

Below are some nice examples of improvements in Europe and China. Many Italian cities are now labeled with translated English names and China now has province names in English.

Before (left) and single-language labels (right)
Before (left) and single-language labels (right)

We realize it can be useful to see local language labels for learning place names so we’ve kept the option available. If you prefer to see local language labels, you can still do so by unchecking English in the maps menu (move your cursor to the ‘layers’ menu in upper right corner of the map and un-check English when the menu drops down). This is the same for other single-languages which Maps supports. Also note that we may still show multiple labels in some places where there are two local languages or when place names are disputed.

In total, Google now has single-language Maps for 5 major languages – Chinese, Japanese, Korean, Russian, and now English with more languages on their way. We hope this change makes it easier to browse, explore and discover the world around you.

GoogleLatLong

Google Goggles with new look and Russian

Some of you may already be using the new optical character recognition (OCR) and translation of Russian in Google Goggles that we previewed at last week’s Inside Search event. Starting today, we’re pleased to introduce some additional new features, including a map view of your Search History and the ability to copy contact and text results to the clipboard. We’ve also changed the results interface to make it easier to view and navigate through your results.

Russian optical character recognition (OCR) and translation
Since Google Goggles first launched in 2009, it has been able to recognize and translate text in a number of different languages, as long as the language used Latin characters. With the launch of OCR for Russian, Goggles is now able to read Cyrillic characters. Goggles will recognize a picture of Russian text and allow you to translate the text into one of over 40 other languages. Russian OCR is also available for users of Google Goggles on the Google Search app for iOS. Очень полезно!

You can take a picture of Russian text and translate it into over 40 languages.

Map view of your search history
If you’ve enabled search history on Goggles, your history contains a list of all the images that you’ve searched for, as well as some information about where you performed the search if you chose to share your location with Google. Sometimes this can be a pretty long list, so we wanted to give you another way to sort and visualize your Goggles results.

We’ve added a map view, which shows your Goggles image search history on a map so you can quickly zoom and pan to find a query from a particular location.

Easily toggle between map view and list view with the button in the upper right.

Copy contact and text results to clipboard
Finally, imagine that you wanted to grab a URL or telephone number from a sign and email it to yourself. Now, Goggles gives you the option of copying the recognized text to your phone’s clipboard, allowing you to paste the test into a number of applications.

To try these new features download Google Goggles 1.5 from Android Market, or scan the QR code below with your phone.

 

Pseudolocalization to Catch i18n Errors Early

Pseudolocalization is the automated generation of fake translations of a program’s localizable messages. Using the program in a pseudolocale generated in this manner facilitates finding bugs and weaknesses in the program’s internationalization. Google has been using this technique internally for some time, and has now released an open-source Java library to provide this functionality at http://code.google.com/p/pseudolocalization-tool/.

Take a look at the following screenshot and see if you can spot the internationalization problems:

Some common problems in localizing applications are:

  • Non-localizable text A program may have hard-coded text in the program source itself, or parts of messages may come from other non-localized sources (such as from a database), or there could be bugs in finding and using localized messages. The accents pseudolocalization method helps identify these problems by replacing US-ASCII characters with accented or otherwise modified versions, while still remaining readable. That way, if you see unaccented text, you know that text will not be localized for real locales either.
  • Combining separately translated sentences or paragraphs A program may “piece together” complete sentences/paragraphs from smaller translated strings. This is a problem since some locales may require that parts of the sentence/paragraph be reordered, or the translation may depend on the context of what else appears in the sentence. The brackets pseudolocalization method adds [ and ] around each translated string to clearly show translation boundaries. If you see a sentence/paragraph broken up into multiple bracketed pieces, you know it is likely to be impossible to translate well for some locales.
  • User interface elements don’t give sufficient space for translated text Some languages, such as German, frequently have translations that are much longer than the original message. If the user interface does not allow sufficient room for such languages, parts of it may appear wrapped, truncated, or have unsightly scrollbars in such locales. If the developer only tests with the original language, the developer may not discover these problems until late in the development cycle. The expander pseudolocalization method addresses this by making all the strings longer. Looking at the resulting pseudolocale will allow the developer to find such problems long before real translations are available and without requiring knowledge of other languages.
  • Improperly “mirrored” user interfaces in right-to-left locales Some languages like Arabic and Hebrew are written right-to-left. A user interface in a right-to-left (RTL) language needs to be laid out in a mirror image of a left-to-right layout. The only way to tell if this has been done properly is to try the user interface in an RTL locale. However, using a real RTL language is problematic both because translations are unlikely to be available until late in the development cycle and because few developers might be able to read any RTL languages.
Using untranslated LTR messages in an RTL user interface is problematic because it masks real problems and creates apparent problems where none actually exist. Just setting dir=”rtl” on the HTML element of an otherwise-English UI produces something like this:

Note how some of the punctuation is misplaced and not mirrored, the radio buttons are a mess, and the order of the menu items isn’t mirrored. None of these happen to be real problems — they will disappear when the English strings are replaced with real translations.

The solution is the fakebidi pseudolocalization method, which takes the original source text and adds Unicode characters to it to make it behave just like real RTL text while remaining readable (though backwards).

We find the most useful combinations of pseudolocalization methods to be accents/expander/brackets for finding general internationalization problems, and fakebidi for finding RTL-related problems. We use BCP47 variant subtags to identify locale names that get pseudolocalized translations: psaccent (as in en-psaccent) gets the accents/expander/brackets pseudolocalization, and psbidi (as in ar-psbidi) gets fakebidi. Note that for psbidi, using a real RTL language subtag is recommended since that will trigger RTL handling in most libraries/frameworks without any modifications. We hope to get these variant tags accepted as standard.

We have taken the sample application from above and run its translatable text through psaccent and psbidi pseudolocalization. Now take a look and see how much more easily internationalization problems can be identified:

Note that three problems have been revealed by the use of psaccent:

  • the space provided is insufficient for longer translations
  • “Add Contact” is split across two messages making it difficult to translate correctly
  • the button text has not been translated

We can fix those problems, then check for Bidi problems by using psbidi:
Notice this doesn’t introduce problems that aren’t there like the earlier example, but it does show that the Help menu item does not float to the left side of the window as it should.

Initially, this is just a library for use with other tools. We plan to write a command-line tool for taking message sources and producing fake translations of them. In addition, we are in the process of integrating this library with GWT, so GWT users can take advantage of it just by inheriting one module.

In summary, pseudolocalization is useful for finding internationalization problems early in the development process and enabling the developer to fix them before wasting money on translations that may have to be changed to fix the problems anyway. We hope you will use this library to help make your application usable by more people, and we welcome contributions and discussions at https://groups.google.com/forum/#!forum/pseudolocalization-tool.