How not to construct strings (Lost in translation – part III) December 28, 2006

Posted by globalizer in Java, Localization, Translation.

Back here I said I would cover some of the real problems with software translatability in a follow-up post. This is the first installment, and it covers a fallacy that was fairly common 10-15 years ago, although these days we (or at least I) see it a lot less frequently. That is a good thing, since this practice does not just make software difficult to translate, it makes it totally impossible to translate.

I am of course referring to the good old construction of sentences or even words from fragments by concatenating them in the code. This is done by inexperienced developers trying to save space and keep translation costs down by “reusing” words and strings. In a slightly (but only slightly) parodied version, this could result in code like this (where the directory/file names and counts would of course be pulled in as parameters at runtime instead of hardcoded as here):

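import java.util.ResourceBundle;

// Each bundle entry is a single word or fragment, not a complete sentence.
ResourceBundle bundle = ResourceBundle.getBundle("ConcatenationBundle");
String the = bundle.getString("the");
String dir = bundle.getString("dir");
String contains = bundle.getString("contains");
String filestring = bundle.getString("filestring");
String line = bundle.getString("line");
// The plural marker is hardcoded and never reaches the translators.
String plural = "s";
// The sentences are assembled from the fragments in English word order.
System.out.println(the + dir + "mydir " + contains + "10 " + filestring + plural);
System.out.println(the + filestring + " " + "myfile.txt " + contains + "15 " + line + plural);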

The resource bundle going along with this would have this content:
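
Working backwards from the code above and the output below, the bundle has to hold five fragment entries. What follows is a minimal sketch with values that are inferred, not copied from the original file; the class name and the inline properties text are my own assumptions. Note the trailing spaces on some values, which the concatenation silently relies on:

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.PropertyResourceBundle;
import java.util.ResourceBundle;

public class ConcatenationBundleSketch {
    // Inferred contents of ConcatenationBundle.properties (an assumption, not the
    // original file). Trailing spaces in a properties value are kept on load.
    static final String PROPERTIES_TEXT = String.join("\n",
        "the=The ",
        "dir=directory ",
        "contains=contains ",
        "filestring=file",
        "line=line");

    // Loads the inferred text as a bundle, standing in for ResourceBundle.getBundle(...).
    public static ResourceBundle load() {
        try {
            return new PropertyResourceBundle(new StringReader(PROPERTIES_TEXT));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Same concatenation as the first println in the snippet above.
    public static String directoryMessage(ResourceBundle b) {
        return b.getString("the") + b.getString("dir") + "mydir "
            + b.getString("contains") + "10 " + b.getString("filestring") + "s";
    }

    // Same concatenation as the second println in the snippet above.
    public static String fileMessage(ResourceBundle b) {
        return b.getString("the") + b.getString("filestring") + " " + "myfile.txt "
            + b.getString("contains") + "15 " + b.getString("line") + "s";
    }

    public static void main(String[] args) {
        ResourceBundle bundle = load();
        System.out.println(directoryMessage(bundle));
        System.out.println(fileMessage(bundle));
    }
}
```

A translator receiving these five entries sees only isolated words with odd spacing, and has no way of knowing which sentences they will end up in.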


And the proud developer would be able to point to the fact that he has externalized the translatable strings and that his code can produce the following output even though the translation file only contains 5 words!

The directory mydir contains 10 files
The file myfile.txt contains 15 lines

What kind of savings won’t you be able to reap once you implement this kind of code for all your strings, across a major application! You would only have to translate each word once!

There are of course a few minor problems with this way of coding:

  • it assumes that all languages form plurals by simply adding an ‘s’ to the singular form (even though not even English conforms to this rule – think man/men, woman/women)
  • it assumes that all languages use exactly the same syntax/word order as English
  • it does not take into account languages where nouns and adjectives have to agree with respect to number and gender
  • etc., etc.
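
The first point does not even need a second language to demonstrate. The sketch below uses a hypothetical helper, naive, that mimics the "append an s" logic of the snippet above; the count of 1 and the noun "man" are my own test inputs, not taken from the original code:

```java
public class NaivePluralDemo {
    // Mimics the fragment approach: every plural is the singular plus "s",
    // regardless of the count or of the noun itself.
    public static String naive(int count, String noun) {
        return count + " " + noun + "s";
    }

    public static void main(String[] args) {
        System.out.println(naive(10, "file")); // 10 files
        System.out.println(naive(1, "file"));  // 1 files
        System.out.println(naive(2, "man"));   // 2 mans
    }
}
```

If the scheme already breaks down in English, it has no chance in a language with different plural forms, word order, or agreement rules.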

In short, it makes all translation impossible. And if you don't find the problem prior to translation (or maybe even prior to the test phase), then the translation is going to be a lot more expensive than it would have been had it been done right in the first place, since you will have wasted the time of a lot of translators and/or testers. Odds are that strings like the ones above in a resource bundle will make at least the experienced translators raise a red flag when they see them, but by then it has already cost you money.
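
For contrast, here is a rough sketch of what "done right" could look like using the standard java.text.MessageFormat class: one complete sentence per key, with placeholders for the runtime values and the plural variants kept inside the translatable string. The class name, the bundle keys, and the exact wording of the patterns are assumptions made for illustration, and the bundle is defined inline only to keep the example self-contained:

```java
import java.text.MessageFormat;
import java.util.ListResourceBundle;
import java.util.Locale;
import java.util.ResourceBundle;

public class WholeSentenceSketch {
    // Hypothetical bundle: each key maps to one complete sentence.
    // {0} is the name, {1} is the count; the choice element selects a variant.
    public static class Messages extends ListResourceBundle {
        @Override
        protected Object[][] getContents() {
            return new Object[][] {
                {"dir.contains",
                 "The directory {0} contains {1,choice,0#no files|1#one file|1<{1,number,integer} files}"},
                {"file.contains",
                 "The file {0} contains {1,choice,0#no lines|1#one line|1<{1,number,integer} lines}"},
            };
        }
    }

    // Looks up the whole sentence and fills in the runtime values.
    public static String format(ResourceBundle bundle, String key, String name, int count) {
        MessageFormat mf = new MessageFormat(bundle.getString(key), Locale.ENGLISH);
        return mf.format(new Object[] {name, count});
    }

    public static void main(String[] args) {
        ResourceBundle bundle = new Messages();
        System.out.println(format(bundle, "dir.contains", "mydir", 10));
        System.out.println(format(bundle, "file.contains", "myfile.txt", 15));
        System.out.println(format(bundle, "file.contains", "empty.txt", 0));
    }
}
```

There are more words to translate this way, but the translator sees each sentence in full and is free to reorder the placeholders and rewrite the plural variants. The choice element only covers simple cases, so languages with more than two plural forms would need a richer plural mechanism than this sketch shows.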

Update: Hmm, this style of localization may not be as rare as I thought (hoped).


