PluralFormat to the rescue? July 2, 2007

Posted by globalizer in Java, Language, Localization, Translation.

OK, I have been missing in action for a while, I know. A new job (still in IBM, mind you) and a fairly long vacation are my only excuses. Back to business:

Here and here I complained about the localization issues involved in using ChoiceFormat. One of those issues would seem to be addressed by the new PluralRules and PluralFormat API proposal described on the ICU design mailing list recently. PluralRules would allow you to define plural cases for each language, and the numbers those plural cases apply to, while PluralFormat would then allow you to provide message text for each such case. This format would thus be able to handle languages like Russian and Polish, which use more complex plural rules than the ones that can be provided via the simple intervals of ChoiceFormat.
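To make the difference concrete, here is a minimal sketch in plain Java. It is not the ICU API: `PluralSketch` and `russianCategory` are hypothetical names of my own. The keywords `one`, `few` and `many` follow the usual description of Russian plural categories for whole numbers, and are assumed here rather than taken from the proposal. The English side uses the standard `java.text.ChoiceFormat`, which can only select a string by numeric interval.

```java
import java.text.ChoiceFormat;

class PluralSketch {
    // ChoiceFormat selects by interval: [0,1) -> "files", [1,2) -> "file", [2,inf) -> "files".
    // That is enough for English, where only the number 1 is special.
    static final ChoiceFormat ENGLISH_FILES = new ChoiceFormat("0#files|1#file|2#files");

    // Hypothetical helper (NOT the ICU API): plural category for a non-negative
    // whole number in Russian. The category depends on the last digits,
    // so it cannot be written as a short list of intervals.
    static String russianCategory(long n) {
        long mod10 = n % 10;
        long mod100 = n % 100;
        if (mod10 == 1 && mod100 != 11) {
            return "one";   // 1, 21, 31, ... (1 fayl, 21 fayl)
        }
        if (mod10 >= 2 && mod10 <= 4 && (mod100 < 12 || mod100 > 14)) {
            return "few";   // 2-4, 22-24, ... (2 fayla, 22 fayla)
        }
        return "many";      // 0, 5-20, 25-30, ... (5 faylov, 11 faylov)
    }

    public static void main(String[] args) {
        for (long n : new long[] {1, 2, 5, 11, 21, 22, 25}) {
            System.out.println(n + ": en=" + ENGLISH_FILES.format(n)
                    + ", ru category=" + russianCategory(n));
        }
    }
}
```

With intervals alone, 21 and 22 fall into the same bucket as 5 and 11, which is exactly where ChoiceFormat breaks down for Russian.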

It is of course a step forward that the API will now allow you to actually define something that will work for (all?) languages. As far as I can see we will actually take a step backward with respect to the other problem, however: the format will be even more difficult to handle for translators.

According to the API proposal,

It provides predefined plural rules for many locales. Thus, the programmer need not worry about the plural cases of a language. On the flip side, the localizer does not have to specify the plural cases; he can simply use the predefined keywords. The whole plural formatting of messages can be done using localized patterns from resource bundles.

If this is really true, then the programmer will write a resource bundle that implements the US English keywords (in most cases, anyway), and it will be up to the localizer to know the PluralRules keywords that are defined for her language, and to implement them correctly in the localized resource bundle.

This comment on the mailing list to the proposal would seem to be an understatement:

Separating the rules from Plural Format helps some here, but translators will still have to be able to write the PluralFormat syntax, which is about as complicated as the ChoiceFormat syntax.
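To show what a translator would actually be looking at, here is a rough side-by-side sketch. The ChoiceFormat-style string uses the standard `java.text.MessageFormat`. The plural-style strings use the `keyword{text}` form with `#` standing in for the number, as in ICU's PluralFormat; the exact syntax in the 2007 proposal may have differed. The `selectPlural` method is a hypothetical, heavily simplified stand-in of my own, not ICU code, and the Polish keywords are the ones I assume a Polish translator would need to know.

```java
import java.text.MessageFormat;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PatternSketch {
    // What the translator sees with ChoiceFormat inside a message.
    static final String CHOICE = "{0,choice,0#no files|1#one file|1<{0} files}";

    // What the translator sees with plural keywords (assumed syntax).
    static final String ENGLISH = "one{# file} other{# files}";
    // Polish needs more cases than the English source string offers,
    // so the translator has to add keywords the programmer never wrote.
    static final String POLISH = "one{# okno} few{# okna} many{# okien} other{# okna}";

    static String choiceStyle(long n) {
        return MessageFormat.format(CHOICE, n);
    }

    // Hypothetical mini-selector (NOT the ICU implementation): finds the text
    // for the given keyword, falls back to "other", and replaces '#' with n.
    static String selectPlural(String pattern, String keyword, long n) {
        Matcher m = Pattern.compile("(\\w+)\\{([^}]*)\\}").matcher(pattern);
        String fallback = null;
        while (m.find()) {
            String text = m.group(2).replace("#", Long.toString(n));
            if (m.group(1).equals(keyword)) {
                return text;
            }
            if (m.group(1).equals("other")) {
                fallback = text;
            }
        }
        return fallback;
    }

    public static void main(String[] args) {
        System.out.println(choiceStyle(5));
        System.out.println(selectPlural(ENGLISH, "one", 1));
        System.out.println(selectPlural(POLISH, "many", 5));
    }
}
```

Neither string is something you would want to hand to a translator without tooling: in both cases a misplaced brace or an unknown keyword silently changes which text is shown.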

I think my ChoiceFormat advice will extend to the new API for the time being: don’t use it.



1. More i18n pot holes to be filled by Google « Musings on software globalization - October 26, 2010

[…] formatting capabilities that is all the craze these days. People who have read my previous posts on this topic will know that I am no fan of this type of formatting, and my most recent experiences with email […]

2. Steven R. Loomis - October 28, 2010

These might have been useful comments – too bad they weren’t sent to the above-mentioned ICU Design List! I’m sure such comments would be welcome.

I think the idea would be that translation editors could have some sort of viewer, such as the Mac has for date formats http://nerdlogger.com/2008/05/17/custom-date-display-for-osx/

One *could* also have separate ChoiceFormat strings (one for ‘one’, one for ‘many’, etc.).

Good to see you @ IUC34 again!

3. globalizer - October 28, 2010

Good to see you too, Steven!
And yes, I should probably have sent comments to the list, you are right.

I agree that “an interpreter” or viewer that would be available to translators, inside whatever CAT tool they use, would solve the problem. I don’t think that exists – yet!

4. Josh - December 5, 2010

I’ve been reading your blog once in a while and decided to reply for the first time, thank you for writing all of this.
love it!
Website promotion
