Really, Chrome? September 24, 2011

Posted by globalizer in Language, QA, Translation, Unicode.
With the terrible bloat in Firefox, I have recently been trying to get used to Chrome. I am having a really hard time, though. I appreciate the attempt to create an uncluttered interface, but please – within limits!

Once I finally managed to find the settings, I had to hunt around forever to find the setting for default encoding. I first looked under Basics, but no luck. Went on to Advanced Options, and found no specific setting there. The Languages and Spell checker button seemed the most likely, but no, I didn’t find it there either.

Where does it hide? Under “Customize Fonts…”, of course.

If I were using the English-language UI, I might actually have thought that location far-fetched, but not totally outlandish. Since I am using the Danish-language version, however, the connection is just completely impossible to make:

The Danish button says “Customize font sizes…”.  And while it is true that good translations cannot be word-for-word translations, in this case my advice to the Danish translator would be to stay a little closer to the source text.

Another case where the Danish translation would benefit from a few changes:

The first sentence (“Det ser ud som om, at du er flyttet”) is not wrong, but it would sound a lot more natural without the “at”. And the second one is just plain wrong – again, a superfluous “at”. Otherwise the translation looks pretty good. And my main question is actually not about the translation at all; all of this has just been throat clearing leading up to this:

Why on earth is the out-of-the-box default encoding still set to ISO-8859-1 for all the Western European languages, and to various other legacy encodings for languages using other scripts? With Unicode (UTF-8) having surpassed 50% of the web by now?
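To make the cost of that default concrete, here is a minimal Python sketch of what a browser effectively does when a page is written in UTF-8 but gets decoded as ISO-8859-1. This is an illustration only, not a description of Chrome's internals; the sample word and the function name are assumptions picked for the example.

```python
# Illustration only: a UTF-8 page read with an ISO-8859-1 default.
# The sample word and function name are hypothetical, chosen for the demo.

def decode_with_wrong_default(text: str) -> str:
    """Encode text as UTF-8, then decode the bytes as ISO-8859-1."""
    utf8_bytes = text.encode("utf-8")
    # ISO-8859-1 maps every byte to a character, so this never fails;
    # it silently produces the wrong characters instead.
    return utf8_bytes.decode("iso-8859-1")

danish_word = "Blåbærgrød"  # contains å, æ and ø
garbled = decode_with_wrong_default(danish_word)

print(danish_word)
print(garbled)

# Each non-ASCII letter became two characters, so the string grew by three.
print(len(danish_word), len(garbled))

# The damage is reversible as long as nothing else touched the text.
repaired = garbled.encode("iso-8859-1").decode("utf-8")
print(repaired == danish_word)
```

Since the wrong decoding never raises an error, nothing warns the reader; the text simply shows up garbled, which is why the default setting matters so much.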

Et tu, New York Times July 1, 2011

Posted by globalizer in Unicode.
Hmm, the Grey Lady is slipping. These are the current front page blurbs for articles inside:

Click to enhance, and you will see that the opinion piece, 4th from the left, is about a Dutch company hiring autistic workers. Unfortunately, this is the article in question, showing that not even the NYT is immune to the Dutch/Danish confusion.

Do Canadians have bigger pockets? April 14, 2011

Posted by globalizer in Unicode.
Or does David Pogue just wear suits with particularly narrow pockets? He seems to think that the PlayBook is about half an inch too wide to fit into the breast pocket of a jacket – and that

Whoever muffed that design spec should be barred from the launch party.

It’s interesting that he focused on that particular point, since a colleague of mine told me that this was exactly what sold a Canadian customer on the PlayBook beta he was comparing to Xooms and iPads a couple of days ago: it just fit into his pocket.

I can’t say I am familiar with the finer points of men’s wear, so I am left to wonder if there is a worldwide standard for the size of suit pockets? Or are there regional differences?

Calling Tex Texin, we need some research into the matter of pocket sizes!

“Security” run amuck – again February 21, 2011

Posted by globalizer in Unicode.
The “security questions” that US web sites seem to think are a great security feature have now officially jumped the shark. A run-of-the-mill online shopping site, vitacost.com, not only requires you to create an account before you can place an order (why?), it also requires you to pick two “security questions” as part of the account setup.

They obviously haven’t quite understood what the purpose of the security questions is, however – they actually mask the input in the entry fields for the answers, and require you to “confirm” the answer, thus treating them exactly like password fields. So I now have 3 passwords for a web site I will obviously never use again, since it is way too much hassle. And if I were planning on using it again, I would of course have to write all this wonderful information down somewhere, since there’s no way I would remember it 3 days from now. This is obviously very secure…

And just to explain: I have to write the answers to the “security questions” down because the questions are always about something I have no real answers for: my mother’s middle name, the name of my high school mascot, etc.

Silverowlcreations customer service ftw December 11, 2010

Posted by globalizer in Unicode.
I am completely floored by the customer service I just received from Silver Owl Creations. Not only did I receive my order lightning fast, but I also received a refund on the shipping (which was extremely reasonable to begin with), with this explanation:

I’ve given you a slight refund on the shipping charge to make it match the actual cost of packaging and shipping.

That’s almost enough to bring tears to your eyes.

On top of that, the pieces are beautiful; I have a new appreciation for steampunk (literally, since I didn’t know that the concept existed until 2 days ago). Highly recommended.

Full Unicode repertoire in programming languages? October 29, 2010

Posted by globalizer in Programming languages, Programming practices, Unicode.
Would we be better off if we used a programming language that allowed “the entire gamut of Greek letters, mathematical and technical symbols, brackets, brockets, sprockets, and weird and wonderful glyphs such as “Dentistry symbol light down and horizontal with wave” (0x23c7).” ?

I am usually as gung-ho about Unicode as you can get, but I have to admit I’m a little wary about this. Mind you, it would presumably spur the adoption of UTF-8 as the default encoding in development environments on all platforms, something that’s long overdue. How can MacRoman still be the default encoding for text files in Eclipse on Macs??
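For a sense of where one existing language draws the line, here is a small Python 3 sketch. It is an illustration, not part of the Computerworld piece: Python accepts letters such as Greek ones in identifiers, but rejects pure symbols like the dentistry glyph quoted above. The helper name `can_be_identifier` is made up for the example, and the source file is assumed to be saved as UTF-8.

```python
import unicodedata

def can_be_identifier(name: str) -> bool:
    """Report whether Python would accept this string as an identifier."""
    return name.isidentifier()

# Greek letters are letters, so Python 3 accepts them as names.
π = 3.14159
Δ = 0.5
area = π * Δ ** 2
print(area)

# The dentistry symbol from the quote is a symbol, not a letter.
dentistry = "\u23c7"
print(unicodedata.name(dentistry))
print(can_be_identifier("π"))        # letter: allowed
print(can_be_identifier(dentistry))  # symbol: not allowed
```

So a fair slice of the "entire gamut" is already usable today, provided the editor, the compiler or interpreter, and everyone's default file encoding agree on UTF-8.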

Via Computerworld.

Weight of 1 web page: 8 micrograms October 21, 2010

Posted by globalizer in Unicode.
Who knew? The Web measures 8 feet by 8 feet by 20 feet, and it weighs 26000 pounds. Which means that each page weighs 8 micrograms.

At least, that’s the result when you pack a copy of the Web into a shipping container.
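As a back-of-the-envelope check, the two figures above imply a page count. The sketch below assumes ordinary avoirdupois pounds and takes the 26000 pounds and 8 micrograms at face value; the resulting number of pages is derived here and was not quoted in the keynote.

```python
# Rough arithmetic only; inputs are the figures quoted in the post.
GRAMS_PER_POUND = 453.59237        # avoirdupois pound (assumed unit)
MICROGRAMS_PER_GRAM = 1_000_000

container_weight_lb = 26000        # weight of "the Web" in its container
page_weight_ug = 8                 # weight attributed to one page

total_ug = container_weight_lb * GRAMS_PER_POUND * MICROGRAMS_PER_GRAM
implied_pages = total_ug / page_weight_ug

print(f"{implied_pages:.3e} pages")
```

That works out to somewhere around one and a half trillion pages, give or take whatever share of the weight is really the steel box.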

Just one of the fun things I learned in Brewster Kahle’s fascinating keynote address to IUC34.

Go check out archive.org, which has the slightly ambitious goal of “Universal access to all knowledge”. As part of that effort they take a snapshot of every accessible web page every 2 months, so you can use their “waybackmachine” to see what web sites looked like in the past.

But that’s only a small part of the effort; they also scan and digitize books, archive audio, images, videos, etc.

Great resource, and great keynote #IUC34.


Shooting myself in the foot? June 30, 2010

Posted by globalizer in Unicode.
On those web sites that won’t let you register and therefore read their content unless you provide them with everything from your mother’s maiden name to your social security number, I usually sign up as a male born in 1901. Just to give their marketing department a little to chew on when they try to tailor their targeted ads.

I am starting to think that may not be such a hot idea, though, since my spam folder now seems to consist of approximately 90% Viagra promotions, with a few feeble Nigerian-style scams thrown in.

So I wonder if the spammers really are that specific in their targeting – would I be getting spam about dieting if I had signed up as female? – or is this proportion really the norm these days?

Wheee! June 8, 2010

Posted by globalizer in Unicode.
It didn’t take long for the speed down the slide to pick up.

Remember the emoji encoding discussion back in 2008? Well, Unicode 6.0 containing the new “characters” only just came out in Beta, but as predicted, they are now being used as the justification for encoding – well, just about anything…

For instance, a proposal to encode “a portable interpretable object code into Unicode”:

> Creating new writing systems, directly embedding language,
> directly embedding mathematics or machine language–all of
> these are entirely outside of Unicode’s purview and WG2’s
> remit.  They simply will not be adopted.

Well, the emoji is a new writing system and that is being encoded. The encoding of the emoji has made me realize that the encoding of the portable interpretable object code is not an impossibility.

> Your enthusiasm may be commendable, but you’re spending
> your energy developing something which is not appropriate
> for inclusion within Unicode.

Thank you for your first remark, yet whether the portable interpretable object code is or is not appropriate for inclusion within Unicode is a matter that is not decided at this time.

There was a time when emoticons were not regarded as appropriate for inclusion in Unicode, yet they are now being encoded. That is an important precedent that what is appropriate depends upon the circumstances at the time, not on what was the policy previously.

Admittedly, the current proposal seems to be a solution in search of a problem. The author indicates that it

is intended to be a system to use to program software packages to solve problems of software globalization, particularly in relation to systems that use software to process text

but even though I work with software globalization on a daily basis, for the life of me I cannot think of something related to software globalization that:

  1. I want to do
  2. I cannot do with existing technology and standards
  3. This proposal will allow me to do

This specific proposal of course has a snowball’s chance in hell of being encoded, but the emoji argument will be a lot more difficult to counter once we get to something that is at least conceptually related to text. So hold on to your hats as the slide gets steeper and more slippery!

Oh, and by the way: for sheer entertainment value, the last couple of weeks’ worth of Unicode mail archives is priceless.

Paul Krugman can get a job as my terminologist any time April 21, 2010

Posted by globalizer in Unicode.
Inventiveness I can only dream of.