Unicode Haiku contest October 10, 2009
Posted by globalizer in Unicode.
Writing a Haiku – one that is not terminally lame – is more difficult than you would think. Adding Unicode to the mix doesn’t exactly make it any easier.
Here’s my own lame attempt:
so many code points
emoji sure to get in -
how ’bout Klingon now?
When having a web site is worse than not having one October 10, 2009
Posted by globalizer in QA, web applications.
In this day and age it is really a requirement for practically any serious business of just middling size to have a web presence. Not just for visibility and marketing purposes, but to save money. If users are able to complete most of their interaction with a company via online self-service, you not only get happier customers but also save a boatload of money on customer service.
However, this of course assumes that you have the wits to implement a customer web experience that doesn’t look as if it were implemented by middle school students (and I apologize in advance to the many middle school students who wouldn’t dream of making rookie mistakes like these):
- Implement a web form that only works with IE
- Don’t provide any clues whatsoever to users of oh, let’s say Firefox, that you don’t support their browser. No message, no visual indication that there is an error, nada
- Don’t ask the user to re-enter the chosen password for confirmation – but of course have 2 of the “security questions” that infect US-based web sites like the plague
- Don’t provide any option for the user to change the mailing address online, thus requiring a call to customer service anyway
That’s just the first 4 obvious and insanely annoying things about the Chase HSA online service that I signed up for because I wanted to change my mailing address. Which it turns out I can’t do online…
Now I know what I’ve been working on these past months! September 12, 2009
Posted by globalizer in Android. 2 comments
Phew! Finally no need to stick my head in my bag when I answer my phone in public; no need to be insanely vague about exactly what I work with when talking to people outside of Motorola. Now I can just say CLIQ – or DEXT – or MOTOBLUR.
That name in itself is one of the few surprises in the announcement. We have been using the code name internally, and had never heard the official name until a few days ago. So now I suddenly know what I have been working on…
It seems there’s a reason I’m not in marketing, though – I would have thought the concept of a “clique” would have primarily negative connotations (and yes, I get the word play on both “clique” and “click”), but I guess not.
The view from the outside August 2, 2009
Posted by globalizer in Denmark.
After having lived the past 11 years in the United States, I find myself quite often telling fellow Danes that they don’t realize how comparatively well-designed and sound most public policies in Denmark are. They will complain about (to me) very minor and temporary glitches in health care for instance, and my attempts to tell them how well off Denmark is compared to the United States fall completely flat – because it is impossible to convey just how screwed up the US is in this area.
I am sure most Danes will be flabbergasted about this.
We need an official language, now! February 10, 2009
Posted by globalizer in Language, Silly stuff.
I just came back from one of the county recycling stations, where I had an interesting conversation with the guy on duty. I think it was about the economic crisis, the fact that it’s even worse than the Great Depression because young people today have no idea how to at least get food on the table even though they have no money (by fishing and hunting), that contractors in the area have absolutely no jobs lined up whatsoever, and that they have been foolish to not save any money when the going was good.
I say I think that’s what it was about. I think I heard snatches such as “Hoover’s days”, “young people”, “contractors”, but the North Carolina dialect was so strong, with an overlay of mumbling, that it might have been about quantum physics, for all I know.
Even though it was a bit uncomfortable (who knows what I was actually agreeing with, all those times when I nodded and said “uh huh” or “yeah”), the experience did provide me with an epiphany.
All those proposals about English as an official language, they have not gone nearly far enough. We need not just an official language, we need a language spoken in such a way that people can actually understand it. Combine that thought with the economic crisis, and think of the possibilities:
- we will need countless language teachers (who won’t need a little refresher course in either grammar or pronunciation?), so English majors will suddenly be in short supply
- we will need an army of officials who can administer official language tests and certify people, so this will create a huge number of new jobs
- this economic stimulus will eliminate political opposition from right wing Republicans – even those who seem to think that government jobs are not real jobs would have to support it
In short, I have the solution to the deadlock over the economic stimulus package: a spending program which in one fell swoop will garner support both from the “government-expanding, latte-drinking, sushi-eating, Volvo-driving, New York Times-reading” liberals because of the spending aspects and from the “gun-toting, bible thumping bitter wingnuts” because of the support for one of their pet projects.
You can thank me later, President Obama.
Update: OK, just to make sure: y’all do realize this is tongue-in-cheek, right?
A slippery Unicode slope? December 23, 2008
Posted by globalizer in Unicode.
The recent proposal to encode a whole bunch of so-called “emoji” images in Unicode has caused quite a brouhaha the last few days on the otherwise fairly staid Unicode mailing list.
I have never delved deeply into the policies governing encoding decisions in the UTC, but I have to admit that the proposal to encode more than 600 emoji symbols does seem to be a giant step away from the encoding of plain text out onto a very slippery slope indeed. The WG2 “Principles and Procedures” document has this wording about “obscure or questionable usage symbols”:
Obscure or questionable usage symbols
The characters are part of a small or large collection that is not yet deciphered, or not completely
understood, or not well attested by substantial literature or the scholarly community. Or they are
symbols that are not normally used in in-line text, that are merely drawings, that are used only in
two-dimensional diagrams, or that may be composed (such as, a slash through a symbol to
indicate forbidden). Examples include Phaistos, Indus, Rongo-rongo, logos, pictures of cows,
circuit components, and weather chart symbols.
But hey, who hasn’t desperately needed to be able to refer to a “CAT FACE WITH RAISED EYEBROWS AND POUTING MOUTH”, a “BEATING HEART”, a “BOMB” or a “LOVE HOTEL” in plain text? I come across this kind of gaping hole in the available character repertoire all the time, so I fully understand why we need the emojis.
On a slightly more serious note, it does seem rather difficult to reconcile with the WG2 language, something that was noted on the Unicode mailing list and met with this response:
> N3452 specifically mentions “pictures of cows” and “stop sign” as examples of symbols that should not be encoded. Naturally it is a bit of a surprise to see so much official and expert support behind the encoding of COW and TRAFFIC LIGHT.
Right. And as I wrote before, subject to change. Therefore, a future revision of this document is likely to use different examples. The Unicode Standard has contained language trying to define the scope. This language has had to be changed over time, because the understanding of what is and isn’t plain text has evolved. It’s still the case that one doesn’t need the catalog of street signs as Unicode, because nobody is using this full set to communicate in text. The STOP sign is a different matter – it’s becoming something that I can definitely imagine being used in interchange without literally being an encoding of a traffic sign.
As somebody else on the mailing list said: “Flexible principles are a good sign of instability.”
Based on the responses (or mostly lack of responses) from key people on the mailing list I think that this proposal is a done deal, and that only minor adjustments wrt. names, etc. will be accepted. It apparently doesn’t hurt to have the clout of major Japanese wireless companies, plus Google, behind you when you make proposals like these. Even though the mobile phone companies in question used User-Defined Characters in the Shift-JIS encoding (and different ones per company, to boot) which should entail no need for them to be encoded in Unicode. Which tells me that the real mover behind this is probably Google, since they are the ones sucking up content from everywhere, and need to be able to store it in Unicode.
But if the concern is emojis “leaking” into databases (mentioned in a few messages) and polluting them, then I think a reasonable response to that would be: use transcoders which convert characters from private use areas into substitution characters before you stuff them into your databases. This will not make them searchable for emojis, true, but since when is it a major loss to be unable to search for images (which these basically are)? I may use smileys in messages, but do I think it essential to be able to search for them? Not really, no.
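For what it’s worth, the kind of filter I have in mind is tiny. Here is a minimal Python sketch, assuming the text has already been decoded to Unicode with the carrier emoji sitting in private use code points. The function names are my own invention for illustration, and using U+FFFD as the substitution character is an assumption; pick whatever substitute suits your database.

```python
# Private-use ranges as defined by the Unicode Standard.
PRIVATE_USE_RANGES = (
    (0xE000, 0xF8FF),      # Private Use Area in the BMP
    (0xF0000, 0xFFFFD),    # Supplementary Private Use Area-A (plane 15)
    (0x100000, 0x10FFFD),  # Supplementary Private Use Area-B (plane 16)
)

# Assumption: U+FFFD REPLACEMENT CHARACTER is used as the substitute.
REPLACEMENT = "\uFFFD"


def is_private_use(ch: str) -> bool:
    """Return True if the single character ch is a private-use code point."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in PRIVATE_USE_RANGES)


def scrub_private_use(text: str, substitute: str = REPLACEMENT) -> str:
    """Replace every private-use character in text with substitute."""
    return "".join(substitute if is_private_use(ch) else ch for ch in text)
```

Run every incoming string through something like `scrub_private_use` before it is stored, and the private use characters never reach the database. Everything else passes through untouched.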
Discussion about the proposal as such (is it a good idea to encode stuff like emojis at all) has in fact been discouraged:
>> What is needed most, at this juncture, is not further opinionizing about the value of these proposed characters, but the detailed work of sorting them into the standard. There are enough hard questions to be answered:
>
> So, in other words, the decision to encode the entire set has been made, and resistance is futile.
I’m not in the position to speak for the UTC, or even vote in the UTC. But, yes, I think “resistance” is not helpful.
So, it is very difficult to see what type of “thingies” would not be candidates for encoding in the future – assuming you have the clout/energy to encode them in conveniently available space in an encoding like Shift-JIS with plenty of room for user-defined characters (there is no information about actual usage contained in the emoji proposal, as best I can tell, just the fact that the images are encoded by several major companies).
And what’s the deal with the selection of flags in the current set of emojis? Japan, China, USA and a handful of other countries are represented, but where, may I ask, is Denmark? I can’t wait for the fun we will have when Taiwan’s flag comes up for encoding.
I have been (quietly) critical of previous proposals to encode what I considered frivolous characters like the interrobang (why would I need a new character to write the sequence ‘?!’?), but I have to admit that I agree with this sentiment from the Unicode list:
> You think features on Japanese cell phones are not subject to sudden
> swings of fashion?
Indeed. Considering how hard it is to get actually useful *writing* stuff encoded, I really feel sad about this.
Indeed. And how to justify the rejection of Klingon now?
Once you have stepped out onto the slippery slope it is going to be very difficult to get off, and the doomsayers who have been predicting that Unicode will run out of available code points to encode new characters will turn out to be right. Not because we have finally met intelligent aliens with new languages that require encoding, but because we have frittered them away encoding images. Sad.
The discussion continues on the mailing list, and my brief excerpts don’t do it justice. So, read the whole thing, as they say.
Those compound English nouns December 17, 2008
Posted by globalizer in Translation.
One of the more amusing examples of this problem: somebody is looking for a Handheld Software Engineer.
That’s gotta be one giant hand – or a really tiny engineer…
Use a little imagination, for crying out loud December 17, 2008
Posted by globalizer in Eclipse, Programming languages.
We can probably all agree that having software fill in sensible defaults for us is a good thing – it means that we are not forced to type in or browse for directories, file names, what not.
Having said that – in the world of software development it would also be nice if everybody didn’t just blindly accept all the defaults that are offered. One example is translation file names in the Eclipse development IDE. I have yet to see a plugin where the developer did not use the default file name plugin.properties. And while that is a fine name, once you have several hundred files with that same base name it becomes just a tad tedious to sit and squint at long, long file paths that are also almost identical to try to find the one file you need.
This is probably mostly a problem for software translators, since everybody else involved with software development tends to only deal with one or two different plugins at a time. But the poor translators get to translate hundreds of plugins, and thus juggle file lists with hundreds (or even thousands) of these files. So dear developers: have a heart, use your imagination, come up with a new file name once in a while!
That’s energy efficiency we can believe in… December 15, 2008
Posted by globalizer in IBM. 1 comment so far



