OpenTM2 – yes, I was overly optimistic June 30, 2010

Posted by globalizer in IBM, Translation.

As I hinted here, my initial enthusiasm about the OpenTM2 project was misplaced. The version currently available is more or less useless, since it supports very few file formats.

One of the major strengths of the internal IBM Translation Manager is its support for an incredible variety of file formats – with no pre- or post-processing needed.

That last point is the key, since the need to transform files, and the errors that result, is IMHO one of the main weaknesses of other TM tools.

File formats are supported via markup tables, and unfortunately only 4 file formats are supported as of now:

  • HTML
  • Plain text
  • Double quote files (translatable text contained in double quotes)
  • Single quote files (translatable text contained in single quotes)

And some of the included markup tables (Unicode encoding) do not actually work, since the required user exits are missing.
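To make the "double quote file" idea concrete, here is a minimal sketch of what such a format boils down to: everything between double quotes is treated as translatable, and everything else is left alone. This is an illustration only. It is not OpenTM2 code and does not use OpenTM2's markup table syntax; the function name, the regular expression and the sample input are all my own assumptions.

```python
import re

# Hypothetical illustration only: this is NOT OpenTM2 code and does not
# reflect its markup table syntax. It assumes that a "double quote file"
# simply marks translatable text with double quotes, and that a backslash
# escapes the character that follows it.
DOUBLE_QUOTED = re.compile(r'"((?:[^"\\]|\\.)*)"')


def extract_segments(text):
    """Return the text found between double quotes, in order of appearance."""
    return [match.group(1) for match in DOUBLE_QUOTED.finditer(text)]


# Made-up sample input: two message definitions and an unquoted comment.
sample = 'MSG_HELLO = "Hello, world"\nMSG_BYE = "Goodbye"  // not translatable\n'
segments = extract_segments(sample)
print(segments)
```

Anything more structured than this (nested tags, escaping rules that depend on context, binary containers) is where a simple pattern stops being enough.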

This problem would not be so bad if adding new markup tables were as easy as suggested by the Translators Reference Part 4 document (which, by the way, is an interesting title for a document containing mainly API references and code samples…):

You can create your own markup table by exporting an existing markup table in external SGML format, modifying it with any text editor, and importing it back into OpenTM2 under a different name… To become familiar with the content of markup tables you might want to export a markup table and study it before you create a new markup table.

Piece of cake, right? Just edit an SGML file and import it! Yes, if you only need to add some additional tags, etc. But the approach is basically useless if you want to add support for an entirely new file format, since the basic syntax needs to be defined in a user exit, coded in C. This was confirmed in one of the discussion threads on the support forum:

The latter one is coded in “C” and it requires special skills to develop it. So I suggest that you wait a little until the OpenTM2 documentation is fully available and I’m sure there is a section describing the development of markup tables in more details. The reason why I say this is that it is essential that OpenTM2 “understand” all important file types (OpenOffice, MS Office, XML, HTML, RTF, etc.). The actual set of markup tables is only a very basic one and does not support the most important files types on the market. So stay tune for more to come …. 😉

So the “free solution” for freelancers hoped for by many is not there yet – and may never materialize, since localization companies with the resources available for developing markup tables (if they are out there) may choose to then make them part of commercial products rather than just contribute them to the open source project (Eclipse model).

Comments»

1. Kirti Vashee - July 1, 2010

You could perhaps appeal to some in the TM2 alliance to see if they could rectify this.

Or perhaps comment on this blog: http://welocalize.blogspot.com/2010/06/welocalize-and-ibm-partner-on-open.html#comment-form

2. globalizer - July 1, 2010

Well, I suspect that posting to the support forum and opening defects against the project are in fact the most effective means of communicating the problems to the project; those would seem to be the official way of doing it. But I can certainly add a comment to the welocalize blog pointing to this. Thanks for the suggestion.

3. Helena Shih - July 2, 2010

Good discussion. In reality, the non-opensourced version of TM/2 supports more than 350 formats, and if that were just dumped out to the public domain with additional considerations, one would expect to see some negative feedback as well.

Practically, not all the formats are useful and quite frankly, they are not suitable for just putting out there without some rework. Also, the focus of this project is about open standards and open source. Therefore, my personal preference is to focus on getting the formats that are important to the open community out there first and not the proprietary commercial ones regardless of how popular they may be. However, that is something the steering committee should decide and not just for one or two individual organizations to put stuff out there willy nilly.

I am glad to see input coming in about the obvious deficiencies; we are already aware of them and plan to address some of them. I would be ecstatic if the community also chipped in and contributed, for the greater integration and interaction of all the disparate tools.

4. globalizer - July 2, 2010

Hi Helena,
I completely agree that a large portion of the internal IBM markup tables are irrelevant, and that it would serve no purpose to dump them into the open source version.
The current set is, however, way too limited to make the tool immediately useful, so I think it is urgent to get a small, core set of tables added – while the buzz from the announcement is still around. And I really don’t think that anybody outside of IBM is in a position to contribute those anytime soon, unfortunately.

As you can see from my initial reaction, I am a big fan of Translation Manager, so I think it would be a great pity if this initiative fizzles.

5. Helena Shih - July 2, 2010

There is already a discussion about getting support for XLIFF, HTML, XHTML and OpenDocument format out there. The Steering Committee would most likely approve it, so that should be good news to the community.

6. globalizer - July 2, 2010

Yes, excellent news, I agree. I would argue that support for Java files (including .properties) should be added to that list. Then we would have a good, core set that would enable many products to start using it immediately.

7. Helena Shih - July 2, 2010

Have you been tapping my phone line? Java property file support was supposed to be in the next wave, after the first ones are out.

globalizer - July 2, 2010

Heh 🙂

8. Alan - July 5, 2010

I was wondering if WordPerfect or Lotus would ever be supported. The Canadian government is still (unfortunately) mired in WP and Lotus docs. No one in the TM industry seems to support these two formats and it is frankly very frustrating. Does your internal version support them?

Trust me when I say (and I should know) that we need these formats to be supported by someone, somewhere.

Thanks

9. globalizer - July 5, 2010

Hmm, I would think it highly unlikely that anybody would be willing to invest resources in those two dinosaur formats. I believe that the internal IBM Translation Manager would have legacy support for the Lotus format, but not WP. Helena would know for sure.

10. A brighter side of opentm2 « Musings on software globalization - July 5, 2010

[…] Even the limited version currently available offers glimpses of the tool’s […]

GerhardF - July 6, 2010

In ref to WordPerfect: in IBM there is a markup table supporting WordPerfect 5.0 and 5.1. It’s VERY old and hasn’t been used for many years. A discussion needs to start within IBM about whether it can be added to OpenTM2.

In ref to Lotus: which LOTUS are you referring to? Lotus NOTES content or design documents? or Lotus SmartSuite formats?

11. globalizer - July 6, 2010

I took the Lotus reference to mean Lotus AmiPro, the word processing format; that was the basis for my comment above (I seem to remember a markup table for that).

GerhardF - July 6, 2010

Yep … there is a markup table for AmiPro V.2 & V.3, named EQFAMI … as for WordPerfect (EQFWP), a discussion within IBM is required on how these two markup tables can be “externalized” and used in OpenTM2.

12. Helena Shih - July 6, 2010

Let’s bring the discussion back to my original posting. We are interested in open standards and open formats. Less so on the proprietary formats and commercial offerings.

13. globalizer - July 6, 2010

I agree that it would be a waste of effort to do anything to support such old, proprietary formats.

And surely the Canadian government will come to their senses at some point?

14. tex - July 17, 2010

Good post Elsebeth!

Driving toward open standards and formats is a good direction, and ignoring older legacy formats makes sense, but there are some commercial proprietary formats that may be required to be a competitive industrial tool.

However, until a few of the key mainstays are supported (like XLIFF, HTML…) discussing anything else is a distraction and diversion of resources.

A (tentative) roadmap to the availability of planned formats would help everyone understand when the tools could be meaningfully considered for use.

15. Der Traum vom perfekten TM-System … - July 22, 2010

[…] Globalizer complains about the support for only a few file formats, and about the software manual, which is supposed to help translators but can practically only be understood by programmers. In this discussion one learns that exporting terminology to the universal TMX format does not work yet, even though this very aspect of formats not tied to a vendor is, besides cost, the biggest attraction of OpenTM2. People who do not use Windows are also out of luck, because other operating systems are not properly supported yet either. […]

