"Upgraded" segmentation rules

One of the biggest headaches...

... for translators in any translation environment tool is bad segmentation. This is particularly burdensome in projects where the wordworkers cannot join segments, so that sentences split apart by missing abbreviation exceptions or other issues stay split. Here are two examples of improved segmentation rules which have saved translators and project managers many hours of extra effort and which have helped to keep garbage translation units out of translation memory resources.

It is a very good idea to set improved segmentation rules such as these to be the default segmentation rules on your memoQ desktop installation or TMS server so that they are automatically assigned to new projects. This can also be done in memoQ project templates.

The ZIP file with German segmentation rules...

... also contains a text file for testing, based on many segmentation problems encountered during legal and financial translation work over the years. For convenience, the test file is also available here separately. Download it and test it against the segmentation rules you currently use for German, or against the memoQ default segmentation rules.

These English segmentation rules...

... are good, but not as fully optimized as they could be. For example, segmentation may fail at certain Article citations in this text:

The citation was from Amtsbl. M91. This is a test. memoQ rocks. § 61 of the law applies. That is governed by Art. A5 of the bylaws, not Art. 7 of the EU Directive. This is all that (Mr. James, etc.) MY attorneys had to say.

Compare these segmentation rules to your own, or to the memoQ defaults. In the part of this course that focuses on segmentation rules, I will point out where these rules and the memoQ default rules may be prone to segmentation failures and how we might address them.
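
To see why abbreviation exceptions matter for the sample text above, here is a minimal Python sketch. It is not memoQ's rule format or its segmentation engine. The break pattern, the abbreviation list and the function names are all assumptions invented for this illustration. It compares a naive "break after every sentence-ending mark" rule with the same rule plus a small exception list.

```python
import re

SAMPLE = (
    "The citation was from Amtsbl. M91. This is a test. memoQ rocks. "
    "§ 61 of the law applies. That is governed by Art. A5 of the bylaws, "
    "not Art. 7 of the EU Directive. This is all that (Mr. James, etc.) "
    "MY attorneys had to say."
)

# Hypothetical exception list; a real rule set would need many more entries.
ABBREVIATIONS = {"Amtsbl.", "Art.", "Mr.", "etc."}

# Candidate break: whitespace that directly follows ., ! or ?
BREAK = re.compile(r"(?<=[.!?])\s+")


def naive_segments(text):
    """Break at every candidate position, with no exceptions."""
    return BREAK.split(text.strip())


def segments_with_exceptions(text, abbreviations=ABBREVIATIONS):
    """Break at candidate positions unless the preceding word is a listed abbreviation."""
    text = text.strip()
    segments = []
    start = 0
    for match in BREAK.finditer(text):
        candidate = text[start:match.start()]
        # Drop opening brackets and quotes so "(Mr." is compared as "Mr."
        last_word = candidate.split()[-1].lstrip("([\"'")
        if last_word in abbreviations:
            continue  # suppress the break and keep extending the segment
        segments.append(candidate)
        start = match.end()
    if start < len(text):
        segments.append(text[start:])
    return segments


if __name__ == "__main__":
    print(len(naive_segments(SAMPLE)), "segments without exceptions")
    for segment in segments_with_exceptions(SAMPLE):
        print("|", segment)
```

On the sample text, the naive rule produces ten pieces, including fragments such as "A5 of the bylaws, not Art.", while the version with exceptions produces six complete sentences. Note that neither version requires a capital letter after the break, which is why "memoQ rocks." still starts its own segment. A rule that insists on an uppercase letter there would miss it.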

And about those language variants...

... you may experience some issues trying to import the segmentation rules on this page into some of the variants of German or English that you work with. If you don't know how to fix that problem, later lessons in this module will show you how.
