Never Say Never Again

Mei-Lei: Can I do anything for you, Mr. Bond?
James Bond: Uh, just a drink. A martini, shaken, not stirred.
James Bond: [after losing 10 million in the game] Vodka-martini.
Bartender: Shaken or stirred?
James Bond: Do I look like I give a damn?

Thunderball/Never Say Never Again

Machine translation (MT) has been around for 65 years, and translation memories (TM) made their appearance more than 20 years ago.

Nevertheless, interest in MT continues to grow, with more and more LSPs and translation users exploring its potential, while TMs seem to be bogged down in the very same issues that have affected them since the beginning.

The main hindrance to an even faster spread of MT is still its intrinsic complexity, in spite of its commodification with Google Translate.

For MT to be used successfully, expertise and knowledge are needed. For a basic yet profitable use of TMs, by contrast, no specific proficiency is necessary.

It took 10 to 15 years for TMs to be listed in academic programs in response to the urgent demands of LSPs for skilled staff. This belated awakening betrays an actual lack of interest among academics, so much so that translation technology courses are usually contracted out to practitioners rather than taught by tenured professors.

Most of these programs typically focus on basic training centered around one specific software tool, leaving any further development to the good will of the would-be professionals. This is probably why translators are, at best, sceptical of MT, and apathetic yet loyal adopters of TM tools.

In the near future the shortage of translators will continue, as working conditions are not getting better, thus driving out the best resources. At the same time, translator productivity will keep stagnating, as the traditional approach and model do not allow for any boom, even with the smartest tools.

Since there will be more and more content to translate, MT will take over much of the traditional market; sooner than expected translators will find no more empty segments in their translation tools.

However, MT is suitable only for certain types of text, and it’s profitable only for large companies with a high volume of similar material to translate. Moreover, to be profitable, MT engines must be customized. Customization requires a combination of very advanced technical know-how and linguistic skills.

Many MT pundits admit that good MT systems generally produce output that is very much like high fuzzy matches (FMs) when a TM is applied: the better the system, the higher the match percentage in FMs.

A recent study shows that time for PEMT corresponds to that for editing 85-94% fuzzy-matches.

In most translation courses, even where TMs are part of the program, no guidance is usually given on translation pricing, let alone on pricing FMs. It will come as no surprise, then, that graduates find themselves unprepared to respond to requests for proposals for PEMT (Post-Editing of Machine Translation) jobs.

Discount Schemes

Quoting MT pundit Kirti Vashee: “An issue that continues to be a source of great confusion and dissatisfaction in the translation industry is related to the determination of the appropriate compensation rate for post-editing work. Much of the dissatisfaction with MT is related to this being done badly or unfairly. […] While it took many years for TM compensation rates to reach general consensus within the industry, there is some consensus today on how translation memory fuzzy match rates relate to the compensation rate.”

Much of the hostility towards PEMT amongst professional translators (i.e. post-editing candidates) stems from being asked to edit poor output for much lower rates. But this is exactly the same as for FMs in TMs.

Not surprisingly, a frequently asked question among translation buyers is “Why do we need to pay for 100% matches? Shouldn’t these sentences be free since they have already been translated?”

This was the main driver for the introduction of TMs some two decades ago: the ‘weighted word count’ concept, intended to save costs. The same concept definitely applies to any match with a translation memory.

Another question, especially among TM novices, is: “How many segments in a TM are needed before the translation process can become profitable?”

In theory, any full match in a TM could be used exactly as is. A 99% match might differ only in a single letter or punctuation mark, whereas a 75% match might have several different words. Both kinds of FM, though, need some sort of editing.

Generally, matches below the 70% mark are not useful, and should therefore be retranslated.
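As a rough illustration of how a match percentage relates to the amount of editing, the sketch below scores the similarity of two segments with Python's standard difflib module. Commercial TM tools use their own matching algorithms, so this is not how any particular tool computes its percentages. The function names, the sample sentences and the default 70% cutoff are assumptions made for illustration only.

```python
from difflib import SequenceMatcher


def fuzzy_match_percent(new_segment: str, tm_segment: str) -> float:
    """Return a character-level similarity score between 0 and 100."""
    return 100.0 * SequenceMatcher(None, new_segment, tm_segment).ratio()


def is_reusable(new_segment: str, tm_segment: str, threshold: float = 70.0) -> bool:
    """Apply the rule of thumb that matches below the threshold are retranslated."""
    return fuzzy_match_percent(new_segment, tm_segment) >= threshold


# Hypothetical segments that differ only in the final punctuation mark.
score = fuzzy_match_percent("Press the red button.", "Press the red button!")
```

With this measure, a segment that differs from the TM entry by a single punctuation mark scores in the high nineties, while unrelated segments fall well below the cutoff.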

Most translation vendors will charge a nominal fee for reviewing 100% matches, and a percentage of the full rate, roughly as follows:

TM Match      Requester    MLV          SLV
No match      Full rate    Full rate    Full rate
75-84%        60%          40%          50%
95-99%        40%          25%
Full match    20%          10%          0-10%
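A scheme like the one above is usually applied as a weighted word count. The following is a minimal sketch under stated assumptions: the rates come from the Requester column, the 40% figure for the 95-99% band assumes that the first value listed for that row belongs to that column, and the word counts of the job are hypothetical.

```python
# Percentages of the full rate per match band, taken from the "Requester"
# column of the scheme above. Assigning 40% to the 95-99% band assumes the
# first value listed for that row belongs to this column.
REQUESTER_RATES = {
    "no match": 100,
    "75-84%": 60,
    "95-99%": 40,
    "full match": 20,
}


def weighted_word_count(words_per_band: dict[str, int], rates: dict[str, int]) -> float:
    """Sum the words in each band, scaled by that band's percentage of the full rate."""
    return sum(words * rates[band] / 100 for band, words in words_per_band.items())


# Hypothetical analysis of a 2,000-word job.
job = {"no match": 1000, "75-84%": 500, "95-99%": 200, "full match": 300}
payable_words = weighted_word_count(job, REQUESTER_RATES)
```

For this hypothetical job, 2,000 source words are invoiced as 1,440 weighted words (1,000 + 300 + 80 + 60).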

With an average productivity of 2,684 words a day, editing even a 99% FM will be worth 8 words and paid as 2. Is the game worth the candle? Only if FMs of 85% or lower make up less than 25% of the total.

The ‘new frontier’ in FMs is offering discounts on sub-segment matches, i.e. where the system compares chunks of segments instead of whole segments. This is based on the assumption that humans can readily identify matches in parts of speech, such as expressions and conventional phrases, which make up a significant part of writing.

Surprisingly, a huge number of translators refuse PEMT jobs, while possibly yielding to exploitative FM pricing schemes or volume discounts.

Nevertheless, many of these professionals admit that editing even average FMs takes at least as long as translating the segment from scratch: discounts should then be based on the effort needed to edit these FMs.

Therefore, the only TM discount to offer should be on full matches, provided that the client is confident enough in the reliability of the TM to accept them with no editing or proofreading.

MT Output for PEMT

According to Common Sense Advisory surveys, PEMT is paid 61% of the human translation price.

Since matches below 70% are comparable to poor raw MT output, and with the above discount schemes in mind, PEMT compensation should be linked to the effort needed to ‘fix’ MT errors, as ‘fixing’ (i.e. ex post remedy) remains the prevalent approach, rather than doing it right the first time (i.e. prevention). This is especially true when considering that MT cannot perform at its best in every scenario.

MT can be really effective only if the input provided is of sufficient quality, thus allowing for convenient post-editing work and enhancing productivity. PEMT effort should then be assessed according to the changes made to the MT output. Some preliminary measurement is therefore necessary. A single, flat-rate pricing approach can prove nonsensical.

BLEU and NIST are methods for the automatic evaluation of MT. The BLEU metric ranges from 0 to 1, and is generally misunderstood. A translation can attain a score of 1 only if it is identical to a reference translation, with exactly the same phrasing and vocabulary. For this reason, even a human translation will not necessarily score 1. In fact, competent human translations score no higher than 0.7.
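To see why only a translation identical to the reference reaches 1, consider this simplified sentence-level sketch of a BLEU-style score. It assumes a single reference, whitespace tokenization, n-grams up to length 4 and no smoothing. Real BLEU implementations work at corpus level and differ in several details, so the function simple_bleu is a teaching aid, not a replacement for a proper evaluation tool.

```python
import math
from collections import Counter


def ngrams(tokens: list[str], n: int) -> Counter:
    """Count the n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def simple_bleu(candidate: str, reference: str, max_n: int = 4) -> float:
    """Geometric mean of clipped n-gram precisions times a brevity penalty."""
    cand = candidate.split()
    ref = reference.split()
    if not cand:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngrams(cand, n)
        ref_counts = ngrams(ref, n)
        total = sum(cand_counts.values())
        # Clip each candidate n-gram count by its count in the reference.
        matched = sum(min(count, ref_counts[gram]) for gram, count in cand_counts.items())
        if total == 0 or matched == 0:
            return 0.0  # without smoothing, one empty precision zeroes the score
        log_precisions.append(math.log(matched / total))
    # Candidates shorter than the reference are penalized.
    brevity_penalty = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / len(cand))
    return brevity_penalty * math.exp(sum(log_precisions) / max_n)


identical = simple_bleu("the cat sat on the mat", "the cat sat on the mat")
paraphrase = simple_bleu("the cat is sitting on the mat", "the cat sat on the mat")
```

Here the identical sentence scores 1, while the perfectly acceptable paraphrase scores lower simply because its wording departs from the reference.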

MT can ensure consistent terminology and phraseology, and help simplify the translation process. It could also produce huge cost savings, although only on huge volumes and with appropriate customization.

MT effectiveness could be greatly improved by using controlled languages (CLs), which make texts clearer and more consistent in syntax and terminology, and thus also have a positive effect on language data, helping to keep it clean and reusable. In this respect, leveraging a TM could prove even more profitable, progressively reducing PEMT effort.

Sharon O’Brien dealt with the measurement of the PEMT effort, citing Hans Peter Krings’ empirical investigations, as extensive and comprehensive as Brian Mossop’s Revising and Editing for Translators. On the other hand, no other empirical/statistical study has yet been produced on revising translations, let alone on PEMT.

Krings suggested a methodological approach to measure PEMT based on the time taken to post-edit a sentence to a particular level of quality, on the technical effort required for deletions, insertions and text re-ordering, and on the extent and type of cognitive processes that must be activated to remedy a deficiency in MT output.
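The technical component of this effort can be approximated by counting word-level edits between the raw MT output and its post-edited version. The sketch below uses a plain Levenshtein distance over words. This is a proxy chosen here for illustration, not Krings' own procedure: it treats re-ordering as a combination of deletions and insertions, and it says nothing about time or cognitive effort. The function names are hypothetical.

```python
def word_edit_distance(mt_output: str, post_edited: str) -> int:
    """Minimum number of word insertions, deletions and substitutions."""
    source = mt_output.split()
    target = post_edited.split()
    # previous[j] holds the distance between the processed source prefix
    # and the first j words of the target.
    previous = list(range(len(target) + 1))
    for i, src_word in enumerate(source, start=1):
        current = [i]
        for j, tgt_word in enumerate(target, start=1):
            cost = 0 if src_word == tgt_word else 1
            current.append(min(
                previous[j] + 1,         # delete a word from the MT output
                current[j - 1] + 1,      # insert a word into the MT output
                previous[j - 1] + cost,  # keep or substitute a word
            ))
        previous = current
    return previous[-1]


def technical_effort(mt_output: str, post_edited: str) -> float:
    """Edits per word, relative to the longer of the two versions."""
    length = max(len(mt_output.split()), len(post_edited.split()))
    return word_edit_distance(mt_output, post_edited) / length if length else 0.0
```

A ratio of this kind could feed a pricing scheme based on actual changes rather than on a flat rate, provided it is measured on a representative sample first.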

However, these parameters remain purely theoretical and, most of all, subjective, with no reference to the suitability of a particular document for MT (i.e. translatability[*]) or to the quality of the MT output.

Studies show a strong correlation between complexity and ambiguity scores and technical post-editing effort. A correlation exists between the average sentence length, the number of predicates per sentence and the quality of the MT output, and not only in old-fashioned yet still valid RbMT engines. Not surprisingly, translatability is a major goal of CLs.

Scientists have investigated the difference between a martini shaken and a martini stirred. The Department of Biochemistry at the University of Western Ontario in Canada conducted a study to determine whether the preparation of a martini has an influence on its antioxidant capacity.

Exploring how translators use technology and linguistic data is crucial for any further development in translation technology and training.

Ana Guerberof’s doctoral thesis is the first comprehensive study on productivity with FMs. Traductologues, scienziati traduttologi, Übersetzungswissenschaftler, estudiosos de la traducción, estudiosos de tradução, translation scholars, where are you? Quousque tandem are you going to leave these subjects to researchers and scholars in fields other than yours? Is invisibility still your only concern?

[*] For more information on TIs, see