Toffees for the future


Of all the things I once did and, at a certain point, had to stop doing, teaching is definitely the one I miss most. After a long career, one ends up with sturdily molded thoughts. But being confronted with a student or two who, although thirty years younger, are intelligent, primed, and unafraid to express their ideas (even when these do not match the teacher’s opinions, and yet are absolutely reasonable and accurate) is like receiving an injection of lifeblood.

Discovering that one of your favorite authors shares your feeling and has put it into words, plainly and effectively as usual, has the same effect. One would hardly admit, though, that it could become addictive.

The addiction consists in constantly trying to bring in food for thought and spur such arguments.

In this effort, early in January 2010, I invited Renato Beninatto to give a lecture on new skills, roles and job opportunities to the students and colleagues of the faculty of translation and interpreting at the university where I was teaching.


Renato ended his lecture with a few predictions:

  • CAT tools will become free or irrelevant before 2015;
  • Only small projects will be managed individually;
  • Large projects will only be collaborative with many translators on the same file at the same time;
  • Productivity will be measured in tens of thousands of words per day;
  • Rates per word will drop, but revenues will remain the same or increase;
  • “Old” skills (grammar, syntax, vocabulary) will be crucial again;
  • Translation will be more and more creative.

According to Renato, new roles would emerge, from developers of reference material to supervisors, from monolingual translators to post-editors of machine translation, and more.

At lunch, after the lecture, we had the chance to resume these topics, and Renato, as usual, clarified his thoughts.

In his view, translation memories bring value only to those who know how to use them, certainly not to most LSPs or customers. Also, this value is a function of their usefulness. Renato brought in the example of coupons that can only be used for purchasing goods that you do not need and are in fact meant to build customer loyalty.

As far as I can remember, Renato has always maintained, at least in words, that revisions are useless, claiming that they are used as a pretext for unnecessary changes and are only a source of new errors. Most errors are, in fact, due to non-compliance with the style guide and the glossary; revisers should verify their application and resolve any deviations, rather than leave their own mark.

What I expected from Renato’s lecture, especially for the benefit of students, was to question some of the cornerstones of professional practice and, even more so, of academic training. In my view, for example, consistency (and its corollary of using as few translators as possible) is a false problem. Even in literary translation, consistency is in fact impossible. No one is able to read a text of several pages and notice any stylistic differences within it: memory spans are too short to allow this; at most, one might see some deviation in terminology. If this is difficult in a literary text, it’s even more so in a technical text of a few tens or hundreds of thousands of words. Even from a purely economic point of view it is nonsense: translating 250,000 words would take a single translator more than 3 months and remuneration would be scarce, both in amount and timeliness. Also, for the sake of consistency and speed, MT is always the better choice.

A major issue is in fact with translation education as, for example, teaching is based on samples of texts that are insignificant in both size and type. And teaching spans from actual translation to assessment, from text analysis and mining to quality assessment and revision.
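The arithmetic behind the three-month figure can be checked with a rough sketch. It assumes the canonical two thousand words per day cited further on in this article and, as a purely illustrative assumption, 21 working days per month; the function names are hypothetical.

```python
import math

def working_days(total_words: int, words_per_day: int) -> int:
    """Working days a single translator needs, rounded up to whole days."""
    return math.ceil(total_words / words_per_day)

def months_needed(total_words: int, words_per_day: int,
                  working_days_per_month: int = 21) -> float:
    """Calendar months, assuming a fixed number of working days per month."""
    return working_days(total_words, words_per_day) / working_days_per_month

if __name__ == "__main__":
    days = working_days(250_000, 2_000)
    months = months_needed(250_000, 2_000)
    print(f"{days} working days, about {months:.1f} months")
```

Under these assumptions a single translator would need well over three months, which is the point: the constraint is economic before it is stylistic.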

Renato is well-known in the industry for his “quality-is-irrelevant” mantra, and yet, despite being widely quoted, this mantra remains unapplied.

Most of his predictions listed above have not been fulfilled (yet). Almost ten years later, translation memories are still pivotal, and CAT tools are so essential that they have become translation platforms, both in the cloud and on the desktop. Translation coordinators (whom we still pompously call project managers, although most of them can’t even read a Gantt chart, let alone generate one) dodge and scrounge with “projects” of any size, while translators still work mostly in solitude on the same old stuff as ever.

It is true that productivity is now expected to be way over the canonical two thousand words per day, even though this is mostly due to the impact of machine translation on standard practice. Another effect is the drop in rates, although revenues are generally the same and way below the increase of volumes.

We may keep telling ourselves stories, but translation has not become more creative. The causes are more aggressive pressures on prices and deadlines and the increasingly harsh perception of translation as a trivial task. Careful, though: this is definitely not just another effect of machine translation. On the contrary, the availability of free online machine translation and the ubiquity of translation have helped raise awareness of its importance but, at the same time, they have opened and progressively widened a gap between expectations and reality that becomes harder to close every day.

The translation community, from academic institutions to industry players, has chosen to cultivate the illusion of having found necessity and sufficiency in itself. This is why it is still trapped in the translation quality hoax and in a permanent hype inflation.

The translation quality hoax

To customers, a translation is a product, a fungible good, a commodity, and it is expected to accomplish a function, support a task, transfer knowledge, etc. This is exactly where information asymmetry comes into play.

Quality, in manufacturing, can be objectively measured in end products. Flawed items are discarded and the percentage of defective items in a lot gives a clear figure of quality. In manufacturing, though, design and requirements are central and meeting both in production is essential. The quality of a product is the outcome of the production process, including the choice of materials and the means of production. As consumers, we have all learnt to tell the quality of a product from its defectiveness and to be more or less tolerant, depending on price. A product should first be fit for purpose, then its price should at least appear congruous, and finally it should be adequately supported. Anyway, zero defects is a basic condition for any consumer to buy from the same manufacturer again.

Unfortunately, translation quality cannot be assessed with equal objectivity. “Modern” translation quality assessment hinges on accuracy, fluency and adequacy, which are largely open to subjective interpretation. While accuracy is a function of the number of errors (defects), adequacy loosely expresses similarity of meaning, and fluency grades the translator’s own language skills. In the end, there is no single method to objectively evaluate a translation; at least fair competence in the language pair is required, together with specific abilities.

This results in the unavoidable necessity of placing a lot of trust in translators, especially in those working from source languages customers cannot be expected to know. At the same time, this creates the illusion, in insiders, that it is quite easy to distinguish between a good and a bad translation. This also explains, at least partially, the obsession with quality in the translation community, and how it has become one of the two most debated topics (the other being rates) and the unique selling proposition of the entire industry, to the point of becoming irrelevant.

As a matter of fact, no customer has the slightest guarantee of receiving a well-made translation, and often not even the chance of being reimbursed for a defective product or having it replaced. No, guarantee is definitely an unknown concept to industry players. And when translators are advised to sign their translations, they should also be informed that they implicitly underwrite a liability in doing so: they agree to be held accountable for it. In toto.

The permanent hype inflation

Translation memories have been at the center of the TEP process for at least 25 years now, and CAT tools are still the favorite tools of the trade today. Therefore, they are anything but irrelevant. OmegaT is free, and WordFast Anywhere, MateCAT, SmartCAT, and Google Translator Toolkit are free too, even though these last four all require a permanently active Internet connection.

However, most professionals still prefer to spend a fortune to buy and maintain their tools of the trade. In fact, although technology is making it easier and easier, translation is still labor-intensive, often monotonous, and error-prone; professional translation requires specific equipment and ability.

It is true that the translation industry is far from being a hi-tech sector, and yet denying the ubiquity and necessity of technology is as stupid and self-defeating as inflating hypes. In both cases, narcissism drives some pundits at conferences to try to always look as if they were on the edge of tomorrow.

For example, now it is the turn of blockchain technology. Some advocate the implementation and use of translation tracking systems based on blockchain technology to identify the origin of a translation and who performed it, and to view the entire quality and review process performed prior to delivery.

Leaving trivial technical issues apart for a moment (such as mining and accessing blockchains), what’s the use of knowing who performed a translation task?

As to technical issues, to take advantage of blockchain, a mechanism called mining is necessary to secure the system and enable decentralized security. In practice, any new transaction must first be validated to be recorded on the blockchain. Validation implies solving a difficult mathematical problem based on a cryptographic hash algorithm. Only when a transaction (block) is validated (solved) is it confirmed, and the validator rewarded. The computational power required to solve a block, thus allowing its recording in the blockchain, is huge and the associated computational resources are hardly attainable other than through computer farms. Also, the value of validations decreases as the number of recorded blocks increases. Finally, a major problem persists in making sure of the identity of all the subjects in a transaction.

To date, blockchain technology has been or is being implemented, for example, by Spotify to connect artists and licensing agreements with the tracks on its service, by BitCar to make fractional ownership of collectable exotic cars possible (a typical example of smart contract), or by De Beers to trace diamonds from the mine to the customer purchase.

Again, what’s the use in knowing who translated a piece of content when you don’t have the ability, tools, and resources to assess its quality? If a blockchain is used to trace the quality controls performed, the stages a translation has passed through before delivery, and the results of each stage, this means that the inquirer is interested not in the product, the translation itself, but in the process. In this case, blockchains can be useful to get rid, at least in theory, of certification audits and surveillance visits of quality management systems.

In any case, since transparency and openness are key, granting physical access to view blocks would require an open, permission-less, or public blockchain.

Therefore, since implementing a blockchain as well as granting access to it is no laughing matter, boasting of doing so, or just declaring a willingness to do so, is a brash and unnecessarily expensive marketing initiative that could prove self-defeating with prospects and customers who are tech savvy enough to be healthily skeptical.

Water, water, everywhere | But not a drop to drink

Oddly enough, many translation industry players have always blamed technology for replacing services, products and habits with others of lower quality, impoverished and/or simplified, while knowing that only human laziness should be blamed for unsatisfactory quality. Of course, this is consistent with the perennial, grueling and inconclusive debate on quality, used as a magic word that instantly explains everything and forbids further questioning. And yet, the effects of human laziness, sloppiness, helplessness, and ineptness can be seen affecting language data every day. In fact, a common, relentless complaint has been pervading the industry since the appearance of translation memories on the market: most are flawed, messy, outdated, unreliable.

Why, then, do most LSPs still consider translation memories assets? And why should knowing who performed a translation be important when s/he might extensively use poor data?

However, if, as many seem to think, translation quality is at the top of the list of concerns when it comes to translation, it is precisely because of the trust placed in translation providers. Most customers are scared by the number of reworkings a translation could undergo before meeting the criteria of their local agents.

Technology is then thought to be the perfect scapegoat, although the new catchphrase for the whole industry might soon become “be smart.” Hopefully not as smart as some “smart” lock, though, if an app can be written in a very short time that unlocks any branded “smart” lock in two seconds. How so? Maybe because the unlock code is the same for all authorized users of the lock and is derived from the MAC address of the device. Or maybe because the Bluetooth traffic between the lock and the app is in the clear. And maybe things can get even worse if the back of the “smart” lock can simply be removed and dismantled with a screwdriver to open the shackle.

There must be a way, then, to test and expose the stupidity of labeling as “artificial intelligence” any new device with vaguely human-defeating features.

The power of data

Translation is fertile ground for machine learning (ML): machine translation is just one application. Two other interesting areas of application for ML in translation are vendor management and quality management.

Vendor management involves requesting bids, vetting many mixed resources, and scrutinizing and comparing different offers.

Vendor management requires specific skills and varied abilities, with the end goal of selecting the vendor that consistently provides the highest quality for the best price at the right time.

With the right set of historical data, this task can now be largely automated.

To know more about the applications of ML to translation quality, take a look at Smartling’s QCS. And it can still largely be improved.

Unfortunately, most LSPs still have problems with data. Not only do these affect language data, which is largely flawed, outdated, messy, and thus unreliable, but LSPs also have problems with vendor data, as vendors are rarely vetted consistently and records are rarely updated regularly. LSPs also have problems with project data, which is supposed to be cross-referenced with vendor data for the effective automation of vendor management.

The condition of disuse, disrepair and complete lack of organization of data is a major impediment to the ideal exploitation of ML, although everybody in the industry claims to be well aware of its value and to consider data a core asset.

The value of data increases by the day, and it can offer a competitive advantage to translation businesses, which should start thinking about and implementing stringent ML strategies.

Vendor management is still a labor-intensive human task. Optimizing scrutiny will make vendor rating easier and more frequent, thus more reliable for project managers. To assess vendor performance compliance, metrics must be based on project data, while actionable analytics empower businesses to make strategic decisions.

For example, due to constraints and pressures, when selecting resources for a project, vendor managers often tend to go for those who, in their own experience, have proved most available and reliable, possibly those they know best, rather than the best suited.

As we head into the world of data-driven efficiencies, getting the right data, and getting this data right, is of strategic value. Project- and task-related data provides information that can be used for vendor discovery, identification and management.

Why, then, is good data still overlooked? Because we are unwilling to spend much of our time filling out forms.
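As a minimal sketch of what rating vendors on project data could look like, the snippet below aggregates past project records into a single score per vendor. It is a plain weighted average, not machine learning, and merely stands in for the kind of clean historical data an ML model would need. The record fields, the weights, and the price ceiling are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean
from typing import Dict, List, Tuple

@dataclass
class ProjectRecord:
    vendor: str
    on_time: bool         # was the job delivered by the deadline?
    quality: float        # review score normalized to 0..1 (assumed scale)
    rate_per_word: float  # price actually paid for the job

def rate_vendors(records: List[ProjectRecord], max_rate: float,
                 weights: Tuple[float, float, float] = (0.5, 0.3, 0.2)
                 ) -> List[Tuple[str, float]]:
    """Rank vendors by a weighted mix of quality, punctuality and price."""
    by_vendor: Dict[str, List[ProjectRecord]] = defaultdict(list)
    for record in records:
        by_vendor[record.vendor].append(record)

    w_quality, w_time, w_price = weights
    scores = {}
    for vendor, jobs in by_vendor.items():
        quality = mean(j.quality for j in jobs)
        punctuality = mean(1.0 if j.on_time else 0.0 for j in jobs)
        # Cheaper is better: 1.0 means free, 0.0 means at or above max_rate.
        price = 1.0 - min(mean(j.rate_per_word for j in jobs) / max_rate, 1.0)
        scores[vendor] = w_quality * quality + w_time * punctuality + w_price * price

    # Highest score first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

A ranking of this kind is only as good as the records behind it: if deadlines, review scores and rates are not logged consistently for every job, the score merely formalizes the vendor manager’s existing habits.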

Won’t get fooled again—always be skeptical

The next time you hear someone praising the wonders of some new technological wizardry of which you know little or nothing (maybe during a presentation at an industry event), do verify that his/her quotes are correctly attributed: Einstein or Ford or Gandhi might not have said that, the speaker might not have done his/her homework, and s/he might know very little about the topic.


Author: Luigi Muzii
