Surrender

Are the results of neural machine translation (NMT) usable in the same way as those of statistical machine translation (SMT)?

Following a typical machine-learning approach, both methods are probabilistic, although the decoders—the ‘translation engines’—work differently.

In SMT, the decoder is essentially a search algorithm. For each word or group of words in a new source sentence, it looks up the phrase table built from the words and groups of words in the training data, and extracts the matching occurrences together with their probability of equivalence. The best fit is the one with the highest probability score.
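
The lookup described above can be sketched in a few lines. This is a toy illustration only: the phrase table, phrases and probabilities are invented, and a real SMT decoder also combines language-model and reordering scores rather than picking on equivalence probability alone.

```python
# Toy phrase table: source phrase -> list of (target phrase, probability).
# All entries are illustrative assumptions, not real training data.
phrase_table = {
    "the house": [("la casa", 0.72), ("la maison", 0.05)],
    "is small": [("es pequena", 0.64), ("esta pequena", 0.11)],
}

def best_translation(phrase):
    """Return the target phrase with the highest equivalence probability."""
    candidates = phrase_table.get(phrase, [])
    if not candidates:
        return None  # phrase unseen in training data
    return max(candidates, key=lambda pair: pair[1])[0]

print(best_translation("the house"))  # prints "la casa"
```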

In NMT, the decoder extracts word sequences from parallel data, converts them into mathematical representations, and labels these sequences with probability scores. The decoder then tries to translate full sentences by constructing a new sequence horizontally, from beginning to end, with the sequence of previous words determining each next word. Finally, the decoder converts the resulting mathematical representation into words in the actual ‘translation’ stage.
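
That left-to-right construction, where each next word is chosen given the words generated so far, can be sketched as a greedy decoder. The probability tables below are made-up placeholders; in a real NMT system a neural network produces these distributions.

```python
# Hypothetical next-word distributions, keyed by the sequence so far.
# In real NMT these come from a neural network, not a lookup table.
next_word_probs = {
    (): {"la": 0.9, "el": 0.1},
    ("la",): {"casa": 0.8, "cosa": 0.2},
    ("la", "casa"): {"<eos>": 0.95, "es": 0.05},
}

def greedy_decode(max_len=10):
    """Build a sequence word by word, always taking the most probable
    continuation, until end-of-sentence or max_len is reached."""
    sequence = []
    for _ in range(max_len):
        dist = next_word_probs.get(tuple(sequence), {})
        if not dist:
            break
        word = max(dist, key=dist.get)  # highest-probability next word
        if word == "<eos>":
            break
        sequence.append(word)
    return " ".join(sequence)

print(greedy_decode())  # prints "la casa"
```

Real decoders typically use beam search rather than this purely greedy choice, keeping several candidate sequences alive at once; the principle of conditioning each step on the previous words is the same.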

SMT and NMT both try to find patterns in training data, and then to solve a puzzle by constructing a design with the available pieces that is as close as possible to those patterns. The main difference between the two methods lies in the way they look at data and in the length of the sequences. Simply put, SMT outputs partial data, NMT whole sequences. For all these reasons, NMT requires much larger and better training data than SMT, and even larger than any TM.

Labor Limae

For the reasons above, translators working on suggestions from an MT engine do not post-edit; rather, they still translate. In fact, they must first accept or reject these suggestions, and, to do so, they compare them with a tentative translation in their mind.

In this respect, editing plays a major role, at least when MT suggestions are not good enough to be a viable alternative to human translation. The process gets closer to post-editing when MT suggestions are so good that the translator only needs to amend them slightly, where necessary, to make them acceptable.

As a matter of fact, real (heavy) editing involves the insightful alteration of the structures of the MT output, and this may collide with the rationale of MT usage. Editing proper, on the other hand, is essentially labor limae, filing and sanding.

The review by monolingual SMEs of full-MT texts is post-editing, when almost no contrastive analysis is necessary; indeed, in this case, post-editors might not even have to look at the source text, provided they are bilingual, and the need for re-translation is only occasional.

Post-editing in Times of NMT

The improvements in fluency may prove dramatically misleading when it comes to assessing an NMT output, especially as regards accuracy and adequacy, even though these too have improved.

Also, despite the different approaches of SMT and NMT to the source text, estimates of the extent of post-editing of MT output are increasingly made beforehand, where possible at the document level, using predictive algorithms, while all changes are made at the sentence level. It could not be otherwise, though, since ‘quality’ evaluation is made at the sentence level, based solely on abstruse, old-fashioned models revolving around the type, amount and severity of errors.
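
The kind of sentence-level, error-based scoring mentioned above can be reduced to a weighted sum over error annotations. The categories and weights below are illustrative assumptions, loosely in the spirit of MQM-style typologies, not any real standard's values.

```python
# Hypothetical severity weights for an error-based sentence score.
# These numbers are assumptions for illustration only.
SEVERITY_WEIGHTS = {"minor": 1, "major": 5, "critical": 10}

def sentence_penalty(errors):
    """Sum severity weights over the (category, severity) errors
    annotated in a single sentence."""
    return sum(SEVERITY_WEIGHTS[severity] for _category, severity in errors)

annotated = [("terminology", "minor"), ("accuracy", "major")]
print(sentence_penalty(annotated))  # prints 6
```

A model of this shape evaluates each sentence in isolation, which is precisely why such evaluation cannot capture document-level effects.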

Therefore, even with strict guidelines, subjectivity still plays a major role, while post-editing should only focus on making MT output effective by meeting three essential prerequisites:

  1. Speed (i.e. being significantly faster than human translation);
  2. Ease (i.e. lessening typing to reduce the likelihood of introducing new errors);
  3. Ability (i.e. reducing search and validation).

Nonsense and the Rationale for MT

The fundamental reasons for using MT are productivity, speed and consistency. All of these translate into money, which still rules, as usual, in business.

How does consistency translate into money? The savings from always using the same words and the same style can be huge. Just think of the reasons behind the development, adoption and maintenance of controlled languages like ASD-STE100. By the way, the translation demand from aerospace industry manufacturers, as well as from many other hi-tech multinational corporations, is still significant, even though they would not have to translate, at least not by regulation. In many of these companies, being fluent in English is a conditio sine qua non to get a job.

Effectiveness has never been a concern of academic researchers, especially when they are not involved in applied research. This is exactly the case of translation departments at universities, and it comes as no surprise that many studies may look like onanism, even without considering their poor scientific premises. And yet such premises might be nothing less than essential when one means to prove the deterioration of a language through MT-related practices.

It also comes as no surprise that a totally unreliable industry news business emphasizes such nonsense, as it regularly passes off as news the outcomes of anonymous ‘surveys’ of less than 0.5% of its subscribers, who constitute less than 2% of the entire population of the industry. It is surprising that a major academic body gives space to such nonsense within the official schedule of its annual event. Well, rather unsurprising, actually: canis canem non est (dog does not eat dog).

Most likely, indeed, business users of MT will not be concerned with translation universals and the “laws of translation” such as simplification, normalization and interference, which, by the way, cannot be measured. And notoriously, only “when you can measure what you are speaking about, and express it in numbers, you know something about it”.

And when addressing MT for dissemination purposes, saying that it is “not a solution that replaces humans, it is making content that you have no time for available” is just another way not to scare translation industry players. Also, if “quality needs to be purpose-driven and the standard able to prove its business usefulness”, the whole industry has not yet learnt to spot the Pied Piper. In fact, while it is true that “in the industry, the lack of technological understanding creates logical resistance to change”, the many compatibility issues that still exist are a testament to the haughtiness of players, on both the service and the technology sides, not to mention the powerlessness of think-tanks and consulting firms.

Therefore, hypocrisy rules undisturbed, with expectations contradicting attitudes. Translation industry players keep asking for manageable and affordable LQA methods and tools, while resorting to the same old cumbersome and costly paradigms, even for MT.

Not surprisingly, here finally comes the genius who reinvents the wheel, in this case by finding that “MT tends to choose a subset of all the possible translation solutions (the most frequent ones)”.

The MT Market

MT market reports abound recently. This is because the interest in MT has been steadily growing for at least a decade and the hype around NMT has given it a further boost.

In fact, all reports expect the MT market to boom at a CAGR between 15% and 24% over the 2019-2025 forecast period. The width of this fork suggests that the forecasts might not be realistic or reliable. And yet all of them are based on the very same data from the very same sources.
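
To see how wide that fork really is, one can compound both rates over the six years from 2019 to 2025. The starting value of 100 below is an arbitrary index, not a figure from any report; only the 15% and 24% rates come from the text.

```python
# CAGR projection: end = start * (1 + rate) ** years.
def project(start, rate, years):
    """Compound a starting value at a constant annual growth rate."""
    return start * (1 + rate) ** years

start = 100.0  # arbitrary index value for 2019, not real market data
years = 6      # 2019 -> 2025

print(round(project(start, 0.15, years), 1))  # prints 231.3
print(round(project(start, 0.24, years), 1))  # prints 363.5
```

In other words, the low and high forecasts differ by more than 50% of the projected end value, which is what makes the spread hard to take at face value.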

These reports would then be nothing more than a guessing game, the kind of game the typical translation industry ‘analysis firms’ excel at.

These reports also say that MT has long been on the plateau of productivity, despite the recurring hype, and that the limit keeps moving a little further forward every day. The universal translator seems ever closer. Recently, Google released a research paper entitled “Massively Multilingual Neural Machine Translation in the Wild: Findings and Challenges” on multilingual models to reduce training and serving costs and simplify deployment in production systems. It is a further step on the path to a system capable of translating between any language pair. Realistically, the researchers acknowledge there is still a long way to go. What really matters is that Google looks more and more like the only company with the resources to crawl, extract and clean parallel sentences from the web across a vast range of domains, and one of the two or three capable of leaving their mark and leading the way.

To set up and run an effective NMT platform, at least basic competence in machine learning is necessary, and this is anything but ordinary, even today, especially in the translation industry. Also, a deep understanding of data, especially one’s own, is essential. Finally, the ability and resources to measure and react to failures, as well as to any outcomes pointing to a possible path to improvement, are crucial. And these too are definitely uncommon in the translation industry.

And yet the myth still pervades the industry that anybody with a stockpile of TMs can easily set up and run an MT platform, that a handful of data and a robust, powerful computer set-up are enough, even though too many people in the industry still look suspiciously at the cloud. This myth persists despite the many unfortunate attempts and the increasingly meager band of early venturers. And it is no coincidence that most of them have already taken the service highway.

Anything more to expect? Possibly, the most notable news comes, as usual, from one of the tech giants above, which has recently announced that it is close to large-scale document-level NMT. Curiously, or maybe not, its paper for WMT 2019 contains the word ‘translationese’, with a very specific connotation. Unfortunately, as long as quality assessment remains at the error-catching level, it will also remain at the sentence level, and this might just stir more entropy, at least in a translation industry still tied to obsolete and admittedly inefficient models. Maybe further improvements in Automatic Post-Editing (APE) will eventually force a change. Hopefully not too late.

The View from atop a Skyscraper

In less than a hundred years, the world has changed more than anyone could have expected. In the 1930s, ironworkers were an archetype of modernity, as well as of craftsmanship and, of course, courage.

Charles C. Ebbets’s iconic picture Lunch atop a Skyscraper has also been used to depict the condition of workers according to eras. At the end of the last century, it was reworked to depict the so-called digital workers. Recently, it has been reworked again to depict the near future of work.

Tech companies leading the way with MT have not been making use of any of the data that might indeed be vital to most language industry players, simply because it is insignificant to them.

It is indisputable that the rationale for MT, and for its wide adoption in the translation industry, lies in productivity, speed and consistency, despite academic onanistic delusion. And any effects of post-editing on languages should be observed over a much longer period than a year or two, on much larger samples, and in many different language combinations; otherwise it is delusion more than onanism.

By the way, what about the effect of translation of literature and movie scripts for subtitling and dubbing?

This is not the right educational material to prepare future generations of language specialists to deal with NLP. Perception is no science, while science, specifically data science, should become part of education programs.

Again, PEMT is as old as MT, so it is turning seventy this year. Older than the translation industry, and older than many academics who started their careers well after the birth of MT (and PEMT).

Here Come the Consultants

However big and fancy you are or may look, you are not immune to mistakes. This is where a consultant may help: to try and prevent them.

Usually, consultants do not embark on predictions or riddles. Occasionally, some may guess or, better, infer from historical series.

So, when you spot a consulting firm generically quoting ‘experts’, especially for improbable forecasts (five years from now is an eternity today), you had better look for references, possibly solid ones. And if you cannot find any, those forecasts are just more bullshit. Quoting improbable ‘experts’ serves solely to avoid the trouble of doing some challenging homework and clearing the field of hype, buzzwords and nonsense.

Consultants who are really honest and independent act as compasses, helping customers find their own way. So, beware of pipers and keep your eyes and ears wide open and your brain working and clean of bullshit.


Author: Luigi Muzii