The report of my death was an exaggeration.
Every now and then, even in the translation industry, some pundits venture into ominous prophecies that turn out to be wrong after a while, but that apparently no one recollects. The Internet does, though, so one should be careful with obituaries: many of those who write them may end up under a tombstone before those whose death they have decreed. In other words, the best is yet to come.
A favorite subject of premature obituaries seems to be TMs and TMSs. Renato Beninatto anticipated the demise of TMs in several of his talks at industry conferences, the first one probably being at Localization and Translation Thailand 2009. Two years later, at the final LISA standards summit, Jaap van der Meer reportedly said that GMS/TMS would be dead in 5 years and that plug-ins to other systems would replace them. In 2016, Grant Straker predicted that a third of all translation companies would disappear by 2020 due to technology disruption. Two years ago, “Mr. Trados”, as Jochen Hummel likes to say he is still being called, predicted the sunset of CAT tools. In 2011, the very same Jochen Hummel had said that “translation memory should be like a version control system for developers”.
More cautiously, a year ago, Gabriel Fairman, founder and CEO at Bureau Works, wrote that “NLP is shaking the language services industry to its core and it is questioning the use and function of foundational elements such as translation memory”. Fairman insisted that, after being a keystone of the translation industry for over thirty years, revolutionary as they were at their inception, translation memories are now dead, but that the industry has not yet decided to bury them.
It looks like Fairman and the others dare make such statements with the sole purpose of carrying water to their own mills. Or their customers’.
Curiously enough, although the reports of the death of translation memories are exaggerated, some of these very same visionaries are convinced supporters of translation memory marketplaces.
On the other hand, this is clearly a consequence of the idea, established a decade or so ago and apparently still persistent, of translation memories as assets in a budgetary sense.
Translation memories are not going to die as long as CAT tools and TMSs remain the primary means in the hands of translation professionals and businesses to produce language data, and as long as the legend stands that a stockpile of translation memories is enough to effectively set up and run a machine translation platform.
It is as if, in a desperate effort to “leave a mark”, the idea prevails that victory would smile on whoever talks the biggest bullshit. In this case, however, the excess of ambition will not prevent even the most ambitious from having their names “writ in water”.
Now it is CSA’s turn to write the TMS obituary with a blog post eloquently titled “TMS Is Dead. Long Live TMS”.
Aside from a few blatant errors in the timeline, it is worth remembering that the birth of the translation industry virtually coincides with the launch and deployment of the first translation memory tool. TMSs came well over a decade later.
Rather than the need for client-server architectures, it was the skepticism of industry players (traditionally very conservative) that helped delay the spread of TMSs. In fact, at translation industry events, one may still come across embarrassing debates on the pros and cons of cloud computing. And if ‘FTP’ sounds unfamiliar to many, especially young people, it is only because of a general ignorance of the basics of the Internet. So, one should not be surprised to learn that some have seriously asked about using social media to manage translation projects. What one should worry about is that some of the usual suspects may have taken it seriously.
If translation and localization ‘entrepreneurs’ have been forced to invest in ‘new’ technologies because of the COVID-19 pandemic to meet needs that they should have largely predicted and anticipated regardless of any emergency, then the industry is really in hot water. But if translation and localization ‘entrepreneurs’ have just discovered TMSs because of the sudden emergence of the need for “managing data flows, providing access to MT engines, disparate terminology resources and repositories, AI and machine learning automation, and increasingly sophisticated content technologies”, then the industry is doomed.
However, in both cases, they can at least blame the experts for having pampered them in their shortsightedness. By the way, this could be the theme for the industry’s epitaph.
Luckily, these very same experts are well aware that the “ongoing shift is far from done”.
The TAPICC Failure
The mistaken approach that eventually led TAPICC to failure epitomizes the same attitude behind the increasing usage of ‘BMS’ (business management system) instead of TMS.
No translation industry player is anywhere close to having a BMS. BMS sounds fancier and smarter than TMS, maybe because the latter typically addresses the small businesses that would like to be big. In fact, many translation companies do not even have a TMS (let alone a BMS) because of the price and a lack of understanding of system features. Also, no TMS seemingly presents the typical features of a BMS, i.e. setting the policies, practices, procedures, and processes needed to develop and deploy business strategies and execute them. Most TMSs still focus on handling only the typical translation and/or localization business process.
As Kirti Vashee pointed out in the Twitter exchange cited above, “connecting into all high-value content flows with minimal TMS overhead will be more important than building a translation-specific control environment”. TAPICC is DOA because other, more universal de facto standards are more viable and useful.
This is most probably due to the choice of the TAPICC working group to go for a full-fledged API rather than a general model. Indeed, TAPICC was a GALA initiative and, as such, unsurprisingly, the heavyweights in it steered the effort from the very beginning towards a semi-proprietary outcome. It was just an excuse to let one or two MLVs have their own API that would look like a de jure standard.
Therefore, the TAPICC failure is just another inevitable, and largely predictable, result of the shortsightedness, dullness, and greed of translation industry players, especially the larger ones, and it could be the tombstone over any future chance of achieving a series of standards for translation automation through a joint industry effort.
A Forest of Tools
If a tree falls in a forest and no one is around to hear it, does it make a sound? To some extent, this question also applies to the translation industry and the tools of the trade.
The reason for the existence of so many tools, especially TMSs, each with a tiny installed base, is still debated, at least given the size of the industry and the average size of its players.
Like the tree in the forest, it is all about perception. Every translation industry player perceives translation as crucial and its own business as significant while, in fact, they are all like trees in a forest on a desert island. They all believe they are unique and that no tool can suit them perfectly and totally. In practice, the industry is really impervious, though not to crises. This explains why nothing that is good for others suits translation industry players unless it has been universally adopted, possibly for years.
Technical standards should always be welcome, then, but standardization comes from the consensus of different parties, especially industry players, which is always hard to achieve and usually reflects the interests of major stakeholders. For example, the mechanisms for compiling job tickets are not standardized in TMSs, and metadata is most often labeled differently. And yet, translation tools, including TMSs, all add different metadata to translation units during processing, because this metadata is used for basic workflow automation.
However, without a common standard, the different approaches to data and metadata manipulation make exchange, let alone interoperability, virtually impossible.
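The interoperability problem can be made concrete with a minimal sketch. Assume two hypothetical tools that attach the same three pieces of metadata to a translation unit under different labels; every key name below is invented for illustration, and each tool would need its own ad hoc mapping to a common schema, a mapping which is precisely the standard the industry lacks:

```python
# Hypothetical example: two tools label the same translation-unit
# metadata differently, so exchanging TMs requires an ad hoc mapping.
# All key names below are invented for illustration purposes.

TOOL_A_UNIT = {"creationdate": "20210301", "changeid": "mt+pe", "usagecount": "7"}
TOOL_B_UNIT = {"created_on": "2021-03-01", "last_editor": "mt+pe", "hits": "7"}

# One mapping per tool to a common schema; without a shared standard,
# every pair of tools needs its own conversion.
MAPPINGS = {
    "tool_a": {"creationdate": "created", "changeid": "editor", "usagecount": "uses"},
    "tool_b": {"created_on": "created", "last_editor": "editor", "hits": "uses"},
}

def normalize(unit, tool):
    """Relabel a unit's metadata keys to the common schema, dropping unknown keys."""
    mapping = MAPPINGS[tool]
    return {mapping[k]: v for k, v in unit.items() if k in mapping}

a = normalize(TOOL_A_UNIT, "tool_a")
b = normalize(TOOL_B_UNIT, "tool_b")
print(sorted(a) == sorted(b))  # same keys now; value formats may still diverge
```

Note that even after relabeling, the two date values still use different formats, which is exactly why a standard would have to prescribe values and formats, not just labels.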
After all, the development of a standard specification involves solid mastery of the topic(s) subject to standardization, and even though advanced knowledge of such topic(s) is not strictly required, only the major players have the relevant skills available in-house or can afford to hire them.
This partially explains why, so far, most of the efforts in developing standards for translation have focused on processes much more than on file formats and (system) interactions, and why there are multiple ‘international’ standard practices and guides in different regions: although translation is supposed to be a global business, small-scale consensus is definitely easier to reach than large-scale consensus.
It is not necessarily the case, however, that the biggest players accept or spontaneously decide to get involved in standardization initiatives, knowing they can have their de facto standards prevail anyway. Therefore, standardization committees end up being under the control of medium-to-large players, who can impose their formats and, in doing so, frustrate all other efforts and maintain the status quo.
On the other hand, it is understandable that any business with the necessary skills, capacity, and resources might not be willing to develop a technical specification, let alone an API specification, make it freely available, and, in doing so, do its competitors a favor. In the same way, it is understandable that any initiative in this respect, especially if initiated, run, and controlled by the largest industry players under the aegis of a trade association, is received with skepticism, if not mistrust. This skepticism and mistrust can eventually doom a full-fledged universal API, however audacious the effort, because vendors and customers would hardly agree to implement a competitor’s solution, one that might also be perceived as an overextending imposition on the basis of its alleged universality.
By contrast, a set of recommendations to manufacturers of translation tools and platforms for developing their own APIs to perform the tasks in a translation assignment would possibly be welcomed and followed. This set of recommendations might consist of an open and flexible web-service API model addressing the most common use cases. A possible reason for the rejection of such an approach is a lack of understanding of the key architectural principles of the Internet and of the OSI model.
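To make the idea tangible, here is a minimal sketch of what such a recommendation-level model might look like: generic resources and verbs that each vendor would implement in its own stack, with transport details left open. Every name, field, and operation below is hypothetical, invented purely for illustration:

```python
# Hypothetical sketch of a recommendation-level API model: instead of one
# universal API, each vendor implements these generic resources and verbs
# in its own way. All names are invented for the example.

from dataclasses import dataclass, field

@dataclass
class JobTicket:
    """The minimal unit of exchange in a translation assignment."""
    job_id: str
    source_lang: str
    target_lang: str
    content_uri: str  # where the payload lives; deliberately format-agnostic
    metadata: dict = field(default_factory=dict)  # free-form, tool-specific

# The recommendation would specify verbs and semantics, not wire formats:
RECOMMENDED_OPERATIONS = {
    "submit":   "POST a JobTicket; returns its job_id",
    "status":   "GET the state of a job_id (queued / translating / done)",
    "retrieve": "GET the translated payload for a completed job_id",
    "cancel":   "DELETE a pending job_id",
}

ticket = JobTicket("j-001", "en", "it", "s3://bucket/file.xlf")
print(ticket.job_id, len(RECOMMENDED_OPERATIONS))
```

The design choice here mirrors the argument above: a small, open model of common use cases that any tool can map onto its own architecture, rather than a full-fledged API that everyone must adopt wholesale.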
Trust No One
The fairy tale still goes that “language services companies have tremendous unexploited resources in their translation memories and other language assets for developing small AI projects”. Too bad that, if this were the case, by now most LSPs could do without at least half their staff, or replace them with people whose skills and abilities are not strictly linguistic. Provided, of course, that these LSPs were able and willing to pay them adequately, which is not the case.
Those very same pundits also contend that “TMSs will become the operating system for language companies and enterprise localization groups”. However, they acknowledge that “the fulfillment of this vision will depend on the ability of technology developers to agree to common specifications and frameworks for translation data” and that “it will also mean tackling the myriad sources of interoperability problems”. Too bad their contribution to solving these problems has been laughable.
Also, these and other pundits from the same club have been feeding virtually every kind of hype, from content curation to IoT, from Big Data to gamification, from augmented whatever to blockchain, but have forgotten cloud computing.
So, it should come as no surprise that blockchain in the translation industry has vanished into thin air. What is worrying, in this respect, is that, far from admitting they were wrong or at least hyperbolic, these pundits now evoke China’s shadow instead of reprimanding those they passionately supported for never providing any proof of concept (let alone a prototype) of the solutions advocated. True, there are several companies active in the blockchain space in China, but their initiatives all relate to payments and smart contracts.
Similarly, there is no point in explaining away the absence of any real innovation in the translation industry by arguing that it is a transformation industry, one that acts on other people’s innovations and is not required to deliver any of its own, especially after having long criticized it for exactly this reason.
It is true, though, that even the translation industry thrives on an amelioration orgy, starting from a major innovation like TMs and a long-standing innovation like MT, with TMSs in between.
Amid this orgy, a metadata standard is still missing, one that sets how to label, collect, store, and exchange metadata, i.e. one that prescribes and describes the format to use. If it were true that LSPs have “tremendous unexploited resources”, these would most probably be ‘linguistic’ metadata, as distinct from project data, client data, user and usage data, and so on. Unfortunately, nothing suggests that the industry will have such a standard anytime soon, as the same pundits above look at TAPICC as a revolutionary success.
And yet, this kind of metadata is precisely the data needed to develop even the small AI applications that could be interesting for business and keep the industry alive, given its limited life expectancy, with NLP technologies eroding its business spaces day after day. The only response the industry has been able to deploy to curb this erosion is PEMT, which is not even a breakwater. For the last year or two, PEMT has been treated as the new kid on the block, although it is as old as MT, i.e. seventy years or so. At the current pace of progress in machine translation, however, PEMT is going to become irrelevant before long, probably in a year or two at best: it will become just another component of the professional duties of anyone involved in any linguistic activity, from authoring to translating, with the demand for PEMT services dropping to zero.
However, rather implausibly, and yet totally expectedly, PEMT is the basis for research on user interfaces for translation tools. Honestly, user interfaces are not as bad as some contend. Of course, they could be improved, but here comes the amelioration orgy again: the continuous refining of user interfaces is no innovation.
As a matter of fact, at present, the tools of the trade fall into two major categories separated by a jagged fault. One category gathers productivity tools for linguists, LSPs, and customers managing translation and localization projects; the other gathers localization tools for developers. The latter are typically poorly equipped, if at all, with management and language-support functionalities, and their user interfaces reflect this slant. They possibly look more user-friendly and modern to linguists than traditional translation and localization tools simply because they focus on extremely specific tasks rather than addressing every possible need of a linguist or a translation/localization project manager.
In any case, most CAT tools now use a multicolumn tabular layout, after a long period when the leading market tool exploited the Microsoft Word environment. The underlying implications are the textual nature of typical content for translation and the traditional sentence-by-sentence approach to translation, which reflects the natural attention span of any human being and is still the basis for teaching translation at every level. Maybe a change is needed there.
A Sputnik Moment
In T-Minus AI, author Michael Kanaan wrote about China’s massive three-part strategy for AI that aims at making China the absolute world leader in the development and production of core AI technologies by 2025.
Twenty-first-century China is a prime example of how authoritarian governments can easily accomplish even the most ambitious goals because they do not have to contend with political opposition or obtain social consensus. Cheap wages, non-restrictive labor laws, and huge workforces have made this possible, but one should not have to waive democracy and civil rights to remain competitive; in this respect, Michael Kanaan claims a new ‘Sputnik moment’ is necessary.
After years of immobility, the recently revived M&A frenzy might trigger a Sputnik moment for the language services industry, when all available resources should be deployed in response to the threats and challenges coming from new demands. According to Michael Kanaan, China launched its AI strategy after AlphaGo defeated the reigning Go champion Lee Sedol. When is the Sputnik or, better, the AlphaGo moment coming?
Maybe a first step could consist in no longer bullshitting and in avoiding catchphrases like “process automation coupled with AI-enabled human translation services”, apparently brilliant, but in fact meaningless and potentially counterproductive. If only because automation does not equal AI.
Translation and Vending Machines
A COVID-19 pandemic phenomenon has been the birth of a new generation of vending machine entrepreneurs, attracted by the relatively low barrier to entry and the wide availability of technologies to monitor and scale operations.
However, a distinctive trait of the vending industry is its extreme fragmentation, with thousands of small-time independent operators and no single entity owning more than 5% of the market, one third of which is in the US. According to a 2020 IBISWorld report, the US market is worth US$7.4B in annual revenue, for an average expenditure of US$35 per person per year on vending machine items. However, the slim profits per machine make this business better suited for smaller operators who can minimize overhead costs. In fact, there are 17,898 vending machine operators in the US, mostly small businesses employing just a few people (industry employment totals 53,828).
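As a back-of-the-envelope check, the IBISWorld figures quoted above already imply how small the typical operation is; the short computation below uses only those numbers:

```python
# Back-of-the-envelope arithmetic on the 2020 IBISWorld figures quoted above.
annual_revenue = 7_400_000_000   # US$7.4B US market revenue
operators = 17_898               # vending machine operators in the US
employees = 53_828               # total industry employment

revenue_per_operator = annual_revenue / operators
staff_per_operator = employees / operators

print(round(revenue_per_operator))   # ≈ 413,454 dollars per operator per year
print(round(staff_per_operator, 1))  # ≈ 3.0 people per operator
```

Roughly US$400K of gross revenue and about three people per operator: hardly the profile of a consolidated industry, and, again, a profile the average translation company should recognize.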
The vending business attracts all sorts of operators due to its relatively low startup costs, scalability, and flexible schedule. However, starting out comes with its share of hurdles, the first of which is finding the right machine, then the right location, which can make or break a machine’s success. A decent location is one not already saturated with machines. Finally come the items that go in the machines, whose cost roughly amounts to half of the revenues. So, to make vending work as a full-time gig, an operator must implement economies of scale, building the business up to dozens of machines to generate a livable wage.
Again, does this sound familiar?
The biggest difference between a translation business and a vending operator is in technology. Today, vending machines can be controlled using technologies that have greatly reduced operating costs while helping scale operations, from telemetry tools that have allowed newcomers to operate remotely, to card readers, apps, iPads, etc. that allow 70% of today’s vendors to monitor sales and inventory in real time.
Blockchain? Process automation coupled with AI-enabled human services? Answer for yourself.