This post is a position paper presenting my opinion on a topic that has recently risen to prominence and is now in the spotlight thanks to GALA’s TAPICC initiative, for which I am volunteering. My hope is to put the debate on a practical and factual track.
If you don’t read this here, you will hardly read it elsewhere: many people love to talk and write about metadata, but rarely care about it. This is because, usually, no one is willing to spend much time filling out forms. Although it is undoubtedly a boring task, there is nothing trivial in assembling compact yet comprehensive data to describe a job, however simple or complex, small or huge.
On the other hand, this is a rather common task for any project manager. In fact, in project management, a project charter must always be compiled stating scope, goals, and stakeholders and outlining roles and responsibilities. This document serves as a reference for the statement of work defining all tasks, timelines and deliverables.
When part of a larger project, translation is managed as a task, but this does not exempt the team in charge from collecting and providing the data needed to execute it. This data ranges from working instructions to running time, from team members to costs, and so on. Yet even LSPs and translation buyers, who would benefit from it whatever the type and size of the project or task, often skip this step.
The data describing this other data is called metadata and the information it provides can be used for discovery, identification or management. Metadata can be captured by computers, but more often it has to be created manually. Alas, translation project managers and translators often neglect to create metadata, or they do not create enough metadata, or the metadata they create is not accurate enough; this makes metadata scarce and partial, thus rapidly irrelevant.
Measuring is all about reducing uncertainty, which is critical to business. Translation may be only a tiny fraction of a project but, however small, no buyer is willing to stake a project on independent variables. Therefore, to avoid guessing, buyers require factual data to assess their translation effort, to budget it, and to evaluate the product they will eventually receive.
Every LSP should therefore first be able to identify what is important from the customer’s perspective, to make its efforts more efficient, cost-effective, and insightful. In this respect, measurements enable a company to keep a finger on the pulse of the business while allowing buyers to assess vendor capability and reliability. To derive indicators and align daily activities with strategic goals, measurements should be taken against pre-specified benchmarks. Analytics are essential to unlock relevant insights, and data is the lifeblood of analytics.
TMSs, Standards and Metadata
In a blog post dating back to 2011, Kirti Vashee asked why there are so many TMSs (at least given the size of the industry and the average size of its players), each one with a tiny installed base. The answer was in the question that followed: because every LSP and corporate localization department thinks that its translation project management process is so unique that it can only be properly automated by creating a new TMS.
More or less the same happens with standards: every initiative starts out claiming, and striving, to cover every single aspect of the topic addressed, however vague or huge. This contrasts with the spirit of standardization, which should result from a general consensus on straightforward, lean, and flexible guidelines.
In the same post, Kirti Vashee also reported that, at LISA’s final standards summit, Jaap van der Meer predicted that GMS/TMS would disappear over time in favor of plug-ins to other systems. Apparently, he also said that TMs would be dead in five years or less. Niels Bohr is often quoted, perhaps wrongly, as saying that prediction is hard, especially about the future; whoever said it was not wrong.
While translation tools as we have known them for almost three decades have lost their centrality, they are definitely not dead, just as GMS/TMS have not disappeared. Three years from now, we will see whether Grant Straker’s prediction that a third of all translation companies would disappear by 2020 due to technology disruption proves right.
Technology has been lowering costs, but it is not responsible for increasing margin erosion; people who cannot make the best use of technology are. The next big thing in the translation industry might in fact be the long-announced and long-awaited disintermediation. Having completed the transition to the cloud and learned how to exploit data, companies in every industry are moving to API platforms. As usual, the translation industry is reacting slowly and haphazardly. This is essentially another consequence of the industry’s fragmentation, which also leads industry players into the contradiction of considering their business too unique to be properly automated, because of its creative and artistic essence, while trying to standardize every aspect of it.
In fact, ISO 17100, ASTM F2575-14 and even ISO 18587 on post-editing of machine translation each contain a special annex or a whole chapter on project specifications, registration or parameters, and a technical specification, ISO/TS 11669, has been issued on this very topic.
Unfortunately, in most cases, all these documents reflect the harmful confusion of features with requirements that is typical of the translation industry. Another problem is the confusion coming from the lack of agreement on the terms used to describe the steps in the process. Standards did not solve this problem, thus proving essentially uninteresting for industry outsiders.
Grand ambitions are what doom any new initiative to irrelevance, while gains can be made by starting with smaller goals.
Metadata is one of the pillars of disintermediation, along with its management, its exchange between systems and, ça va sans dire, the exchange format.
In essence, metadata follows the partition of the translation workflow into:
- Project (the data that is strictly relevant to its management);
- Production (the data that pertains to the translation process);
- Business (the transaction-related data).
Metadata in each area can then be divided into essential and ancillary (optional). To know which metadata is essential in each area, find where and how it can be used.
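One way to picture this rule is sketched below, under loud assumptions: the field names, their assignment to areas, and the indicator names are all hypothetical, and "essential" is read here as "consumed by at least one indicator".

```python
from enum import Enum


class Area(Enum):
    PROJECT = "project"
    PRODUCTION = "production"
    BUSINESS = "business"


# Hypothetical catalogue: each field is tagged with an area and with the
# indicators that consume it. All assignments are illustrative only.
CATALOGUE = {
    "due_date": (Area.PROJECT, {"on_time_rate"}),
    "delivery_date": (Area.PROJECT, {"on_time_rate", "turnaround"}),
    "tm_leverage_pct": (Area.PRODUCTION, {"cost_per_word"}),
    "agreed_fee": (Area.BUSINESS, {"margin"}),
    "workplace_requirements": (Area.PRODUCTION, set()),
}


def split_by_use(catalogue):
    """Group field names per area into essential (used somewhere) and ancillary."""
    result = {area: {"essential": [], "ancillary": []} for area in Area}
    for name, (area, used_by) in catalogue.items():
        bucket = "essential" if used_by else "ancillary"
        result[area][bucket].append(name)
    return result


classified = split_by_use(CATALOGUE)
for area, buckets in classified.items():
    print(area.value, buckets)
```

A field that no indicator consumes lands in the ancillary bucket, which keeps the essential set tied to actual use rather than to what happens to be collectible.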
Metadata and KPIs
Metadata is critical for the extraction of business intelligence from workflows and processes.
In fact, KPIs typically fall within the scope of metadata, especially project metadata, and their number depends on available and collectible data. Most of the data to “feed” a KPI dashboard can in fact be retrieved from a job ticket, and the more detailed a job ticket is, the more accurate the indicators are.
However, automatically generated data alone is not enough to produce truly useful statistics and practical KPIs, nor does it support any business inference; collating the relevant data is crucial for any measurement effort to be effective.
For example, from a combination of project, production and business metadata, KPIs can be obtained to better understand which language pair(s), customer(s), service and domain are most profitable. Cost effectiveness can also be measured through cost, quality and timeliness indicators.
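As a rough illustration of the profitability question, the sketch below ranks language pairs by margin. The record fields (`agreed_fee`, `vendor_cost`) and the amounts are hypothetical, and margin is assumed to be simply fee minus vendor cost, divided by fee.

```python
from collections import defaultdict

# Hypothetical job records; field names and amounts are illustrative only.
jobs = [
    {"source": "en", "target": "it", "agreed_fee": 1000.0, "vendor_cost": 600.0},
    {"source": "en", "target": "it", "agreed_fee": 500.0, "vendor_cost": 350.0},
    {"source": "en", "target": "de", "agreed_fee": 800.0, "vendor_cost": 700.0},
]


def margin_by_language_pair(records):
    """Return {(source, target): margin ratio}, with margin = (fee - cost) / fee."""
    fees = defaultdict(float)
    costs = defaultdict(float)
    for r in records:
        pair = (r["source"], r["target"])
        fees[pair] += r["agreed_fee"]
        costs[pair] += r["vendor_cost"]
    # Pairs with no invoiced fee are skipped to avoid dividing by zero.
    return {p: (fees[p] - costs[p]) / fees[p] for p in fees if fees[p] > 0}


margins = margin_by_language_pair(jobs)
# Print the pairs from most to least profitable.
for pair, ratio in sorted(margins.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{pair[0]}>{pair[1]}: {ratio:.1%}")
```

The same grouping works for customer, service or domain by changing the key built inside the loop.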
A process quality indicator may be computed out of other performance indicators such as the rate of orders fulfilled in-full, on-time, the average time from order to customer receipt, the percentage of units coming out of a process with no rework and/or the percentage of items inspected requiring rework.
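The component indicators named above can be sketched as follows. The record layout (`order`, `due`, `delivered`, `rework`) is an assumption made for the example, and in-full delivery is not modelled; how the components are then weighted into a single process quality indicator is left open, since the text does not prescribe a formula.

```python
from datetime import date

# Hypothetical delivered jobs; field names and dates are illustrative only.
delivered_jobs = [
    {"order": date(2017, 3, 1), "due": date(2017, 3, 10), "delivered": date(2017, 3, 9), "rework": False},
    {"order": date(2017, 3, 2), "due": date(2017, 3, 8), "delivered": date(2017, 3, 10), "rework": True},
    {"order": date(2017, 3, 5), "due": date(2017, 3, 12), "delivered": date(2017, 3, 12), "rework": False},
    {"order": date(2017, 3, 6), "due": date(2017, 3, 15), "delivered": date(2017, 3, 14), "rework": False},
]


def process_indicators(records):
    """Compute on-time rate, average order-to-delivery time and rework figures."""
    n = len(records)
    if n == 0:
        raise ValueError("at least one delivered job is required")
    on_time = sum(r["delivered"] <= r["due"] for r in records) / n
    avg_days = sum((r["delivered"] - r["order"]).days for r in records) / n
    no_rework = sum(not r["rework"] for r in records) / n
    return {
        "on_time_rate": on_time,
        "avg_order_to_delivery_days": avg_days,
        "first_pass_yield": no_rework,  # share of jobs needing no rework
        "rework_rate": 1 - no_rework,
    }


indicators = process_indicators(delivered_jobs)
print(indicators)
```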
The essential metadata allowing for the computation of basic translation KPIs might be the following:
- Unique identifier
- Project name
- Client’s name
- Client’s contact person
- Order date
- Start date
- Due date
- Delivery date
- PM’s name
- Vendor name(s)
- Source language
- Target language(s)
- Scope of work (type of service(s))
- Percentage of TM used
- Term base
- Style guide
- QA results
- Initial quotation
- Agreed fee
- Expected date of payment
- Actual date of payment
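The list above maps naturally onto a single record type. The following is a minimal sketch, not a proposed schema: the field names, the types, and the choice of which fields are optional are assumptions, as are the three derived figures.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class JobTicket:
    # Fields assumed to be known when the job is opened.
    job_id: str
    project_name: str
    client_name: str
    client_contact: str
    order_date: date
    start_date: date
    due_date: date
    pm_name: str
    source_language: str
    target_languages: List[str]
    scope_of_work: List[str]
    initial_quotation: float
    agreed_fee: float
    expected_payment_date: date
    # Fields assumed to be filled in as the job progresses.
    vendor_names: List[str] = field(default_factory=list)
    delivery_date: Optional[date] = None
    tm_leverage_pct: Optional[float] = None
    term_base: Optional[str] = None
    style_guide: Optional[str] = None
    qa_results: Optional[str] = None
    actual_payment_date: Optional[date] = None

    def delivery_delay_days(self) -> Optional[int]:
        """Days late (positive) or early (negative); None until delivered."""
        if self.delivery_date is None:
            return None
        return (self.delivery_date - self.due_date).days

    def payment_delay_days(self) -> Optional[int]:
        """Days between expected and actual payment; None until paid."""
        if self.actual_payment_date is None:
            return None
        return (self.actual_payment_date - self.expected_payment_date).days

    def discount_pct(self) -> float:
        """Difference between quotation and agreed fee, as a percentage."""
        return (self.initial_quotation - self.agreed_fee) / self.initial_quotation * 100


# Illustrative values only.
ticket = JobTicket(
    job_id="T-001", project_name="User manual", client_name="ACME",
    client_contact="J. Doe", order_date=date(2017, 5, 2),
    start_date=date(2017, 5, 3), due_date=date(2017, 5, 12),
    pm_name="A. PM", source_language="en", target_languages=["it"],
    scope_of_work=["translation", "revision"], initial_quotation=1200.0,
    agreed_fee=1080.0, expected_payment_date=date(2017, 6, 30),
    vendor_names=["Vendor A"], delivery_date=date(2017, 5, 11),
    actual_payment_date=date(2017, 7, 10),
)
print(ticket.delivery_delay_days(), ticket.payment_delay_days(), ticket.discount_pct())
```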
Although translation may be a task of a larger project, it may also be a project itself. This is especially true when a translation is broken down into chunks to be apportioned to multiple vendors for multiple languages or even for a single language in case of large assignments and limited time available.
In this case, the translation project is split into tasks and each task is allotted to a work package (WP). Each WP is then assigned a job ticket with a group ID so that all job tickets pertaining to a project can eventually be consolidated for any computations.
This will allow for associating a vendor and the relevant cost(s) to each WP for subsequent processing.
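A minimal sketch of this consolidation step follows, assuming each job ticket carries a `group_id`, a vendor and a cost. These names and the sample figures are hypothetical.

```python
from collections import defaultdict

# Hypothetical job tickets, one per work package; values are illustrative only.
work_packages = [
    {"ticket_id": "T-001", "group_id": "P-42", "vendor": "Vendor A", "target": "de", "cost": 300.0},
    {"ticket_id": "T-002", "group_id": "P-42", "vendor": "Vendor B", "target": "fr", "cost": 280.0},
    {"ticket_id": "T-003", "group_id": "P-42", "vendor": "Vendor A", "target": "de", "cost": 150.0},
    {"ticket_id": "T-004", "group_id": "P-43", "vendor": "Vendor C", "target": "it", "cost": 200.0},
]


def consolidate(tickets):
    """Roll job tickets up to project level using their shared group ID."""
    projects = {}
    for t in tickets:
        project = projects.setdefault(
            t["group_id"],
            {"total_cost": 0.0, "cost_by_vendor": defaultdict(float), "tickets": []},
        )
        project["total_cost"] += t["cost"]
        project["cost_by_vendor"][t["vendor"]] += t["cost"]
        project["tickets"].append(t["ticket_id"])
    # Convert the inner defaultdicts to plain dicts for easier printing or export.
    for project in projects.values():
        project["cost_by_vendor"] = dict(project["cost_by_vendor"])
    return projects


summary = consolidate(work_packages)
print(summary["P-42"])
```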
Most of the above metadata can be automatically generated by a computer system to populate the fields of a job ticket. This information might then be sent along with the processed job (in the background) as an XML, TXT, or CSV file, and stored and/or exchanged between systems.
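To show how little is involved, here is a sketch that writes a job ticket as XML and as CSV using only the Python standard library. The element name `jobTicket` and the field names are made up for the example; no existing exchange schema is implied.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical job ticket with all values already rendered as strings.
sample_ticket = {
    "id": "T-001",
    "project": "User manual",
    "source_language": "en",
    "target_language": "it",
    "due_date": "2017-05-12",
    "agreed_fee": "1080.00",
}


def ticket_to_xml(ticket):
    """Serialize one ticket as a small XML document (illustrative element names)."""
    root = ET.Element("jobTicket", attrib={"id": ticket["id"]})
    for key, value in ticket.items():
        if key != "id":
            ET.SubElement(root, key).text = value
    return ET.tostring(root, encoding="unicode")


def tickets_to_csv(tickets):
    """Serialize tickets as CSV, one row per ticket, with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(tickets[0].keys()))
    writer.writeheader()
    writer.writerows(tickets)
    return buffer.getvalue()


xml_text = ticket_to_xml(sample_ticket)
csv_text = tickets_to_csv([sample_ticket])
print(xml_text)
print(csv_text)
```

Either output can be parsed back with the same modules (`ET.fromstring`, `csv.DictReader`), which is all a receiving system would need to read the ticket.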
To date, the mechanisms for compiling job tickets are not standardized in TMSs; metadata is often labeled differently too. And yet, the many free Excel-based KPI tools available to process this kind of data basically confirm that this is not a complicated task.
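The labeling problem, at least, can be handled with a simple alias table. The sketch below assumes a few hypothetical label variants; the actual labels used by any given TMS would have to be collected first.

```python
# Hypothetical label variants mapped to one canonical field name each.
ALIASES = {
    "due_date": {"due date", "deadline", "delivery deadline"},
    "client_name": {"client", "customer", "client's name"},
    "source_language": {"source language", "source", "src lang"},
}

# Reverse lookup: lower-cased label -> canonical field name.
LOOKUP = {alias: canonical for canonical, aliases in ALIASES.items() for alias in aliases}


def normalize_labels(record):
    """Rename known labels to canonical names; unknown labels are kept as they are."""
    normalized = {}
    for label, value in record.items():
        key = label.strip().lower()
        normalized[LOOKUP.get(key, label)] = value
    return normalized


exported = {"Deadline": "2017-05-12", "Customer": "ACME", "Notes": "rush job"}
print(normalize_labels(exported))
```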
To date, however, TMSs do not seem to pay much attention to KPIs or to the processing of project data, and focus more on language metadata. In fact, translation tools and TMSs all add different types of metadata to every translation unit during processing. This is because metadata is used only for basic workflow automation: to identify and search translatable and untranslatable resources, to route translatable files to suitable translators, to identify which linguistic resources have been used, to track the status of a translation unit, and so on. Also, the different approach every technology provider takes to handling the increasingly common XLIFF format makes metadata exchange virtually impossible; indeed, data as well as metadata are generally stripped away when fully compliant XLIFF files are produced.
This supplement contains some excerpts from the annexes to the two major industry standards, ISO 17100 Translation Services — Requirements for translation services and ISO/TS 11669 Translation projects — General guidance.
The first excerpt comes from Annex B (Agreements and project specifications) and Annex C (Project registration and reporting) to ISO 17100. The second excerpt comes from clause 6.4 Translation parameters of ISO/TS 11669.
The items listed are perfectly suitable candidates for ancillary (optional) metadata.
All excerpts are provided for further investigation and comments.
Annex B (Agreements and project specifications)
- confidentiality clauses,
- non-disclosure agreements (NDAs),
- delivery dates,
- project schedule,
- quotation and currency used,
- terms of payment,
- use of translation technology,
- materials to be provided to the TSP by the client,
- handling of feedback,
- dispute resolution,
- choice of governing law.
Annex C (Project registration and reporting)
- unique project identifier,
- client’s name and contact person,
- dated purchase order and commercial terms, including quotations, volume, deadlines and delivery details,
- agreement and any ancillary specifications or related elements, as listed in Annex B,
- composition of the TSP project team and contact-person,
- source and target language(s),
- date(s) of receipt of source language content and any related material,
- title and description of source [language] content,
- purpose and use of the translation,
- existing client or in-house terminology or other reference material to be used,
- client’s style guide(s),
- information on any amendments to the commercial terms and changes to the translation project.
Clause 6.4 (Translation parameters)
- source characteristics
- source language
- text type
- specialized language
- subject field
- target language information
- target language
- target terminology
- content correspondence
- file format
- style guide
- style relevance
- typical production tasks
- initial translation
- in-process quality assurance
- final formatting
- additional tasks
- reference materials
- workplace requirements