Quality Assessment and Economic Sustainability of Translation

This article was originally published in issue 9 (2006) of the International Journal of Translation of the University of Trieste and in Upstream.


While quality is a mature and widespread concept, the associated values are not conventionally and absolutely measurable, since quality itself is a relative concept that makes sense only when compared to a set of specifications. Today, the concept of quality broadly corresponds to product suitability, meaning that the product meets the user’s requirements.

But then, how does one know when a translation is good? This seemingly simple question cannot be answered without recalling translation criticism and the theory of translation. The relationship between a source text and its translation cannot settle the matter on its own, as readers often perceive the end-product of translation as the only material available for scrutiny; they have no interest in the translator’s decision-making process (the hermeneutic process).

Therefore, translation adequacy should be taken into account in assessment especially when the customer imposes his own subjective preferences (requirements).

Whatever is worth doing at all is worth doing well.
Philip Dormer Stanhope


The development of modern quality concepts is largely due to the American statistician W. Edwards Deming, who was appointed by General Douglas MacArthur to oversee the re-building of Japan after World War II.

Although mature and widespread, the concept of quality as a body of principles applicable to the production and the delivery of services was dramatically transformed over the last quarter of the last century, becoming a relative concept that broadly corresponds to product suitability.

In this perspective, quality is always relative to needs, and there is no such thing as absolute quality: different jobs have different quality criteria because the texts meet different needs.

Quality is also about customer satisfaction, work efficiency, team working, control and communication.

The pragmatic approach in translation studies has helped to establish a view of translation as the product of a process that depends on the specific expectations and needs of the target audience, and on its function in a given context or situation.

Functionalism makes the traditional notion of linguistic equivalence obsolete and replaces it with the more appropriate notion of functional equivalence.

In this perspective, the client decides whether a project has been carried out properly. Consequently, a complete understanding of the buyer’s requirements is necessary for any given project.

Requirements Defined

The last few years have seen the beginnings of a trend toward standardizing procedures for the contractual relationship between the client and the service provider. The idea is that following certain procedures in producing the translation will increase the likelihood of good quality.

In fact, the fundamental assumption in quality standards (namely the ISO 9000 series) is that business processes can be improved to the point where the product passes as it is.

General criteria are necessary to standardize the production process and appraise quality in the sense of the product’s qualification to meet requirements.

For business processes to produce the expected outcomes, all the following elements are necessary:

  • Basic skills for task completion;
  • Appropriate and correct information about the job;
  • Accurate and suitable tools and materials to fulfill each task;
  • A well-suited environment.

When these elements are all available, their effectiveness can be measured and possibly improved, controls can be reduced to a minimum, and savings will at least equal the set-up cost of the whole system.

Specification of Requirements

When dealing with quality, two basic principles must be acknowledged:

  1. Quality is relative: people perceive different quality levels in the same product;
  2. Quality levels are subject to constraints in requirements.

A specification of requirements is a document providing an adequate and unambiguous description of the task load for a project, together with a description of the desired results, the essential conditions to which the service must conform and the characteristics or features of each deliverable.

Quality is essentially conformity to requirements that come primarily from the client’s needs. In other words, what the client says is quality is quality, even though meeting the requirement does not necessarily mean high quality: one could also meet all requirements and still produce junk.

In a mass-production environment, most products meet most demands, but leave many real desires unfulfilled. There are plenty of choices, but almost none precisely matches expectations, so buyers are used to settling for less, but do not stop wanting something else.

Goods and services are commonly offered to perform tasks or meet needs. But if the client’s expectations were actually delved into, they would be found to concern transformation. Clients expect the things they buy to make them different: what is pretty obvious with personal items is just as true for business decisions.

This deep desire only tends to emerge after needs are met. Understanding and satisfying this desire creates loyalty, and client’s loyalty is perhaps the most important element in any product’s long-term success.

Most ‘quality problems’ in translation have little to do with mistakes, and more with a mismatch of assumptions and goals between the people requesting a translation and the people supplying it. Admittedly, gathering requirements from the user is not always a straightforward task. On the other hand, if you can’t collect requirements you don’t know your client, and if you don’t know your client you can hardly please him.

Simply stated, this means that if a translation cannot be used to accomplish the task it was required for, it has no real use and belongs at the bottom of the cat box.

This is why academic disputes are useless in current practice: no client will be willing to spend any time to get deep into them.

The key to quality translation is really the ability to successfully negotiate between competing demands to find the translation that fits a particular situation and represents the best trade-off between requirements that cannot all be simultaneously met.

The name of the European quality standard for translation services EN 15038:2006 reads “Translation services – Service requirements”, and its purpose is to establish and define the requirements for the provision of quality translation services. Admittedly, a key issue is quality assurance and the ability to trace its progress.

Nevertheless, despite its efforts, the Italian delegation did not succeed in having a commitment to service-level agreements (SLAs) and metrics included in the final draft.

A service-level agreement is a contract between a service provider and a buyer/user of that service (the client) that specifies the level of service that is expected during the term of their agreement. It also defines the terms of the provider’s responsibility to the client and either the type and extent of remuneration if those responsibilities are met or the extent of penalty if they are not met.

The lack of any specified translation quality metrics is a serious weakness when assessing the process of a translation service provider seeking certification to the new standard.

In any case, in 5.2.3, Linguistic aspects, the CEN standard requires

“that information about any specific linguistic requirements in relation to the translation project is registered. Such information can include requirements of compliance with a client style guide, adaptation of the translation to the agreed target group, purpose and/or final use, use of existing terminology, and updating of glossaries.”

Different types of documentation need different quality requirements. Owner’s handbooks need to read beautifully as well as being technically correct. There are people who actually read them, strange as it may sound. Workshop/repair manuals need to be technically correct, but style is not important as long as it’s understandable. Most service technicians will only look up the procedure they are interested in and they only need to understand the steps they need to do. A mistranslation that causes the reader to misunderstand or carry out an operation incorrectly is a serious mistake, a fail. A stylistic error in a workshop manual is a minor error, but a more serious error in an owner’s handbook. This is what the expression “fit for purpose” essentially means and explains why different metrics should be used for different types of texts.


Metrics are a set of rules that allow users to measure how much a product (the translation) meets requirements and are generally used to measure performance. The primary goal of measuring, of course, is to create a standard against which something can be judged. What’s often forgotten is that metrics can be used not only to measure performance, but also to identify specific problems that are affecting performance.

Even in the language industry, operators live and die by metrics. Every step of a process is carefully measured, whether it is price, word counts, engineering hours, number of pages, or percentages of leverage. Costs and resources are allocated precisely to match those metrics, and the business case for language services must be substantiated through objective and verifiable metrics.

Long before Heisenberg developed his uncertainty principle, it was well known that the act of measuring influenced the system being measured. Also, measuring serves little purpose if it provides no means for improvement. Therefore, when developing a metric the aspects of quality everyone will work to improve must be defined.

Effective metrics must be objective (measurable), unbiased, and able to provide enough resolution (detail) to assess the factors that need improvement. This means that any two people who set out to calculate the value of a metric must be able to produce comparable results.

A typical metric is SAE J2450, recently elevated to a standard, whose stated goal is to provide “a tangible method for measuring the quality of translation deliverables as precisely as for any manufactured product.”

SAE J2450 provides for severe and minor occurrences of wrong terms (glossary violation or conflict with de facto standard translations), syntactic errors, omissions, word structure or agreement errors, misspelling, punctuation errors, and any linguistic errors related to the target language which are not clearly attributable to the other categories.
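The categories and severities listed above lend themselves to a weighted tally. The following is a minimal sketch in Python, not the scoring procedure published in the standard: the weights in `WEIGHTS`, the function name `weighted_score`, and the normalization by source word count are all assumptions made for the sake of the example.

```python
# Illustrative only: the category names follow the list above, but the
# weights are placeholder assumptions, not values taken from SAE J2450.
WEIGHTS = {
    "wrong_term": {"severe": 5, "minor": 2},
    "syntactic_error": {"severe": 4, "minor": 2},
    "omission": {"severe": 4, "minor": 2},
    "word_structure_or_agreement": {"severe": 4, "minor": 2},
    "misspelling": {"severe": 3, "minor": 1},
    "punctuation": {"severe": 2, "minor": 1},
    "miscellaneous": {"severe": 3, "minor": 1},
}


def weighted_score(errors, source_words):
    """Return total error points divided by the number of source words.

    `errors` is an iterable of (category, severity) pairs recorded by a
    reviewer; a lower score means fewer or lighter errors per word.
    """
    if source_words <= 0:
        raise ValueError("source_words must be positive")
    points = sum(WEIGHTS[category][severity] for category, severity in errors)
    return points / source_words


# Hypothetical review of a 350-word sample.
review = [
    ("wrong_term", "severe"),
    ("misspelling", "minor"),
    ("punctuation", "minor"),
]
score = weighted_score(review, source_words=350)
print(f"{score:.4f} error points per word")
```

Because two reviewers applying the same table to the same list of errors obtain the same number, a tally of this kind addresses the requirement that a metric give comparable results; what remains subjective is the classification of each error.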

Subjective metrics are hard to measure because their value depends as much on opinion as on demonstrable facts. Translation quality is a typical case of subjective assessment, as all translations are prone to subjective influences arising from the hermeneutic process and the translator’s personality, and reviewers and editors are subject to the same influences.

On the other hand, quality is always a very personal issue, a relative matter. Perception is everything. This also explains why translation quality is a long-debated subject causing fierce and divisive disputes between those who claim that the only key to a ‘quality’ translation is some sort of certification or accreditation scheme for translators based on academic qualifications—or equivalent—and generally combined with membership of a ‘professional’ organization, and those who argue that consistent and acceptable translation output quality can be achieved most effectively through quality-oriented process design and standardization, possibly supported by common standards.

The first argument is increasingly suspected to be based on the desire to limit access to the profession to an elect group of ‘professionals’ meeting criteria which they themselves have devised. The second argument is contested on the grounds that no metrics of quality assessment are possible, given the substantial amount of craftsmanship, creativity, and subjectivity in any translation.

Not surprisingly, the ivory-tower conception of translation as midway between a science (translation science) and an art form produces thousands of ‘graduate translators’ emerging onto the market every year, most of whom are quite unprepared for the harshness of increasingly savage competition, but confident in their in-built superiority and ability to provide ‘perfect’ translations.

If translation is a science, however, translation assessment should be one as well. Words are like stones, but translation theorists seem to deliberately forget this simple, long-standing principle. How far can Galileo’s principles on experience be applied to translation?

From the user’s perspective, a translated text should be assessed regardless of its nature, exactly as if it were a primary text. Should another approach be considered valid just because translators are so fond of themselves and of their job?

Indeed, in some respects this attitude seems due to the frustration of doing a job that is so poorly appreciated in both social and economic terms, as translation is undoubtedly one of the least remunerated jobs that can be offered to any individual with specific cultural requisites.

On the other hand, translators don’t like being told about their errors. This idiosyncrasy can be put down to human nature, often hostile to criticism, and to the importance that translators give to their job for the mental effort that they lavish or pretend to lavish on it.

Nonetheless, this attitude is prejudicial to an objective approach, which is often deemed impossible.

Translation Quality Assessment

The definition of quality as stated in ISO 8402:1994, 3.1 reads: “the totality of features and characteristics of a product or service that bear on its ability to satisfy stated or implied needs.”

Quality is also defined as an integration of the features and characteristics which determine the extent to which output satisfies the client’s needs.

Needs are not just those stated but also those implied. The most important implied need in translation is accuracy. People who use the services of translators don’t ask for an accurate translation; they just assume that it will be accurate. Another implied need is successful communication of the text’s message to the readers.

Both definitions implicitly depict the client as the best judge of the quality of a translation. Therefore, a translation is supposedly of adequate quality if the client does not complain about it, but this is a very weak argument, and indeed an unethical one that evades the professional responsibilities of translators and revisers. On the contrary, and most obviously, few clients are in a position to assess a translation knowledgeably.

In fact, the above statement is true only as long as the client has the capacity to dictate strict requirements for the service. The actual requirements, stated or implied, play a central role. These will eventually be expressed in terms of attributes.

Quality control and quality assessment are contributions to quality assurance.

Quality assurance is a planned and systematic pattern of all actions necessary to provide adequate confidence that the item or product conforms to established technical requirements. Quality assurance covers all activities, in accordance with two basic rules: “fit for purpose” and “do it right the first time”.

In translation, quality assurance is the full set of procedures applied before, during and after the translation production process, by all members of a translating organization, to ensure that quality objectives important to clients are being met.

Quality assessment is intended for establishing whether contract conditions have been met. Whereas quality control is text-oriented and customer-oriented, quality assessment is business-oriented.

Unlike quality control, which always occurs before the translation is delivered to the client, quality assessment may take place after delivery. Assessment is not part of the translation production process. It consists in identifying—but not correcting—problems in one or more randomly selected passages of a text in order to determine the degree to which it meets the agreed standards.

In ISO 8402:1994, 3.21, a defect is defined as the non-fulfillment of intended usage requirements.

The refusal to introduce SLAs and metrics in the prEN-15038 European standard draft rests on the belief that, generally speaking, the clients of a translation service do not have the necessary skills and competences to drive the provision of service through requirements and that, in effect, they rely on the service provider to deliver a certain degree of intrinsic quality.

The refusal of metrics is just a direct consequence, as there are virtually no tools available to validate compliance with standards, however unstated.

Nevertheless, since there is no ‘perfect’ translation, the intended purpose of a translation and its suitability remain the only judgment criteria which, for the sake of objectivity, must be accompanied by assessment metrics. The combination of process and output quality assessment of translation work will eventually tell simply whether it is acceptable or defective.

Therefore, translation quality assessment (TQA) criteria are to be agreed upon with the client, made the subject of requirements, and formalized in a separate document.

So far TQA has been performed on the basis of a strict correspondence between source and target texts and on intensive error detection and analysis. While this is undoubtedly the best approach from a theoretical—and maybe pedagogical—point of view, it is absolutely uneconomic as it requires a considerable investment in human resources and in time, and reduces translation to a matter of trust, which unfortunately is the case, since no technical translator trained by current university teaching methods and programs is properly prepared to meet different quality criteria.

Assuming that it is impossible to set objective ‘aesthetic’ parameters for quality translations, it is quicker and easier to formulate a generally negative judgment based on whether proper equivalence of signs exists between the source and the target texts.

Conversely, when a client (or a reviser/reviewer) rejects or dislikes a translation, three steps should be taken, which probably were not taken before accepting the job:

  1. Arrive at a full understanding of the linguistic quality expectations (requirements) of the client;
  2. Agree with the customer on a process to correct any deviations from requirements;
  3. Implement a process to prevent the same issues in the future.

Basically, linguistic quality consists of five components:

  1. Correctness;
  2. Completeness;
  3. Meaning;
  4. Terminology;
  5. Style.

Meaning can be traced for comparison: translation is supposed to allow its user to perform the same task as the original piece of text, which is almost impossible when the meaning of the two is different.

This roughly explains why style is much too often the prime cause of dissatisfaction with a translation. On the other hand, every translator makes his own choices that become apparent in any deviations from the source text; a poor translator is not the one with a questionable style, but with no style at all.

Terminology is the second cause, as translators unfortunately do tend to switch terms even if they have been instructed not to. In fact, many translators follow a code of creativity that might read as follows:

  1. I can write it better;
  2. If I can find a better term than the existing one, I will use it.

In reality, who will go over 100,000 words of translation to check for terminology changes after the translations have been delivered? However, while terminology issues can be approached in a systematic way, style is a matter of personal preference. The same goes for correctness and meaning with respect to completeness. While any translation can be roughly checked for completeness against the source text, grammar, spelling, punctuation, etc. (correctness as conformity to an approved or conventional standard, freedom from fault or error) require specific knowledge. On the other hand, they are quite often taken for granted when the job is done by a professional translator.

A detailed statement of work and an accurate style guide can be helpful—although time consuming—in most situations, possibly together with examples of do’s and don’ts.

In any case, especially for large projects, translation should and could now be considered a production process, held to the same standards as any other business. In this perspective, defects should be reproducible under the same conditions, then corrected and removed.

This approach would eventually lead to the establishment of defect tracking and assessment procedures, and thus to pass/fail criteria for sample testing.


Sir William Thomson, first Baron Kelvin, in his lecture to the Institution of Civil Engineers of May 3, 1883 stated:

“When you can measure what you are speaking about, and express it in numbers, you know something about it; but when you cannot express it in numbers, your knowledge is of a meager and unsatisfactory kind; it may be the beginning of knowledge, but you have scarcely in your thoughts advanced to the state of science.”

In the language industry quality is a much debated subject: it is part of daily conversation. The cursed triangle of quality, pricing, and turnaround time seems to take up the whole agenda.

The most commonly-asked question about quality is: how can quality be measured? To measure something, you must know what it is, and then you must develop metrics that measure it.

Metrics definition is the hardest part for people who have always thought of quality in their deliverables as a questionable subject.

The best way to assess quality remains that of measuring the number and magnitude of defects; and when defects cannot be physically removed, their features and scope must be specified. In this respect, translation quality can obviously be assessed by comparison with the source text, but if a flawed translation is quite easy to detect, at least in terms of its ‘suitability of purpose’, the quality of a fair or good translation will often be called into question by external factors such as personal taste.

The first step, then, is to establish a model or definition of quality, and translate it into a set of metrics that measure each of the elements of quality in it. Measuring things just because they can be measured is not useful. If something is not relevant to the quality model established, it is not a good use of time to develop metrics to measure it.

Striving for a single, all-encompassing metric is not only troublesome, it can be useless, as a single metric would not reveal all the problems. Creating multiple metrics that assess the various aspects of what is to be measured can help re-compose the overall framework: knowing which parts of a process work well and which ones don’t makes it possible to take measures to correct the problems.

A comprehensive set of metrics must measure quality from several perspectives and at several points during the production process, regardless of the quality model. At a minimum, metrics should tell something about:

  • Quality of the finished product;
  • Lack of quality of the finished product;
  • Quality of the process—how reliable it is to produce quality products;
  • Likelihood of achieving quality in this deliverable (predictors of quality).

The quality of the finished product corresponds to general customer satisfaction ratings, while the lack of quality can be given by defects such as technical errors; the quality of process comes from repeatability and typical predictors of quality are in-process indicators such as editing.

Levels of translation quality can be described at least in the following terms:

  • Discard;
  • Raw;
  • Standard;
  • Finishing;
  • Adaptation.

Raw translation means a translation which conveys the central meaning of the original text. There may be grammatical errors and misspellings, but the text has to be understandable. Typically, this could be translations of large amounts of scientific abstracts.

Standard translation corresponds roughly to the translations of antiquity. The original text is translated fully and the translated text is grammatically correct and reasonably fluent. The text may be awkward at times, but the contents of the original text should be understood completely from the translation. Typically, this could be a translation of a technical manual.

Finishing translation implies that the translated text is both fluent and idiomatic, and could be assimilated completely to the cultural context of the target language. One should not be able to recognize the translated text as a translation. Typically, this could be an advertisement brochure or a piece of literature.

Adaptation is not actually the direct translation of text but the production of new text based on foreign language original(s). The resultant text need not correspond sentence by sentence to the originals, and may even have omissions or re-orderings according to what the translator deems appropriate. The resultant text is expected to be in fluent language.

Most quality components can be clearly described and precisely verified. Again, what makes language so elusive is its subjective nature: for all practical purposes, individual habits and preferences far outweigh academic considerations. People can become extremely passionate about their preferences, down to endless revision rounds and pointless debate, and translation providers cannot really guarantee linguistic quality without input from the people who will ultimately judge this quality.

In other words, to have firm control over linguistic quality, the relationship between producers and users, the rules of engagement, must be defined, implemented and followed.

Rules of engagement

Because quality is so subjective, and its definition is such a relative thing, developing quality specifications for each new project is a good method for clearly setting quality parameters.

However, determining the accuracy of a text is a highly intellectual and even creative skill, and the client rarely has the knowledge of quality necessary to lay down specs by allocating resources to produce what he really will be happy with in the end. Therefore, going beyond the client’s requirements to produce what is wisely deemed of high quality always implies allocating one’s own resources.

Translation quality should be tracked from different perspectives: number of reviews and time spent on each of them, number of errors found, productivity, and suitability.

Being able to track translation defects is not only an important condition for delivering high-quality services to clients, it also provides an efficient way to evaluate vendor performance.

In fact, the reasons behind errors (why they happen) are separate from the measurement of errors and pertain to quality assurance and improvement rather than to quality control.

A process that demands multiple reviews will certainly tend to produce more accuracy than one that does not, but in the end it will prove too costly to be satisfactory, while in a quantitative vision efficiency is pivotal and is expressed in a relationship between the outcome and the resources to achieve it. In other words, resources must be proportioned to goals.

In an academic perspective, a correct translation is a translation with no errors; in a practice-oriented perspective, a correct translation is a translation where total error points result in a quality index above a desired threshold. Therefore one way to judge whether TQA on a project is complete is to measure translation defect density.
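As a hedged illustration of the practice-oriented view, defect density and a quality index can be written as follows. The notation is introduced here for the example and is not taken from any standard: W is the word count of the assessed sample, e_i the number of errors of type i, w_i the points charged for each error of type i, and T the threshold agreed with the client.

```latex
D = \frac{\sum_i e_i}{W},
\qquad
Q = 1 - \frac{\sum_i w_i \, e_i}{W},
\qquad
\text{accept if } Q \ge T .
```

For instance, a 400-word sample with 2 major errors charged 5 points each and 4 minor errors charged 1 point each carries 14 error points, so Q = 1 − 14/400 = 0.965 and D = 6/400 = 0.015, or 15 defects per thousand words. The same sample passes with T = 0.95 and fails with T = 0.97, which shows that the verdict depends on the agreed threshold as much as on the text.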

When dealing with TQA, a tool should be available to track any potential issues in a translation and guide the user in judging whether or not these issues are real, and in deciding whether to take any corrective action. For any TQA tool to work, explicit and reliable assessment criteria are required, together with sampling rules for extracting representative portions where the size and/or complexity of the entire project text makes comprehensive quality control impractical.


Sampling is a statistical procedure for accepting or rejecting a batch of merchandise or documents through the determination of the maximum number of defects discovered in a sample before the entire batch is rejected.

For an object to be measurable, it needs to be divided into definite portions that are homogeneous in size and scope, so that the number and significance of defects can be reasonably estimated and a limit set for both. Statistical sampling can be used to determine acceptability provided that acceptability criteria for inspection by attributes are set. The ISO 2859 series of standards can be used here as a reference.

Acceptance sampling is an important field of statistical quality control originally applied by the U.S. armed forces to the testing of bullets during World War II.

In acceptance sampling a sample is picked at random from the lot, and on the basis of information yielded a decision is made either to accept or reject the lot. Acceptance sampling is the middle-of-the-road approach between no inspection and 100% inspection. Its main purpose is to decide whether the lot is acceptable, not to estimate its quality, and it should be employed when:

  • 100% inspection is too costly or takes too long;
  • Time or technology limitations are constraints;
  • Lot sizes are very large and the probability of inspection errors is high;
  • Supplier’s quality history is good enough to justify less than 100% inspection;
  • Potential liability risks are high enough to warrant some form of continuous monitoring.

For acceptance sampling to be effective a lot acceptance sampling plan (LASP) must be implemented indicating the conditions for acceptance or rejection of the lot that is being inspected. These parameters are usually the number of different defectives in a sample and should vary in quantity and severity in direct relation to the importance of the characteristics inspected.
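A single sampling plan of this kind can be sketched in a few lines of Python. The sample size `n`, the acceptance number `c`, and the function names are hypothetical, and the binomial model used here assumes that the lot (for example, the segments of a project) is much larger than the sample.

```python
from math import comb


def acceptance_probability(n, c, p):
    """Probability of finding at most c defectives in a sample of n items.

    Uses the binomial model, which assumes the lot is large relative to
    the sample; p is the true nonconforming fraction of the lot.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(c + 1))


def accept_lot(defectives_found, c):
    """Apply the plan: accept when the sample holds at most c defectives."""
    return defectives_found <= c


# Hypothetical plan: inspect 50 segments, tolerate up to 2 defective ones.
for p in (0.01, 0.05, 0.10):
    print(f"p = {p:.2f}: P(accept) = {acceptance_probability(50, 2, p):.3f}")
```

Plotting `acceptance_probability` against `p` gives the operating characteristic curve of the plan, which shows how often lots of a given quality would be accepted.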

Average Outgoing Quality (AOQ) procedures are the best suited for small translation projects, since sampling is non-destructive, rejected lots are 100% inspected and all defectives in rejected lots are replaced with good units. In this case, all rejected lots are made perfect and the only defects left are those in lots that were accepted. AOQ expresses the average nonconforming fraction that is shipped to clients. In the variant where bad items are discarded but not replaced with good ones, it is given by:

$$\mathrm{AOQ} = \frac{P_A \, p \, (N - n)}{N - n p - (1 - P_A) \, p \, (N - n)}$$

where N is the lot size, n is the sample size, P_A is the probability of accepting the lot, (N − n)P_A is the number of pieces that are shipped without inspection, and p is the nonconforming fraction. The numerator is the number of bad pieces that are shipped, and the denominator is the total pieces shipped.
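The quantities just described can be turned into a small calculation. The sketch below assumes the usual textbook form of AOQ, in which the expected number of bad pieces shipped is P_A · p · (N − n) and the total shipped is either N (defectives replaced) or N minus the defectives discarded; the function name and the example figures are hypothetical.

```python
def average_outgoing_quality(N, n, p, Pa, replace_defectives=False):
    """Average outgoing quality for lots of N items inspected by samples of n.

    p is the incoming nonconforming fraction and Pa the probability of
    accepting a lot. Rejected lots are assumed to be fully inspected.
    """
    # Defectives left in the uninspected part of accepted lots.
    bad_shipped = Pa * p * (N - n)
    if replace_defectives:
        # Every defective found is replaced, so the full lot is shipped.
        total_shipped = N
    else:
        # Defectives found in samples and in rejected lots are discarded.
        total_shipped = N - n * p - (1 - Pa) * p * (N - n)
    return bad_shipped / total_shipped


# Hypothetical figures: 1,000 segments, a sample of 50, 5% defective,
# and a plan that accepts such a lot 54% of the time.
aoq_discard = average_outgoing_quality(1000, 50, 0.05, 0.54)
aoq_replace = average_outgoing_quality(1000, 50, 0.05, 0.54, replace_defectives=True)
print(f"discarded: {aoq_discard:.4f}, replaced: {aoq_replace:.4f}")
```

The two variants differ only in the denominator, so for small nonconforming fractions they give nearly the same figure.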

To make assessment criteria, methods and tools unambiguous, AQLs (Acceptable Quality Levels) can be used, allowing for tolerance and deviations (errors). AQLs should be agreed upon in an SLA and would specify the maximum percentage of non-conforming items that can be considered a satisfactory process average. Different AQLs may be designated for different types of defects. Usually, an AQL of 1% is used for major defects, and 2.5% for minor defects.

An implication of acceptance sampling is that a lot exceeding a given percentage of deviations from the AQL is unsatisfactory and must be rejected. At the same time, a high defect level (Lot Tolerance Percent Defective, LTPD) must be designated that would be unacceptable to the consumer.

AQLs imply that a certain level of non-quality exists in a product: defects remain that mar a batch even though it is deemed “acceptable”. This level represents a negotiated compromise between quality, quantity and price, even when (as is the case with translation) supply exceeds demand and the client should therefore be able to expect a flawless (zero-defect) product.

To set AQLs, a simple defect prediction technique can be implemented that separates the defects found in a translation sample into two groups. From the number of defects found in either of the two groups, but not in both, the defects that have not been found in the sample can then be estimated. This gives the approximate number of defects in the entire project.
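One possible reading of this technique is a capture-recapture estimate: two reviewers check the same sample independently, and the overlap between their findings is used to estimate how many defects both of them missed. That the two groups are the findings of two independent reviewers is an assumption, as are the function name and the example data.

```python
def estimate_total_defects(found_a: set, found_b: set) -> tuple:
    """Capture-recapture estimate from two independent reviews.

    Returns (estimated_total, estimated_undiscovered).
    """
    both = found_a & found_b
    if not both:
        raise ValueError("no overlap between reviews: estimate is undefined")
    estimated_total = len(found_a) * len(found_b) / len(both)
    discovered = len(found_a | found_b)
    return estimated_total, estimated_total - discovered


if __name__ == "__main__":
    # Hypothetical defect identifiers logged by each reviewer.
    reviewer_a = {1, 2, 3, 4, 5, 6, 7, 8}
    reviewer_b = {5, 6, 7, 8, 9, 10}
    print(estimate_total_defects(reviewer_a, reviewer_b))
```

The smaller the overlap between the two reviews, the larger the estimated number of defects still hidden in the sample.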

The Canadian federal government’s Translation Bureau developed a complex system (SICAL, Système canadien d’appréciation de la qualité linguistique, Canadian Language Quality Assessment System), to assess 400-word chunks of translations from contractors. SICAL is based on sampling and a grading scale from A (superior) to D, depending on the number of major and minor errors. The Bureau’s goal is to deliver translations at levels A and B of the SICAL standard.

In the Translation Bureau’s model, TQA is not confined to analyzing sample translations to evaluate a translator’s skills and decide whether to contract him; TQA is not a one-off task but a routine part of the production process.

SICAL implicitly allows the Translation Bureau to decide whether to penalize contractors financially, thus partially recovering costs through fees graded by pre-defined AQLs: a lower AQL gets a lower fee.

To calibrate a translation quality measurement tool or process, defects (errors) can deliberately be seeded in a translation under review. The ratio of the seeded defects found to the total number of defects seeded provides a rough estimate of the total number of translation defects yet to be found. It will then be possible to estimate what percentage of errors is not discovered, and the variance in assessing the errors discovered.
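The seeding estimate can be sketched as follows, assuming that genuine defects are detected at the same rate as seeded ones. The function name and the figures in the example are hypothetical.

```python
def estimate_from_seeding(seeded: int, seeded_found: int, real_found: int) -> tuple:
    """Estimate genuine defects from a defect-seeding exercise.

    Returns (estimated_total_real, estimated_real_remaining).
    Assumes seeded and genuine defects are equally easy to find.
    """
    if seeded_found == 0:
        raise ValueError("no seeded defect found: detection rate is unknown")
    detection_rate = seeded_found / seeded
    estimated_total_real = real_found / detection_rate
    return estimated_total_real, estimated_total_real - real_found


if __name__ == "__main__":
    # Hypothetical review: 20 defects seeded, 15 of them caught, 30 genuine defects found.
    print(estimate_from_seeding(seeded=20, seeded_found=15, real_found=30))
```

The same detection rate gives the share of errors that goes undiscovered, which is what the calibration is meant to reveal.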

Among the many erroneous assumptions about quality, uncertainty in control is probably the most obstructive. Yet a certain degree of ambiguity is inevitable if assessment goals and criteria are not explicit and objective. This is why, in the language industry, quality control is often confused with quality assurance and stretched to embrace editing. But the time and money spent on quality assurance far exceed those spent on the translation itself.

In addition, a fully-fledged quality assurance process cannot do without inspections and auditing, as quality is not the result of assessment and control procedures that can lead only to the removal of defective products. Quality is a derivative property.

In this perspective, it is not that hard to produce exactly what is requested when assessment criteria are known.

Quality Standards

The idea that quality can only be assessed against a set of specifications and requirements was introduced with the ISO 9000 quality standards.

Since then, quality has meant ‘suitability for purpose’, and a quality system should be designed to specify expected and achievable quality levels and be capable of generating a set of reports to detect deviations from a predetermined model.

Quality standards generally pertain to processes, to allow the customers of a certified company to receive the required goods or services in accordance with the agreed terms. Therefore, requirements are pivotal for measuring quality after specific auditing, testing, and inspections on distinctive and homogeneous samples.

Unfortunately, translation is rarely taught, and indeed rarely thought of, as a repetitive and reproducible process, thus making auditing or inspection largely notional tasks.

Therefore, to ensure quality, both explicit and implicit translation requirements must be addressed. In the first case, the quality level must be agreed with the client on the basis of measurable parameters. In the second, the only measurable parameter is suitability, corresponding to communication effectiveness, which is determined, in turn, by correctness and functionality.

The Four Rules of Quality

In Peter Drucker’s words:

“Quality in a product or service is not what the supplier puts in. It is what the customer gets out and is willing to pay for. A product is not quality because it is hard to make and costs a lot of money, as manufacturers typically believe. This is incompetence. Customers pay only for what is of use to them and gives them value. Nothing else constitutes quality.”

Offering a better-than-acceptable level of quality without missing any deadlines, but at a reduced cost, requires considerable process innovation.

Studies on evaluation techniques, standards to distinguish between severe and minor mistakes, and attempts to define what constitutes a good-quality translation have been argued and disputed by many scholars and industry professionals.

Quality is the responsibility of everyone in the organization and not exclusively that of the quality department, and quality improvement, contrary to traditional belief, has a cost-reducing effect. Doing it right the first time may require an initial investment, but the impact in the long term generates many advantages outside the limited framework of quality.

Quality systems hinge on four basic rules:

  1. Write down what you do;
  2. Do what you have written;
  3. Substantiate what you have done;
  4. Reflect on how to improve it.

In this view, quality is an endless work cycle; a cycle where deliverables are analyzed, proposed, developed, and delivered, then once again analyzed and elevated; a cycle of constant listening, observing, and quantifying, which will be refined and improved, producing products more responsive to the needs of the users while meeting the client’s expectations.

Therefore, quality must be planned into a project and managed over the project life. Ensuring quality means building the time for reviews into the project plan. It means taking the time to assess the needs of the user and setting aside the time to meet and come to agreement on how quality will be measured and who will measure it.

For a quality system to work, processes must be settled and described according to the principles and criteria of the standards. However, this is the main hindrance in implementing quality standards.

Nevertheless, in most cases, the path to certification leads to inefficiency awareness and, after proper adjustments, to considerable process improvements as requirements must be thoroughly defined and detailed at each stage, while the system must be set up to ensure meeting any of them.

Quality Is Money

Only by playing the game according to the financial rules can any of the daily battles over budgets be expected to be won. Corporate financial decision makers know little, and care nothing, about “intangibles”; their thumbs-up or thumbs-down is based on things they can measure, like money.

Value can be defined as the benefit of an activity minus its cost. When both benefit and cost of translation can be expressed in monetary terms, a monetary value can be calculated.

A cost figure obtained through careful benchmarking can be used with greater confidence than a rough estimate of time and materials, and as long as benchmark costs are not known, translation will continue to be regarded simply as an expense rather than an investment.

A central issue in translation is the trade-off between time and quality. There is no getting around the fact that quality takes time. Achieving accuracy in particular is time-consuming. From an economic point of view, time is money, and the faster a translation is completed, the better.

Costs can be calculated only when tasks are reliable and repeatable, and can be used to show the value added by quality. Measuring value added by translation means measuring the total value returned minus that cost. This value can be measured by measuring the change in value (the dependent variable) caused by a change in quality (the independent variable).

To the user, the cost of poor quality lies in the time and effort wasted on inaccurate or unusable translations; to the client, it lies in extra support time and the immense cost of revising translations. The greatest value-add of good quality translations, of course, is in increased customer satisfaction and the sales that it is likely to bring, both from the customer and from others who hear about the translator’s performance.

During the first international conference on specialized translation in Barcelona in March 2000, Salvador Aparicio i Paradell illustrated the following formula to calculate the real cost of a translation:

where q = quotation, t = translation, e = error rate, r = revision and a = accessories.

To guarantee quality standards, successful methods must be repeated and extended across projects, goals must be set, benchmarks must be established, records must be kept, and results must be assessed.

The value of effective communication is most frequently measured in the negative: only when there are problems with communication can figures be drawn that denote the extent of the problem. In the worst case, this negative example could be a lawsuit in which a client asks for reimbursement of several million euros or dollars because handling a machine according to the documentation has led to severe damage.

In localization, translation quality cannot be reduced to linguistic properties (attributes). For example, in Windows XP the dial-up interface prompts the user with the following box “Verifying username and password…” (34 characters). In the Italian version, this became “Verifica della password e del nome utente in corso…” (53 characters, +56%), but the string is truncated:

Again, when recovering after the data file has not been closed properly, the Outlook XP interface prompts the user with the following box:

In both cases, the translation is linguistically acceptable, even though a better rendering of ‘left’ than ‘rimanenti’ could have been found, but the presentation impairs it. Not surprisingly, a common element in the general public’s distrust of open source software (OSS) is reportedly the quality of localization performed by amateurs rather than by professional translators, which makes OSS seem fit only for computer geeks.
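The length problem in the dial-up example can be caught before release with a simple check, sketched below. The strings are the ones quoted above, written with three ASCII dots for the ellipsis so that the character counts match; the 40-character field width is a hypothetical limit, not a documented property of the dialog.

```python
def expansion(source: str, target: str) -> float:
    """Relative growth in length from source string to target string."""
    return (len(target) - len(source)) / len(source)


def fits(text: str, max_chars: int) -> bool:
    """True if the string fits the space available in the interface."""
    return len(text) <= max_chars


SOURCE = "Verifying username and password..."
TARGET = "Verifica della password e del nome utente in corso..."
MAX_CHARS = 40  # hypothetical width of the message field

if __name__ == "__main__":
    print(len(SOURCE), len(TARGET), round(expansion(SOURCE, TARGET) * 100))
    print(fits(SOURCE, MAX_CHARS), fits(TARGET, MAX_CHARS))
```

With these counts the Italian string is 19 characters longer than the English one, an expansion of roughly 56%, which no linguistic review alone would flag.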

Curiously, the following message box does not seem to have any negative effects on users, as ‘incorretta’ is perceived as a minor inaccuracy and is skipped, possibly with a grin for the money spent on a product that was supposed to be of superior quality.

Customer satisfaction

Finally, customer satisfaction is the other side of the coin. It is the engine and the drive of quality.

Customer satisfaction can be measured from the client’s or from the service provider’s perspective.

From the client’s perspective, the degrees of reaction to an even partially unsatisfactory service are the following:

  1. Disappointment: the client does not get what he really wanted;
  2. Allowance: the client accepts a product whose quality is lower than expected;
  3. Trade-off: the client adjusts his expectations;
  4. Settlement: client’s needs are met, but desires are not;
  5. Tuning: the client changes his behavior to match the offering.

From the service provider’s perspective, the same scale steps through the following grades:

  1. Fulfillment: the provider meets his client’s expectations by giving him what he asked for;
  2. Satisfaction: client’s expectations have been met;
  3. Efficiency: typical offering has met client’s expectations;
  4. Equalization: operating efficiency is improved by leveling offering;
  5. Massification: clients are trained to ask for what is offered.

Quality is always listed as the highest priority over deadlines, cost, and customer service. Nevertheless, trustworthiness is fundamental as most clients typically use only one vendor, and little time, and money, is allocated to translation activities.

For customer satisfaction to be measured in translation services, the relevant attributes must be determined such as confidence, courtesy, friendliness, responsiveness, complaint handling, and reliability.

To track customer satisfaction, a regular survey is necessary that provides a statistical measurement of the inbound and outbound deviations from the negotiated service level reported by clients. Tracked over time, this is a reliable quality index, and it can be associated with in-process metrics to measure the effectiveness of reviews and of the process over time.

On the other hand, translation is an intangible service, circumstances of execution are always different, and many factors can negatively impact the value of customer satisfaction and eventually bring bias to the results of surveys that can undermine a vendor’s effectiveness.

Therefore, it is necessary to guard against excess: the effort put into achieving customer satisfaction is sometimes extreme, even counterproductive, because some of the expectations ascribed to the client have not been confirmed by any analysis. In these cases, there is a major risk of focusing on issues that the client may be unaware of and are immaterial while leaving real issues unresolved and actual expectations unsatisfied.

The Canadian federal government’s Translation Bureau admits that the ultimate test of the quality of a translation is client satisfaction. To measure client satisfaction and quality of translations the Translation Bureau implemented a “Continuous Evaluation System” based on sampling and periodic surveys.

A survey, by its nature, cannot measure the emotional feeling toward an intangible service, and even with a standard set of rules, judgments will differ, as interviewees will naturally base their feelings on different projects, which were done by different teams in different locales under different conditions. This also means that the wider the sample base, the more inconsistent the results of the survey will be, clashing with the fundamentals of statistics, a science where accuracy relates largely to the size of the sample. The smaller the sample size, the greater the sampling error.
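The statistical side of this tension can be illustrated with the usual margin of error for a surveyed proportion, sketched below. The normal approximation, the 95% confidence factor of 1.96 and the example satisfaction rate are assumptions chosen for illustration.

```python
from math import sqrt


def margin_of_error(proportion: float, sample_size: int, z: float = 1.96) -> float:
    """Approximate margin of error for a proportion observed in a simple
    random sample (normal approximation; z = 1.96 for about 95% confidence)."""
    return z * sqrt(proportion * (1 - proportion) / sample_size)


if __name__ == "__main__":
    # Hypothetical survey in which 90% of respondents declare themselves satisfied.
    for n in (10, 50, 200, 1000):
        print(n, margin_of_error(0.90, n))
```

The margin shrinks only with the square root of the sample size, so a vendor with a few dozen clients cannot expect a precise satisfaction figure from a single survey.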

In addition, a deeply unhappy client who is not in a long-term relationship that it is determined to preserve will find it uneconomic to report its dissatisfaction to the vendor and will likely choose a new vendor right away. Also, clients tend to remember, and report, only major problems, which weigh heavily on overall satisfaction.

Finally, as Jeffrey Gitomer, the sales guru, put it,

“Boasting about a near-perfect customer-satisfaction rating of 97.5 percent is a major mistake. That means 2.5 percent of your customers are mad, and they’re telling everyone. And 97.5 percent of your customers will shop anyplace the next time they go to market for your product or service.”

When running a customer satisfaction survey, even though virtually all clients are satisfied, they can eventually go for a competitor whom they also find satisfactory. Therefore, in creating customer satisfaction surveys, questions should be asked about expectations along with satisfaction.

Measurements could then help predict the quality of the final completed product before actual completion. In-process metrics must be developed by watching trends over time and correlating these trends with final quality.

This process of continuous improvement is called kaizen, from the Japanese management concept for incremental adjustments introduced by Taiichi Ohno, who was the assembly manager for Toyota in the 1940s and early 1950s and developed many improvements that eventually became the Toyota Production System.


In the 1950s, after Deming’s teaching on Statistical Process Control (SPC), the Japanese Toyota engineer Taiichi Ohno developed the Toyota Production System (TPS). The TPS was primarily based on Deming’s paradigm for quality management, the PDCA (Plan-Do-Check-Act) approach.

In the “plan” stage, the objectives and processes necessary to deliver results in accordance with the specifications are established; in the “do” stage the processes are implemented; in the “check” stage, the processes and results are monitored and evaluated against objectives and specifications, and the outcome is reported; finally, in the “act” stage, actions are applied to the outcome for necessary improvement.

Ohno flavored it with the Japanese taste of kaizen. The kaizen method of continuous incremental improvements is based on traditional Japanese philosophy, assuming that every aspect of our life deserves to be constantly improved. Kaizen literally means change (kai) to become good (zen).

When applied to the workplace kaizen means continuing improvement involving everyone in an organization working together to make improvements ‘without large capital investments’. The focus is on eliminating waste in all systems and processes of an organization.

The key elements in the kaizen strategy are the willingness to change, never-ending efforts for improvement, and communication.

Quality improvement and cost reduction are, in fact, compatible since quality is the responsibility of everyone in the organization and not exclusively that of the quality department. This means that everyone involved in a project should monitor the quality at every stage of the process.

Organized kaizen activities lead to the TQM (Total Quality Management) approach for improving performance.

Although TQM principles are, in fact, the foundations of the ISO 9000 series of standards, in 2000, amidst complaints of ISO 9000 undermining world-class thinking, Toyota moved back to the TPS.

Incremental improvements have a cost-reducing effect. The long-term impact of the doing-it-right-the-first-time philosophy generates many advantages outside the limited framework of quality, despite an initial investment.

Traditionally, in order to verify the quality of a translation, a revision by a second translator is carried out, a practice that is certainly costly and time-consuming, especially because this work has traditionally been performed by senior translators.

Eliminating most of the repetitive, measurable and predictable (formal) mistakes in advance would considerably reduce the time required for proofreading and correction work afterward, and what is measurable is also traceable.
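A minimal sketch of such an automated check on formal mistakes follows. The three rules (double spaces, numbers that differ between source and target, end punctuation that differs) are illustrative choices, the function name is hypothetical, and the number comparison deliberately ignores locale-specific formats such as 1,000 versus 1.000.

```python
import re

# Integers and simple decimals such as 30, 2.5 or 1,000.
NUMBER = re.compile(r"\d+(?:[.,]\d+)*")


def formal_issues(source: str, target: str) -> list:
    """Return the formal problems found in a translated segment."""
    issues = []
    if "  " in target:
        issues.append("double space")
    if sorted(NUMBER.findall(source)) != sorted(NUMBER.findall(target)):
        issues.append("number mismatch")
    source_end = source.strip()[-1:]
    if source_end in ".!?:" and target.strip()[-1:] != source_end:
        issues.append("end punctuation mismatch")
    return issues


if __name__ == "__main__":
    print(formal_issues("Wait 30 seconds.", "Attendere  3 secondi"))
```

Each issue is tied to a segment and a rule, so the results can be counted and traced over time, leaving the reviser free to concentrate on meaning and style.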

Therefore, the clever project manager’s motto should be “deliver quality on time and within budget” and this goal can be achieved only through a combination of people, process, and technology.

The Teacher’s Role

Students should be taught to devise and implement an overall project strategy. A project strategy makes translation requirements easier to collect and understand, and even makes them apparent when they are not stated.

The lack of standards, numbers, or ratios for quality breeds ambiguity, since students, as future translators, are expected to deliver quality from the start but will hardly find anybody capable of defining it. In fact, in translation classes educational goals are explicit, but how they will be pursued and monitored is left unsaid.

Teachers are then called to play the unpleasant role of editors or reviewers who, no matter how necessary their corrections may be, send translators a message that sounds nasty to their ears: “You write poorly” or even “I write better than you do.”

All translators eventually have to deal with editors or reviewers, but are rarely taught to see them as partners in a collaborative endeavor to improve their work. To do this, students must be taught teamwork and to fight the typical translator’s disorder, the self-referential attitude that comes from isolation. This attitude sometimes leads editors, who possibly once were translators, to earn a deserved reputation for making changes purely to demonstrate their authority. Even more frustrating is that some people gladly concede quality issues to the translator’s expertise during the project, but then turn into fierce critics after delivery.

Therefore, teachers should help students persuade themselves always to seek out an editor’s assistance, which will make their lives easier once they become translators. In addition, by making error pick-up, assessment and editing criteria explicit, the teacher can help students reduce subjectivity in judgment and learn how to develop their own metrics when reviewing or editing a translation. Also, telling students what the teacher expects to receive from them amounts to stating requirements and making metrics explicit, thus making assessment transparent.

Finally, translation courses generally lack an ‘economic’ approach with the associated investigation of the cost of errors, thus eluding the problem of translation sustainability.

The economic sustainability of a translation must hold for the client as well as for the translator. It is to some extent equivalent to allocative efficiency: bringing about the best outcome for all people by deriving the largest possible utility from any given set of resources.

Pricing strategies are crucial in this respect, as different requirements/jobs with different AQLs call for different offers. Also, tools that reduce source content in order to cut translation costs are increasingly widespread; students must therefore be taught to take full advantage of appropriate technology to improve efficiency, use of resources and costs, and to guarantee economic sustainability through standardization and large-scale use, reliability, and affordability.

The Market’s Role

In most cases translation does not pertain to the core business of the client, who therefore considers it to be a non-critical purchase. Combined with the complexity of the supply market, the increasing competition, and a more professional purchasing behavior, all this results in the perception of translation as a commodity. One indicator is the practice of auctioning for the assignment of translation projects.

Translation is often at the end of a supply chain where all parties assume the preceding ones have done their task to the best of their ability. Therefore, even though translation might not be considered a commodity, it is treated as irrelevant; it is simply expected to be there.

On the other hand, according to the first law of socio-economics, in a hierarchical system, the rate of pay for a given task increases in inverse ratio to the unpleasantness and difficulty of the task. Therefore, vendor selection is usually based on generic business benchmarks rather than on the specific skills required to handle the translation process.

In an ideal market, suppliers do not control markets, clients do, and no client is willing to pay for poor quality products, even though customer power (pressure) is almost always exerted on prices, thus making a lack in quality a tangible element that must be taken into account when calculating a company’s profit margin.

Clients are interested in getting the lowest price possible while retaining the best service providers; conversely, vendors are motivated to get a fair price for their services and to resist price pressure.

Since auctioning is aimed at driving the price down, stressing quality makes no sense when the main parameter is the discount floor. The acceptable price is the one that shows balance with the service capability, rather than with the value of the auctioned item. According to the mathematician and Nobel laureate John Nash’s theory of non-cooperative equilibrium, presuming that market players on average estimate the value of the item and their bids correctly, the winning bid produces lower than feasible or even negative profit. Winning against a number of rivals following similar bidding strategies implies that the winner’s estimate is an overestimate of the item’s value or underestimate of a feasible contract bid conditional on the event of winning.
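The effect described above can be illustrated with a small simulation of reverse auctions. It is a sketch under strong assumptions: every bidder faces the same true cost, estimates it with unbiased random noise, bids exactly that estimate without any markup, and the lowest bid wins. The function name, the cost figure and the noise level are hypothetical.

```python
import random


def average_winning_bid(
    true_cost: float = 100.0,
    noise_sd: float = 10.0,
    bidders: int = 5,
    rounds: int = 2000,
    seed: int = 42,
) -> float:
    """Average lowest bid over many simulated reverse auctions."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    total = 0.0
    for _ in range(rounds):
        # Each bidder's estimate is right on average, but noisy.
        estimates = [rng.gauss(true_cost, noise_sd) for _ in range(bidders)]
        total += min(estimates)  # the lowest bid wins the contract
    return total / rounds


if __name__ == "__main__":
    print(average_winning_bid(bidders=1), average_winning_bid(bidders=5))
```

Although each estimate is correct on average, the winning bid is systematically the most optimistic one, so with several rivals it tends to fall below the true cost of doing the job.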

Nobody is interested in driving the price below sustainability level because this could put the supplier’s reliability and capability to invest at risk while dropping considerable extra costs and risks on the client’s side. The extra costs involved in working with cheaper suppliers are for additional monitoring, while the extra risks concern rework and delays.

Only lower quality can be bought at a lower price; this is an old and well-known truth. A lower price often means that the supplier overestimated his capabilities, is probably working below his sustainability level and therefore has no reliability reserve or guarantee. This means that the quality could eventually be lower than expected.

Pecunia non olet, but translators are so focused on their ‘art’ that they pretend to ignore it or, worse, forget it, even though, when translation is not just a second-job option, money is and must be a priority.

In short, quality must be proportioned to profit and translators should be taught to think of their job in terms of making a living, and not as a form of art, thus priceless per se.

In 2002, a study by ABI (Allied Business Intelligence) estimated the world translation market at $7 million and the European translation market at $3.1 million. ABI also forecast growth of the global translation market from $13 billion in 2000 to approximately $22.7 billion at the end of 2005.

According to the same estimates, the publishing industry covers less than 5% of the market. Simply stated, and regardless of the roughly 40% cut in rates, this means that literary translation does not pay. Therefore, all estimates point to a market where economic sustainability is in the interest of both parties, the translation buyer and the translation provider.

In literary translation, ‘poetic’ attributes prevail over functional features, thus exposing any assessment to severe subjective interference. Also, in literary translation ‘historicization’ is crucial, being the only acceptable filter for assessment.

Theorizing is a license to elude these questions and renege on verifications, thus justifying the otherwise factious dichotomy between ‘practitioners’ and ‘theorists’, while the former need the latter for the development and enunciation of reference models, and the latter need the former to verify the correctness and relevance of their models.

And since quality is a shared effort, any impediments in the quest for quality should be tackled in partnership with the client. This will also increase the client’s understanding of why translations are priced differently for different types of manuals, among other things.

Briefly, translators should learn to speak their client’s language (business) to explain how their services are different. In quality.


  • Al-Qinai J., Translation Quality Assessment: Strategies, Parameters, and Procedures, in Meta: Journal des traducteurs, XLV, 3, 2000
  • Allied Business Intelligence, Language Translation, Localization and Globalization: World Market Forecasts, Industry Drivers and eSolutions, ABI Research, Report code RR-TRAN, 2002
  • Baker M., Quality of translation, in Encyclopedia of Translation Studies, Routledge, 1998
  • Bonthrone R., Screams in the Quality Jungle, in LISA Newsletter, Vol. V, No. 3, September 1996
  • Brunette L., Towards a Terminology for Translation Quality Assessment: A Comparison of TQA Practices, in The Translator Vol. 6, No. 2, 2000
  • Drucker P., Innovation and Entrepreneurship, Collins, 1993
  • Eckersley H., Systems for Evaluating Translation Quality, in Multilingual Computing & Technology, #47 Volume 13 Issue 3, April/May 2002
  • Gitomer J., Customer Satisfaction Is Worthless, Customer Loyalty Is Priceless: How to Make Customers Love You, Keep Them Coming Back and Tell Everyone They Know, Bard Press, 1998
  • Hönig H., Positions, Power and Practice: Functionalist Approaches and Translation Quality Assessment, Multilingual Matters, Volume 4 Number 1, 1997
  • House J., Translation quality assessment: A model revisited, Gunter Narr, 1997
  • International Organization for Standardization, ISO 2859-0:1995 Sampling procedures for inspection by attributes—Part 0: Introduction to the ISO 2859 attribute sampling system
  • International Organization for Standardization, ISO 2859-1:1999 Sampling procedures for inspection by attributes—Part 1: Sampling schemes indexed by acceptance quality limit (AQL) for lot-by-lot inspection
  • International Organization for Standardization, ISO 2859-2:1985 Sampling procedures for inspection by attributes—Part 2: Sampling plans indexed by limiting quality (LQ) for isolated lot inspection
  • International Organization for Standardization, ISO 2859-3:1991 Sampling procedures for inspection by attributes—Part 3: Skip-lot sampling procedures
  • International Organization for Standardization, ISO 2859-4:1999 Sampling procedures for inspection by attributes—Part 4: Procedures for assessment of stated quality levels
  • International Organization for Standardization, ISO 8402:1994 Quality management and quality assurance – Vocabulary
  • International Organization for Standardization, ISO 9000:2005 Quality management systems – Fundamentals and vocabulary
  • Larose R., Méthodologie de l’évaluation des traductions, in Meta: Journal des traducteurs, XLIII, 2, 1998
  • Lauscher S., Concepts of Translation Quality and Quality Assessment, in Proceedings of the 39th Annual Conference of the American Translators Association, 1998
  • Ling Koo S. & Kinds H., A Quality-Assurance Model for Large Projects, in Sprung R., Translating into Success: Cutting-Edge Strategies for Going Multilingual in a Global Age, John Benjamins Publishing Co., ATA Scholarly Monograph Series, 2000
  • Mossop B., Editing and Revising for Translators, St. Jerome Publishing, 2001
  • Muegge U., Translation Contract: A Standards-Based Model Solution, Authorhouse, 2005
  • Nash J., The Essential John Nash, Princeton University Press, 2001
  • Ohno T., Toyota Production System: Beyond Large-scale Production, Productivity Press, 1995
  • Picken C. (ed.), ITI Conference 7 Proceedings, Quality-Assurance, Management and Control, ITI, 1994
  • Reiss K., Translation Criticism – The Potential & Limitations. Categories and Criteria for Translation Quality Assessment, American Bible Society, 2000
  • The Society of Automotive Engineers, SAE Standard J2450 – Translation Quality Metric, SAE International, 2005
  • Schaffner C., Translation and Quality (Current Issues in Language and Society), Multilingual Matters, 1998
  • Schiaffino R. & Zearo F., Translation quality measurement in practice, 46th ATA Annual Conference Proceedings, 2005
  • Williams M., Translation Quality Assessment: An Argumentation-Centred Approach, University of Ottawa Press, 2004


Author: Luigi Muzii
