This article was originally published on the Wordbee Blog. Wordbee are the makers of the popular translation management system and CAT Tool for translators.
Post-editing of machine translation (PEMT) wasn’t born yesterday. On the contrary, it’s as old as machine translation (MT) itself. And although a large amount of material on the topic is now available, the discussion has become so nuanced that we sometimes risk losing sight of what PEMT really is.
PEMT: No Need for Creativity
Let’s start by saying that PEMT has nothing to do with revision. Nor does it require the “creativity” of, let’s say, transcreation. In spite of all the articles written on the subject and a brand-new ISO standard, the most accurate definition of post-editing to date comes from the 2010 TAUS Post-editing in Practice report: “Post-editing is the process of improving a machine-generated translation with a minimum of manual labour.”
The keywords in this definition are “a minimum of manual labour.”
While revision is based on a contrastive analysis of source and target texts and requires the reviser to check and edit terminology, style, and grammar, PEMT is characterized by higher productivity and limited bilingual cognitive effort. The main changes concern mechanical errors (capitalization and punctuation), grammar errors, terminology inconsistencies, omissions (e.g. missing words), and other issues that often stem from a poor source text and make the target text hard to read. A post-editor is not expected to rewrite entire sentences (unless they are obvious nonsense or word salad) and should amend only what’s necessary to make a sentence clear to the reader.
The skills required of a reviser and a post-editor also differ. A reviser must have a sound knowledge of both source and target languages, of translation techniques, and of a specific domain. A post-editor, on the other hand, may even be monolingual, but must in any case have a strong knowledge of the target language and of the specific domain and, ideally, an idea of how machine translation works.
3 PEMT Approaches in Practice
PEMT and the Enterprise
Let’s take an enterprise that has developed its own engine. The PEMT task could take place in a CAT environment or, in the case of enterprises that have their own MT engines but no translation department as such, it could be entrusted to external language service providers (LSPs).
Because in this instance we’re dealing with a customized engine, the MT output can be expected to be of good to high quality. The PEMT guidelines will be very specific and rigorously based on the error typologies produced by the engine in question. They will need to indicate the level of PEMT required (light or full post-editing), the purpose of the text, and its target group. Glossaries are essential if the MT engine has just been put into use and has shown some terminological teething problems.
PEMT and the LSP
Only a few LSPs have the financial and technical resources needed to develop client-specific MT engines. Most LSPs will resort to using vertical (domain-specific) engines developed by MT technology providers and available in SaaS mode according to a pay-per-use model. The MT output will be sent to internal or external post-editors. Alternatively, post-editors might receive an API key to use a vertical MT engine in a CAT tool environment. In this specific case, post-editing becomes an interactive task.
Some LSPs will pre-translate a source text with a general MT engine, for example Google Translate or DeepL. This is a financially viable choice when starting out with MT, translating small texts, or, again, facing a lack of financial or technical resources.
In this approach, because the post-editing level and goals will change from project to project or from client to client, LSPs always need to provide information about the final use of the translation and accurate guidelines on how to conduct the task. PEMT projects may be split among many post-editors, so the specificity and strictness of the guidelines will ensure a certain level of consistency. It’s also important to provide a client-specific glossary to ensure consistent use of terminology, especially in the case of a public engine.
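A terminology check against a client-specific glossary can be partly automated. The following is only a minimal sketch, assuming the glossary is a plain mapping of source terms to approved target terms and the segments are (source, target) pairs. The function name `find_glossary_violations` and the English-to-Italian sample data are hypothetical, and CAT tools typically ship their own terminology QA features.

```python
from typing import Dict, List, Tuple


def find_glossary_violations(
    segments: List[Tuple[str, str]],
    glossary: Dict[str, str],
) -> List[Tuple[int, str, str]]:
    """Return (segment index, source term, expected target term) for each
    segment whose source contains a glossary term but whose target lacks
    the approved translation.

    Matching is a naive case-insensitive substring test, so inflected
    forms will be missed or misreported.
    """
    violations = []
    for index, (source, target) in enumerate(segments):
        source_lower = source.lower()
        target_lower = target.lower()
        for source_term, target_term in glossary.items():
            term_in_source = source_term.lower() in source_lower
            term_in_target = target_term.lower() in target_lower
            if term_in_source and not term_in_target:
                violations.append((index, source_term, target_term))
    return violations


# Hypothetical sample data, for illustration only.
glossary = {"invoice": "fattura", "purchase order": "ordine di acquisto"}
segments = [
    ("Please send the invoice.", "Si prega di inviare la fattura."),
    ("The purchase order is attached.", "L'ordine d'acquisto è allegato."),
]

for index, source_term, target_term in find_glossary_violations(segments, glossary):
    print(f"Segment {index}: expected '{target_term}' for '{source_term}'")
```

A check like this only flags candidates for the post-editor to look at. It cannot tell whether a variant is an acceptable inflection or a genuine inconsistency.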
PEMT and the Freelancer
Gone are the days of freelancers’ rage against machine translation. Nowadays, most freelancers use MT as a helpful tool that provides translation suggestions. The choice is usually Google Translate or DeepL (web version or with an API key).
There are no precise PEMT guidelines in this instance. The freelancer using one or more general MT systems is free to decide which MT tool to use, how to use it, and how much of the MT output to use. From an ethical point of view, they should inform the client about the use of a public MT engine or, in any case, ask the client whether there are specific criteria that might prevent the use of public MT engines. Think medical records, legal documents (involving sensitive or personal data; in a word, GDPR), and confidential or IP-protected documents.
One thing to remember: When using a public MT engine through an API key in a CAT tool environment, the segments containing MT output might in some cases be tagged with AT or MT, thereby revealing their origin.
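To see which segments would reveal their MT origin before delivery, a freelancer could list them. This is a rough sketch, assuming the segments have been exported as simple records with an origin label. The `Segment` class, the `machine_translated` function, and the label values are hypothetical, since how origin is stored and displayed depends on the CAT tool.

```python
from dataclasses import dataclass
from typing import List

# Assumed origin labels; the actual values depend on the CAT tool.
MT_ORIGINS = {"MT", "AT"}


@dataclass
class Segment:
    number: int
    target: str
    origin: str  # e.g. "MT", "AT", "TM", or "" for text typed by hand


def machine_translated(segments: List[Segment]) -> List[Segment]:
    """Return the segments whose origin label marks them as MT output."""
    return [s for s in segments if s.origin.upper() in MT_ORIGINS]


# Hypothetical sample data, for illustration only.
segments = [
    Segment(1, "Bonjour le monde", "MT"),
    Segment(2, "Conditions générales", "TM"),
    Segment(3, "Merci de votre visite", "AT"),
    Segment(4, "Nous contacter", ""),
]

flagged = machine_translated(segments)
print([s.number for s in flagged])
```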