From the dawn of the AI age, we have heard a lot about how generative AI has a tendency to produce false or misleading information – often with a swaggering confidence that leaves us inclined to trust what it says. This has opened up a debate around how we should relate to such information, and where we ought to draw our red lines for the use of AI to produce and publish written materials.
For example, should we pass laws making it a requirement to label all content produced using AI so that readers can exercise appropriate source criticism? How can we teach children and young people to use these tools with caution? When should it be okay to produce informational texts via generative AI? And what checks and safeguards do we need to put in place to protect our democratic processes and integrity?
The world of translation is not immune to these issues, either. AI increasingly powers machine-translation tools, which brings with it the risk that these systems will invent information to get around tricky translation problems. Anyone who produces or uses translated content therefore needs to consider carefully the dangers that using AI can pose, and the actions we can take to mitigate them.
Below, we look at this issue in more detail and consider some of the risks involved in the overuse of automated translation solutions.
News headlines mistranslated
A study of multilingual translation systems on Facebook found that machine translation has introduced misinformation into users’ news feeds. The study examined posts translated from English to Tamil and found that up to 20% of generic translated news headlines did not accurately reflect the meaning of the source. This figure rose to 30% in the case of sarcastic or domain-specific headlines.
The paper distinguishes between incorrect translations – which may simply be translations that do not read well or which contain syntax errors – and translations that actually contain misinformation, meaning they provide a distorted view of the facts or opinion expressed in the original. The results suggest that machine translation sometimes plays fast and loose with the truth in favour of producing something – rather than just holding its hands up and recognising its own limitations. The risk, then, is that once this false information spreads, it can be hard to contain, and it may have real-world consequences.
This problem is further compounded by the fact that post-editors are often unable to dedicate time to confirming the veracity of translation outputs. This issue is recognised by another study published in Artificial Intelligence in HCI whose authors argue that it is “unreasonable to expect post-editors (PE) to devote the equivalent levels of time and effort on the MT pre-translated text as in traditional translation projects, given that PE tasks have relatively lower pay but identical, if not tighter, deadlines.”
Translations that break the law
In some cases, misinformation or mistranslations generated by automated translation tools can even cause their originators to wind up on the wrong side of the law. Perhaps one of the most striking examples of this occurred in Thailand – a constitutional monarchy with stringent lèse-majesté laws which prohibit statements deemed offensive to the king or the Thai monarchy.
In August 2020, the Thai Public Broadcasting Service published a seemingly innocuous post about the king’s birthday on Facebook. This was then auto-translated from English, and the resulting text was deemed so offensive that the Royal Thai Police launched an investigation. Facebook ended up having to deactivate its auto-translate feature in the country over the debacle. Because of the harsh penalties in place for such statements, no media outlet in Thailand has so far revealed exactly what the mistranslated text said.
Added toxicity
As well as producing inaccurate and potentially offensive information, machine translation can also alter the so-called toxicity of a text, according to a study by Meta. Toxicity in this context means “offensive utterances and bad sentiments” and can include things such as foul language as well as sexist, racist, ageist or homophobic statements.
As part of its efforts to develop a single machine-translation model for 200 languages on its platform, Meta has spearheaded a number of studies into machine translation. The phenomenon of added toxicity is one of the issues it has identified – namely cases when the translation output introduces elements of toxicity that were not present in the original.
According to a paper presented at the International Conference on Learning Representations, added toxicity can occur due to both mistranslations and hallucinations. In one example, the tool appears to struggle to translate the adjective “gangly” into Catalan. Rather than choosing an appropriate word with an approximate meaning, the system goes rogue and picks an offensive word which can be back-translated as prick, asshole, or a number of other even less palatable swear words.
Let the humans do the talking
None of this is to say that machine translation has no place. Indeed, many of the examples cited above represent pure machine translation without any involvement from human reviewers. The idea, then, is not to warn against all use of machine translation, but to cast some light on the reasons why these issues occur.
At FairLoc, we believe that we all need to consider more carefully when we use human translations and when we use machine-translation post-editing. By keeping human translators in the driver’s seat, we can be sure to swerve around these issues and ensure that the translations we produce remain accurate and true to the source.
What’s more, the FairLoc stamp serves as certification that a text has been translated by a human. It allows readers to rest assured that the words they are reading have been carefully considered by an intelligent writer. And while this doesn’t mean the text will be free from errors, it will at least be free from hallucinations, misinformation and added toxicity.
Click here to learn more about FairLoc.