Ancient Tablets and AI Translation

Translating between two languages is a matter of carefully constructing sentences that best convey the meaning of the original text. It is not a rudimentary word-for-word swap from the source language to the target language. A degree of measured flexibility is required, because what matters is the nuance of the piece as a whole rather than the individual meanings of its constituent parts.

Machine translation has been trained to do exactly this, with incremental improvements over the years. In a further showcase of the power of artificial intelligence in handling natural languages, researchers have recently trained an AI to provide approximate translations of 5,000-year-old Akkadian tablets.

The Challenge of Dead Languages

This is particularly impressive because the challenges of translation only multiply when the original language has been extinct for thousands of years. Such is the case with Akkadian, an early Semitic language with no daughter languages. Such languages are well worth studying, if for no other reason than the insight they give into the lives, politics, and beliefs of ancient societies, an effort that also sheds light on our own historical place in the world.

By conservative estimates, archeologists have discovered hundreds of thousands of Akkadian texts, and many have already been digitized. Yet only a handful of scholars can make any sort of sense of them. What they can make sense of is fragmentary: although the texts were composed on clay, which holds up much better over centuries than something like papyrus, much of the context has been lost to the degrading effects of time.

This fragmentary nature adds to the complexity of translation, and the general lack of experts in such languages compounds it. Both the time and the manpower required work against the effort to translate these documents.

AI Translators

To help ease that burden, a team of archeologists and computer scientists developed an AI to translate Akkadian. It is a neural machine translation (NMT) model designed specifically for the language, and it produces its output almost instantly.

Akkadian cuneiform is what is known as “polyvalent”; that is, the meaning of a sign varies depending on its function within the sentence. This leaves translators with a two-step process. First, they take the original script and rewrite it phonetically in another alphabet, a process known as transliteration. That is, they reconstruct each word using the letters of the Latin alphabet that most closely reproduce its sound in the original language. A common example of this process is the Arabic word for God, الله, which is transliterated as “Allah.” Once this has been done, they translate the transliterated text into the target language.
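
The two steps can be pictured with a deliberately tiny sketch. Everything in it is an assumption made for illustration: the tables are hand-made toys rather than a real sign list or dictionary, the function names are hypothetical, and the role of each sign is supplied by hand, whereas a human translator has to work it out from context. Real translation is also not a word-by-word lookup; the sketch only shows that transliteration and translation are separate stages.

```python
# Toy illustration only: the tables below are tiny hand-made assumptions,
# not a real cuneiform sign list or dictionary.

# Step 1 data: one sign can have several Latin-alphabet readings (polyvalence).
# Signs are referred to by their conventional names instead of glyphs.
READINGS = {
    "AN": {"word": "ilum", "syllable": "an"},
    "LUGAL": {"word": "šarrum"},
}

# Step 2 data: glosses for the transliterated words.
GLOSSES = {"ilum": "god", "šarrum": "king"}


def transliterate(signs):
    """Step 1: pick a Latin-alphabet reading for each (sign, role) pair."""
    return [READINGS[sign][role] for sign, role in signs]


def translate(words):
    """Step 2: map each transliterated word to a target-language gloss."""
    return [GLOSSES.get(word, f"<{word}?>") for word in words]


signs = [("LUGAL", "word"), ("AN", "word")]
latin = transliterate(signs)
english = translate(latin)
print(latin, english)
```

Note that the same sign, "AN", yields a different reading when it is used as a syllable rather than as a whole word, which is the kind of ambiguity the first step has to resolve.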

With this in mind, the NMT model has been trained to handle both the cuneiform and its transliteration. On the bilingual evaluation understudy 4 (BLEU4) metric, an algorithm designed to evaluate machine-translated text, the model scored 36.52 when translating from cuneiform and 37.47 when translating from transliteration. Both fall within the accepted range for a high-quality translation.
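
To give a rough sense of what a BLEU4 score measures, here is a minimal sketch of a sentence-level, unsmoothed BLEU-4 on a 0 to 100 scale. It is not the researchers' evaluation code: published scores are normally computed over an entire test set with standard tooling, and the function names and example sentences here are assumptions made for illustration. The idea is to count how many of the machine translation's word sequences of length one to four also appear in a human reference translation, and to penalize output that is too short.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count every run of n consecutive tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu4(candidate, reference):
    """Sentence-level, unsmoothed BLEU-4 against one reference (0 to 100)."""
    cand, ref = candidate.split(), reference.split()
    if not cand:
        return 0.0
    log_precisions = []
    for n in range(1, 5):
        cand_counts = ngrams(cand, n)
        ref_counts = ngrams(ref, n)
        # Clipped overlap: an n-gram is credited at most as often
        # as it occurs in the reference.
        overlap = sum(min(count, ref_counts[gram])
                      for gram, count in cand_counts.items())
        total = sum(cand_counts.values())
        if overlap == 0 or total == 0:
            return 0.0  # without smoothing, one empty precision zeroes the score
        log_precisions.append(math.log(overlap / total))
    # Brevity penalty: candidates shorter than the reference are penalized.
    if len(cand) >= len(ref):
        brevity = 1.0
    else:
        brevity = math.exp(1 - len(ref) / len(cand))
    # Geometric mean of the four precisions, scaled to 0-100.
    return 100 * brevity * math.exp(sum(log_precisions) / 4)


reference = "the king built a temple for the god"
print(bleu4("the king built a great temple for the god", reference))
print(bleu4(reference, reference))
```

A candidate identical to the reference reaches the maximum score, while any deviation in wording or length pulls the score down, which is why even good translations score well below 100.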

The NMT model does fall short in several respects. It does not handle longer sentences very well, and it is easily lost when given more “literary” genres, as opposed to more “formulaic” ones such as decrees and records. Shortcomings aside, it is very accurate at recognizing specific genres, which is another timesaving function.

In some cases, the model invented results that had nothing whatsoever to do with the input text. The researchers dubbed these “hallucinations” on the part of the AI.

A Collaborative Effort

Where does this leave translators? For the most part, the technology serves as a useful aid, pointing the way toward a quick and accurate translation. The bulk of the effort still has to come from a human translator: even if the translation passes the threshold, it has to be reviewed, edited, and even overhauled, processes best performed, for the time being, by a human.

At present, the NMT model is accessible through an online notebook, and the source code has been made available on GitHub under the project name “Akkademia.” The technology is available to anyone. As scholars make use of such NMT models, translating the ancient world becomes significantly easier and more accessible.

Yet when it comes to modern, widely spoken languages, services such as those offered by Trusted Translations, which employs professional translators and linguists from around the world, will go a long way toward meeting and exceeding your professional translation needs.

Photo by Bilge Şeyma Kütükoğlu at