18 March 2022

8 million words, 4 days, 1 localization company

How many words do you think the complete works of Shakespeare contain?

20 million words? 13 million? 7 million?
Would you believe: fewer than a million words?
What about the Harry Potter books?

Think again.

At 1,084,170 words, all of Harry Potter’s adventures amount to a few minor skirmishes with dragons and warlocks when you compare them to the Herculean task that Alpha was set a few months ago.

One of our long-term clients – a high-end German car manufacturer that you’ll probably have heard of – set us the task of translating 8.8 million words from German into American English. A breeze if you’ve got all year, but when the brief stipulates that you’ve only got four days? It’s time to throw out your DE-EN dictionary and go back to the drawing board.


Think fast, translate even faster

So how did we do it? The smart people out there will have already said “machine translation”. And that’s true, but as the song says: “It ain’t what you do, it’s the way that you do it… and that’s what gets results”.

The task was the translation of an eDiscovery project. Basically, this is the process in which electronic data is located, provided and searched in order for it to be used as evidence in legal proceedings. Nearly nine million words of densely packed corporate, business legalese. For this task you’d not only need a machine translation engine but Harry Potter’s book of spells on top.

Alpha’s solution was not one but three machine translation engines, which we tested and tested and tested with a tiny little batch of 13.5K words taken from the 8.8 million.

Following this, we set our linguists (actual people) loose on the tester batch, translating and QA’ing to the nth degree. Then we compared and contrasted. The machine translation engines provided the brawn and our walking talking human translators provided the brains. Together: a dynamite combo.


The electronic Kool-Aid acid test

With this massive amount of data-heavy corporate/legal content, what the client wanted was a good gist of the translated content.

Naturally, we strive for perfection at Alpha with every word we translate, but this time we wouldn’t have any time for pre-preparation or post-editing by human translators. For this project we were aiming for “a comprehensible and accurate account of the essential meaning of the source text” (as defined by TAUS), together with a little Alpha flourish of ingenuity. In four days.

Once we’d tested our three trial machine translation engines to the limit and had our linguists go through the results with a fine-toothed comb, we picked the strongest contender and started to design a workflow. (Remember: four days). This workflow had to be more than agile, it had to be acrobatic.

Working closely, and nimbly, with our client we found a solution to their pre-preparation problem. It was time to dispense with all the file niceties: no tags, no formatting, just pure, unadulterated OpenText documents. These files were provided in batches and, using another piece of wizardry on our part, fed into the machine translation engine over the four days using a project content connector especially designed by one of our super-smart operations people. This clever little gizmo was then synced with a watch folder, and managed by the aforementioned super-smart operations person (day and night, we should mention – talk about going the extra mile!). As soon as new files arrived in the watch folder, they were immediately imported into Alpha’s translation platform, machine-translated and exported to the client.
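For the curious, here is roughly what a watch-folder loop of this kind can look like. This is a minimal Python sketch, not our actual connector: the function names, the `.txt` file pattern and the polling interval are all assumptions made for illustration, and `placeholder_translate` simply stands in for a call to whichever machine translation engine is in use.

```python
import time
from pathlib import Path
from typing import Callable, List, Optional, Set


def placeholder_translate(text: str) -> str:
    # Hypothetical stand-in for a real machine translation engine call.
    return "[EN] " + text


def process_new_files(watch_dir: Path, out_dir: Path,
                      translate: Callable[[str], str],
                      seen: Set[str]) -> List[Path]:
    """Translate every file in watch_dir that has not been handled yet."""
    out_dir.mkdir(parents=True, exist_ok=True)
    delivered = []
    for src in sorted(watch_dir.glob("*.txt")):  # assumed plain-text input
        if src.name in seen:
            continue  # already translated in an earlier pass
        text = src.read_text(encoding="utf-8")
        target = out_dir / src.name
        target.write_text(translate(text), encoding="utf-8")
        seen.add(src.name)
        delivered.append(target)
    return delivered


def watch(watch_dir: Path, out_dir: Path,
          translate: Callable[[str], str],
          interval_s: float = 5.0,
          max_polls: Optional[int] = None) -> None:
    """Poll watch_dir at a fixed interval; max_polls=None runs until stopped."""
    seen: Set[str] = set()
    polls = 0
    while max_polls is None or polls < max_polls:
        process_new_files(watch_dir, out_dir, translate, seen)
        polls += 1
        time.sleep(interval_s)
```

Calling `watch(Path("inbox"), Path("outbox"), placeholder_translate)` would keep checking a hypothetical `inbox` folder and write a translated copy of each new file to `outbox`. Simple polling is used here because it needs nothing beyond the standard library; a production setup would more likely rely on file-system events and would need to handle files that are still being copied in.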

8.8 million words, four days and one tired operations expert later, the job was signed, sealed and delivered.

We’re happy to say that the project was as smooth and sleek as one of our client’s latest lux car designs. In fact, our client told us: “Thanks very much for the quick turnaround and assistance on the latest sample of machine translations. We thought the quality was excellent” (except that they said it in German).