Even if most of us can’t travel as we once did, the world is a more accessible place, at least online. Business people may not be attending international conferences or flying around the world for meetings so much. But in many respects, businesses are finding that they can access and develop foreign markets by localizing their websites and apps to speak the language of locals and adapt to their standards. Developers can play an integral role in the translation and localization process.
Google Translate API as the standard for machine translation
Developers can hop on this trend, increase the value of their software, and expand the online capabilities of their apps. Google has emerged as a leader in machine translation over the last decade, leveraging advances in AI-driven neural network tech. These advances power Google Translate, well known to businesses and consumers alike as a leading provider of reliable and cost-efficient machine translations. We’ll explore here how developers can add Google-powered translations to their apps by leveraging the company’s Translate API.
Before diving into the software weeds, it’s worth noting alternatives to the Google Translate API route. Third-party conversion tools like Zapier and IFTTT let you link your software workflow to auto-translation modules via webhooks and web services, with a minimum of coding. Even a tech-savvy non-programmer should be able to implement these solutions. The main drawback, however, is that you are likely to settle for a translation engine inferior to the one offered by Google. If you want to exploit the core linguistic superiority of its translation offerings, best to read on.
What Does Google Offer for Translation?
Google is a pioneer in both language technology and machine learning, the two L-words representing two sides of the same coin. A language has to be learned before it can be translated, and machine learning is how software does that learning. These days the learning is driven by neural machine translation (NMT), which rose to prominence around the 2016 WMT (Workshop on Machine Translation), bringing a “paradigm shift” in translation tech. From that year forward, NMT has been the preferred method of machine translation.
Back in 2006, Google started training its translation algorithm by digesting tens of millions of words extracted from translated documents of the European Union parliament and the United Nations. An article from Wired recounts how Google’s AI innovations revolutionized machine translation. Today Google faces competition from Facebook, which is using what it learns from the comments and posts of its 2 billion users to translate more casual conversations, including rendering LOL and WTF in scores of languages. Neural machine translation continues to be the way to go.
Happily, the language learning process didn’t stop with bureaucratese and emoji. Google Translate today supports over 100 languages, several dozen with voice support. You can talk in one language and get the translation vocalized in another, usually with a choice of voice. And, as we will see, machine learning has been productized so that you can effectively translate a domain-specific language of your own.
Will AI put an end to the translation industry?
While using the Google Translate API is a cost-effective way of adding another layer of usability and accessibility to websites and apps through multilingual support, it is far from perfect. A report released by Tomedes on the top translation industry trends for 2020 states that AI translation is still far from replacing human translation.
Even with its strong capabilities, Google’s translation API still has glaring limitations. First, the quality of translation drops as sentence structures become more complex and as figurative language, slang, or colloquialisms are introduced. For most websites, creative copy is a prerequisite, and Google Translate will struggle to maintain the essence of the message when translating it from one language to another. Another major limitation is support for more obscure languages. Google can translate most of the world’s major languages, but it is unlikely to translate rare languages and dialects accurately, which makes truly universal accessibility an unlikely reality.
This is why it is important for developers to work with in-house or agency content and marketing teams to create a localization strategy that best fits their needs. If the website’s or app’s language is simple, Google Translate will suffice. But if the content requires more nuance in translation, it may be best to build the strategy around machine translation post-editing, which unites the speed and efficiency of machine translation with the quality and accuracy of human translation.
Getting Started with Google’s Translate API
Google promotes its API as fast and dynamic, adaptable to diverse content needs. The company markets not just to professional coders but to a broader spectrum of users, including those with “limited machine learning expertise” who can quickly “create high-quality, production-ready models.” Maybe not grandma, but perhaps her granddaughter.
For the latter, you can just upload translated language pairs (a structured list of words/phrases with their translations) and AutoML Translation will train a custom translation model. As reported in ProgrammableWeb, the workflow allows either customized-by-the-client or pre-trained (by Google) inputs. To translate an English product description into French, Korean, and Portuguese, for example, you could customize a Google AutoML model for French and rely on an off-the-shelf pre-trained model for Korean and Portuguese. Then you would simply upload your English HTML file to Google Cloud Storage and send a batch request to the Translation API pointing to your AutoML model and the pre-trained models. Google’s AutoML Translation would then output your HTML in three separate language files to your Cloud Storage.
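To make that batch workflow a little more concrete, here is a minimal Python sketch that only assembles the URL and JSON body for such a request; it does not send anything. The field names follow our understanding of the v3 batchTranslateText REST method and should be checked against Google’s current reference before use. The project ID, bucket paths, and model ID are hypothetical placeholders, not values from this article.

```python
def build_batch_request(project_id, input_uri, output_prefix, custom_models,
                        targets, source="en", location="us-central1"):
    """Assemble the URL and body for a batch translation request.

    custom_models maps a target language code to an AutoML model ID.
    Targets without an entry are left to Google's pre-trained model.
    Field names are an assumption based on the v3 REST reference.
    """
    parent = f"projects/{project_id}/locations/{location}"
    body = {
        "sourceLanguageCode": source,
        "targetLanguageCodes": list(targets),
        "inputConfigs": [
            {"gcsSource": {"inputUri": input_uri}, "mimeType": "text/html"}
        ],
        "outputConfig": {"gcsDestination": {"outputUriPrefix": output_prefix}},
        # Only languages with a custom model get an explicit model path.
        "models": {
            lang: f"{parent}/models/{model_id}"
            for lang, model_id in custom_models.items()
        },
    }
    url = f"https://translation.googleapis.com/v3/{parent}:batchTranslateText"
    return url, body


if __name__ == "__main__":
    # Hypothetical project, bucket, and model names, for illustration only.
    url, body = build_batch_request(
        "my-project",
        "gs://my-bucket/product.html",
        "gs://my-bucket/translated/",
        {"fr": "MY_FRENCH_MODEL_ID"},
        ["fr", "ko", "pt"],
    )
    print(url)
    print(body["models"])
```

In this sketch French is routed to the custom model, while Korean and Portuguese are simply omitted from the models map so that the service falls back to its defaults.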
Training is key, but there is no need for diapers and baby wipes: this toddler comes pre-trained to render 100+ languages. If you have a domain-specific lexicon (medical or legal terms, for example), it requires just a little more training and tweaking of the basic API, if a suitable model doesn’t already exist. A Glossary lets users “wrap” proprietary terminology that should not be translated (like brand and product names) so that it stays intact during translation. There is also a Media Translation API, which handles real-time, low-latency streaming of audio translations.
The process is essentially 1-2-3: Upload a Language Pair. Train AutoML. Evaluate.
This translation power is not free, but the pricing’s pretty painless. Typically, you’ll be using Google’s Translate API and its Media Translation API (if you need voice support). You’ll need the AutoML service only if you need to train more language pairs.
The fee for the Translate API is $20 per million characters. The Media Translation API will set you back $0.068 to $0.084 per minute. AutoML is a bit pricier, costing $45 per hour for training a language pair, to a max of $300/pair. Pay only for what you use, as you use it. Google is patient: it wants to get you hooked, so it throws in free processing as you get up to speed, with a full year to practice before needing to pay up.
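For budgeting, the arithmetic behind those figures can be sketched in a few lines of Python. The rates below are simply the ones quoted in this article, hard-coded as assumptions; Google’s pricing may have changed, and free-tier credits are ignored here.

```python
# Rates as quoted in this article (assumptions; check current pricing).
TRANSLATE_USD_PER_MILLION_CHARS = 20.0
AUTOML_USD_PER_TRAINING_HOUR = 45.0
AUTOML_USD_MAX_PER_PAIR = 300.0
MEDIA_USD_PER_MINUTE_RANGE = (0.068, 0.084)


def translate_cost(characters):
    """Cost in USD of translating the given number of characters."""
    return characters * TRANSLATE_USD_PER_MILLION_CHARS / 1_000_000


def automl_training_cost(hours):
    """Cost in USD of training one language pair, capped at the per-pair maximum."""
    return min(hours * AUTOML_USD_PER_TRAINING_HOUR, AUTOML_USD_MAX_PER_PAIR)


def media_cost_range(minutes):
    """Lowest and highest cost in USD for the given minutes of audio."""
    low, high = MEDIA_USD_PER_MINUTE_RANGE
    return minutes * low, minutes * high


if __name__ == "__main__":
    print(translate_cost(1_500_000))   # 30.0
    print(automl_training_cost(10))    # 300.0, because the cap applies
```

So a site with 1.5 million characters of copy would cost about $30 per target language at the quoted rate, before any free allowance.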
Setting up for Your First Translation
The RESTful Translate API is the easiest way to get started. Google offers two ways to get set up, basic and advanced. If you’ve set up for any Google API, you probably know the drill, more or less, and may already have a Cloud Console Account. Assuming this is true, the next things you need to do, if you haven’t already, are:
- Create or select your project.
- Enable the Cloud Translation API.
- Create a service account.
- Download your private key in JSON format.
- Keep the full path to this file for the next step.
Go to the shell prompt on your macOS, Linux, or Windows (PowerShell) system and set the environment variable GOOGLE_APPLICATION_CREDENTIALS to the path of your JSON service account key using the following commands, replacing [PATH] with the path of the JSON file that contains your key. This variable applies only to the current shell session. If you open a new session, you’ll need to set it again.
If you’re using Linux or macOS:
export GOOGLE_APPLICATION_CREDENTIALS="[PATH]"
For Windows, in PowerShell:
$env:GOOGLE_APPLICATION_CREDENTIALS="[PATH]"
Or, from a command prompt:
set GOOGLE_APPLICATION_CREDENTIALS=[PATH]
Then install and initialize Google’s Cloud SDK. Depending on which operating system you’re using, the Cloud SDK may have a dependency on a version of Python that isn’t installed on your system. So be sure to double-check the Cloud SDK documentation to ensure the appropriate version of Python is installed. You’re off to the races!
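Before moving on, it can save some head-scratching to confirm that the environment variable really points at a usable key. The following is a minimal sketch using only the Python standard library; the function name is our own, and the list of required fields reflects what a service account key file normally contains rather than an official validation routine.

```python
import json
import os

# Fields a service account key file normally contains.
REQUIRED_KEY_FIELDS = ("type", "project_id", "private_key", "client_email")


def check_credentials(env=None):
    """Return the project ID from the key file, or raise RuntimeError."""
    env = os.environ if env is None else env
    path = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not path:
        raise RuntimeError("GOOGLE_APPLICATION_CREDENTIALS is not set in this session")
    if not os.path.isfile(path):
        raise RuntimeError(f"No key file found at {path}")
    with open(path, encoding="utf-8") as handle:
        key = json.load(handle)
    missing = [field for field in REQUIRED_KEY_FIELDS if field not in key]
    if missing:
        raise RuntimeError(f"Key file is missing fields: {', '.join(missing)}")
    if key["type"] != "service_account":
        raise RuntimeError(f"Unexpected key type: {key['type']}")
    return key["project_id"]


if __name__ == "__main__":
    try:
        print("Credentials look usable for project:", check_credentials())
    except RuntimeError as error:
        print("Credentials problem:", error)
```

The check only inspects the file locally; it does not prove that the key is still valid on Google’s side.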
Executing Your First Translation
Make a Translation API Request with a REST call using the v2 translate method.
- Use curl to make your request to the https://translation.googleapis.com/language/translate/v2 endpoint.
The command includes JSON with (1) the text to be translated (q), (2) the language to translate from (source), and (3) the language to translate to (target).
Source and target languages are identified with ISO-639-1 codes. In this example, the source language is English (en), the target is French (fr). The query format is plain “text”.
The sample curl command uses the gcloud auth application-default print-access-token command to get an authentication token.
curl -s -X POST -H "Content-Type: application/json" \
  -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
  --data '{
    "q": "The quick brown fox jumps over the lazy dog",
    "source": "en",
    "target": "fr",
    "format": "text"
  }' https://translation.googleapis.com/language/translate/v2
The response should resemble the following:
{
  "data": {
    "translations": [
      {
        "translatedText": "Le renard brun rapide saute par-dessus le chien paresseux"
      }
    ]
  }
}
Congratulations, you fox! You’ve sent your first request to the Cloud Translation API!
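The same request can be made from code. Here is a minimal Python sketch using only the standard library: it builds the JSON body described above, posts it to the v2 endpoint, and pulls the translated strings out of a response shaped like the sample. The function names are our own, and the access token is assumed to come from the same gcloud command the curl example uses.

```python
import json
import urllib.request

ENDPOINT = "https://translation.googleapis.com/language/translate/v2"


def build_request_body(text, source, target, text_format="text"):
    """Build the JSON body: q, source, target, and format."""
    return {"q": text, "source": source, "target": target, "format": text_format}


def extract_translations(response):
    """Return the translated strings from a parsed v2 response."""
    return [item["translatedText"] for item in response["data"]["translations"]]


def translate(text, source, target, access_token):
    """POST one translation request and return the translated strings.

    access_token is assumed to be the output of
    `gcloud auth application-default print-access-token`.
    """
    payload = json.dumps(build_request_body(text, source, target)).encode("utf-8")
    request = urllib.request.Request(
        ENDPOINT,
        data=payload,
        headers={
            "Content-Type": "application/json; charset=utf-8",
            "Authorization": f"Bearer {access_token}",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as reply:
        return extract_translations(json.load(reply))


if __name__ == "__main__":
    # Parsing the sample response shown above, without touching the network.
    sample = {"data": {"translations": [
        {"translatedText": "Le renard brun rapide saute par-dessus le chien paresseux"}
    ]}}
    print(extract_translations(sample))
```

Keeping the body-building and response-parsing steps separate from the network call makes them easy to test without credentials.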
Next Steps in the Translation Process
For most apps, you can rely on one of the over 100 language pairs already trained and tested. (If the pair you require is not available, or you need a custom translation, push grandma aside and resort to the AutoML training module.) The full process is as follows:
- Create a file containing the desired language pairs, following the curl example above. Choose source and target language codes (e.g., “en” or “fr”) from Google’s list of supported languages.
- Write code that reads the content of your Web site and makes a REST call to the Cloud Translation API (including a parameter pointing to your model, if you trained one), producing a translated version of that text.
- Create a new page in your content management system to contain and then display the translated text. Even better, if your CMS is programmable (either directly or by way of API), improve the code by automating this step.
- Set your CMS and Web site to display the appropriate pages when a specific language is selected by your site’s end users.
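The steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not a CMS integration: pages are written to local files named page.<language>.html (a hypothetical naming scheme), and the actual translation call is passed in as a function so that any client, REST or library, can be plugged in.

```python
from pathlib import Path


def write_translated_pages(text, source, targets, translate_fn, out_dir):
    """Translate text into each target language and write one file per language.

    translate_fn(text, source, target) must return the translated string.
    Returns a dict mapping each language code to the path that was written.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = {}
    for target in targets:
        translated = translate_fn(text, source, target)
        path = out_dir / f"page.{target}.html"  # hypothetical naming scheme
        path.write_text(translated, encoding="utf-8")
        written[target] = path
    return written


def pick_page(pages, requested_language, default_page):
    """Return the page for the requested language, or the default page."""
    return pages.get(requested_language, default_page)


if __name__ == "__main__":
    import tempfile

    # A stand-in translator so that the sketch runs without credentials.
    def fake_translate(text, source, target):
        return f"<!-- {source} to {target} -->\n{text}"

    with tempfile.TemporaryDirectory() as tmp:
        pages = write_translated_pages(
            "<p>Hello</p>", "en", ["fr", "ko", "pt"], fake_translate, tmp
        )
        print(pick_page(pages, "fr", "index.html"))
        print(pick_page(pages, "de", "index.html"))
```

In a real site, the file writes would be replaced by calls to your CMS, and pick_page would be driven by the language your visitor selects.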
Client libraries are currently available for seven popular programming languages – C#, Go, Java, Node.js, PHP, Python, and Ruby. Just install the library of your choice. Go to Translation Client Libraries for installation instructions.
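As an illustration with the Python library, a minimal sketch might look like the following. It assumes the google-cloud-translate package is installed and that GOOGLE_APPLICATION_CREDENTIALS is set as described earlier; the wrapper function names are our own, and the import is deferred so that the helper can be exercised with a stand-in client when the library is not available.

```python
def make_client():
    """Create a v2 Translation client (requires google-cloud-translate)."""
    from google.cloud import translate_v2  # imported lazily on purpose
    return translate_v2.Client()


def translate_text(client, text, target, source=None):
    """Translate one string and return only the translated text.

    format_="text" asks for plain text rather than HTML-escaped output.
    """
    result = client.translate(
        text,
        target_language=target,
        source_language=source,
        format_="text",
    )
    return result["translatedText"]


if __name__ == "__main__":
    # Needs the library, credentials, and network access to run for real.
    client = make_client()
    print(translate_text(client, "The quick brown fox jumps over the lazy dog", "fr", "en"))
```

The library takes care of authentication and request formatting, so the code is shorter than the raw REST version, at the cost of an extra dependency.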