Creating a Custom Machine Translation Engine

Selecting a baseline engine

The baseline engine is the foundation on which the project is built. Several providers offer general-purpose engines, including Google, Microsoft, Amazon and Apple, as well as smaller players in the industry. We work with you to determine which engine best suits your needs. To this end, we offer a multi-engine API and an automated testing system that allow us to select the right engine for your content and requirements.
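As an illustration of what automated engine testing can look like, the sketch below ranks candidate engines by how closely their output matches a set of reference translations. It is a minimal example under stated assumptions, not a description of our production system: the engine names and sample sentences are hypothetical, and the character-level similarity from Python's standard library stands in for an established metric such as BLEU or chrF.

```python
from difflib import SequenceMatcher
from statistics import mean


def similarity(hypothesis: str, reference: str) -> float:
    """Character-level similarity in [0, 1]; a crude stand-in for chrF or BLEU."""
    return SequenceMatcher(None, hypothesis, reference).ratio()


def rank_engines(outputs_by_engine, references):
    """Return (engine name, mean score) pairs, best engine first."""
    scores = {
        name: mean(similarity(h, r) for h, r in zip(outputs, references))
        for name, outputs in outputs_by_engine.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical test set: one reference translation and two engine outputs.
references = ["Press the red button to stop the machine."]
outputs_by_engine = {
    "engine_a": ["Press the red button to stop the machine."],
    "engine_b": ["Push red button for stopping machine."],
}

ranking = rank_engines(outputs_by_engine, references)
best_engine, best_score = ranking[0]
```

In practice the test set would contain a representative sample of the client's own content, and the scores would be read alongside a human review rather than on their own.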

Data selection and corpus preparation

The starting point for a Custom Neural Machine Translation Engine (CNMTE) is to find and use previously translated materials whose content is as similar as possible to what will be translated. The more previously translated material is available, the faster and more economical the process will be. If source and target are not already paired as translation memory units, an alignment can be performed to produce the bilingual content needed to boost the engine’s performance.
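Before bilingual content is used for training, it is usually cleaned. The sketch below shows one possible cleaning pass over source/target pairs: it normalizes whitespace, drops pairs with an empty side, drops pairs whose lengths differ implausibly (often a sign of misalignment), and removes duplicates. The function name and the `max_ratio` threshold are assumptions made for illustration; a character-length ratio suits language pairs with similar scripts and would need adjusting for pairs such as Chinese and English.

```python
def clean_pairs(pairs, max_ratio=3.0):
    """Filter (source, target) pairs; max_ratio is an illustrative threshold."""
    seen = set()
    cleaned = []
    for source, target in pairs:
        # Collapse runs of whitespace and trim both sides.
        source = " ".join(source.split())
        target = " ".join(target.split())
        # Skip units where one side is missing.
        if not source or not target:
            continue
        # Skip pairs whose lengths differ too much to be a plausible translation.
        ratio = max(len(source), len(target)) / min(len(source), len(target))
        if ratio > max_ratio:
            continue
        # Skip case-insensitive duplicates.
        key = (source.lower(), target.lower())
        if key in seen:
            continue
        seen.add(key)
        cleaned.append((source, target))
    return cleaned


# Hypothetical translation memory units (English to Spanish).
pairs = [
    ("Open the valve.", "Abra la válvula."),
    ("Open  the valve. ", "Abra la válvula."),  # duplicate after normalization
    ("Warning", ""),                            # empty target
    ("OK", "Consulte el manual de usuario antes de continuar."),  # likely misaligned
]

cleaned = clean_pairs(pairs)
```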

Next Step: Monolingual (Target) Content

If a sufficient amount of target-language reference content exists, adding it to the training data lets the engine pick up its style and terminology.

Last Step: Specialized Corpus from Additional Sources

We search the web for materials that match as closely as possible the content that will run through the engine. Investing time in finding the best-quality corpora always pays off. The same applies to bilingual data obtained from data marketplaces.

Re-training: New CNMTEs Improve with Human Post-Editing

A common configuration integrates human post-editing into the process. Under this workflow, our expert linguists edit the output of the Custom Neural Machine Translation Engine, both to improve its quality and to retrain the engine for future translations. As the reviewer corrects the output, a dynamic adaptive model incorporates those corrections, and the engine continues to improve as more translations flow through it. As a result, the quality gap between full human translation and this solution narrows, while turnaround time and costs are reduced. In our opinion, these engines will become an asset and a market differentiator for any client with recurrent needs.
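One way to check whether an engine is improving over time is to track how much the linguists have to change its output. The sketch below computes a simple post-edit rate: the word-level edit distance between the raw machine output and the post-edited version, divided by the length of the post-edited version. This is a simplified relative of metrics such as TER (it does not model word shifts), the function names are hypothetical, and it is offered as an illustration rather than as the measure used in any particular workflow.

```python
def edit_distance(a, b):
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        curr = [i]
        for j, y in enumerate(b, 1):
            curr.append(min(
                prev[j] + 1,             # delete x
                curr[j - 1] + 1,         # insert y
                prev[j - 1] + (x != y),  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def post_edit_rate(mt_output: str, post_edited: str) -> float:
    """Word edits per post-edited word; 0.0 means the output was left untouched."""
    mt, pe = mt_output.split(), post_edited.split()
    if not pe:
        return 0.0 if not mt else 1.0
    return edit_distance(mt, pe) / len(pe)


# Hypothetical segment: the linguist inserted one missing word.
rate = post_edit_rate(
    "The machine must stopped before cleaning",
    "The machine must be stopped before cleaning",
)
```

A rate that falls across successive batches of similar content suggests that retraining is reducing the amount of correction required.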