Data Centralization in a Translation Management System

This article was originally published on the Wordbee website. Wordbee are the makers of the popular translation management system and CAT tool for translators.

Nowadays every company is digitizing its business activities. The Internet of Things is changing the way we work. Apps are data hungry. Artificial intelligence and analytics systems run on the same fuel. The result: companies are increasingly dependent on data, and data traffic within every company is growing fast. IDC calculates that by 2025 the traffic generated within companies will amount to 175 ZB of data (1 ZB is 1,000 billion gigabytes).

And that’s not all. Many operational and production processes already require a great amount of data to work efficiently. In turn, business and production activities generate even more data, which companies need to analyze to make their workflows, and the final results, more efficient.

All this raises a big question about data centralization, i.e. the efficient use and management of data. Having analyzed the advantages of moving your translation business to the cloud and offered a general roadmap for implementing a cloud platform, we now want to discuss the importance of centralizing and organizing data in a cloud-based translation management system.

We’re going to take a shallow integration as a starting point, i.e. moving to the cloud only the operations and applications you need right now, which can be scaled up later as needed.

Let’s start by identifying the data types available in a typical translation business: 

  • Language data (i.e. translation memories and termbases)
  • Vendor data (single-language vendors or SLVs, multi-language vendors or MLVs, freelancers)
  • Customer data
  • Translation project data

This list doesn’t reflect a specific order. We choose to start with translation memories and termbases because this is the kind of data that is present in the highest volumes within a translation business/localization department. If you prefer, you can of course scroll down to the section that is most relevant to you.

Translation memories and termbases

Translation memories (TMs) and termbases are essential for your translation projects, but they are also the least problematic kind of data to migrate.

If you’ve never worked with TMs before, you could either start from zero or do an alignment to generate a translation repository.

If you already have TMs and termbases in formats like .tmx, .csv, .xlsx, and .tbx, you can import them into your cloud translation management system (TMS) without too much pre-processing. In the case of translation memories and termbases with specific settings, the Wordbee tech team can, of course, help you with custom scripts and other input formats. In any case, we recommend you carefully set up the export tool so that you preserve as much metadata as possible.

Metadata, i.e. data about data, is information that describes the content of a resource such as a repository or a web page. The metadata of a TM can help you trace a translated segment back to a translator, a date and time, a specific project, a specific document, and whether machine translation was used. This allows 1) a translator to choose more recent material to leverage or to delete segments that may contain an outdated translation; and 2) your project managers to handle your company’s TMs effectively.
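To make this more concrete, here is a minimal sketch of how TM metadata could be read from a TMX file with Python's standard library. The `creationid` and `creationdate` attributes and the `prop` element are part of the TMX format, but the sample content, the `x-project` property name, and the `read_units` helper are assumptions made up for this illustration. Your own exports may carry different or additional fields.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

# ElementTree exposes xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical sample data, invented for this example.
SAMPLE_TMX = """<tmx version="1.4">
  <header creationtool="ExampleTool" creationtoolversion="1.0"
          segtype="sentence" o-tmf="example" adminlang="en"
          srclang="en" datatype="plaintext"/>
  <body>
    <tu creationid="translator_a" creationdate="20190312T101500Z">
      <prop type="x-project">Manual-2019</prop>
      <tuv xml:lang="en"><seg>Press the button.</seg></tuv>
      <tuv xml:lang="fr"><seg>Appuyez sur le bouton.</seg></tuv>
    </tu>
    <tu creationid="mt_engine" creationdate="20210105T080000Z">
      <prop type="x-project">Manual-2021</prop>
      <tuv xml:lang="en"><seg>Release the lever.</seg></tuv>
      <tuv xml:lang="fr"><seg>Relachez le levier.</seg></tuv>
    </tu>
  </body>
</tmx>
"""


def read_units(tmx_text):
    """Return one dict per translation unit, including its metadata."""
    root = ET.fromstring(tmx_text)
    units = []
    for tu in root.iter("tu"):
        # TMX dates use the compact UTC form YYYYMMDDThhmmssZ.
        created = datetime.strptime(
            tu.get("creationdate"), "%Y%m%dT%H%M%SZ"
        ).replace(tzinfo=timezone.utc)
        props = {p.get("type"): p.text for p in tu.findall("prop")}
        segments = {
            tuv.get(XML_LANG): tuv.findtext("seg") for tuv in tu.findall("tuv")
        }
        units.append(
            {
                "author": tu.get("creationid"),
                "created": created,
                "project": props.get("x-project"),  # assumed custom property
                "segments": segments,
            }
        )
    return units


units = read_units(SAMPLE_TMX)

# Flag segments created before a cutoff date as candidates for review.
cutoff = datetime(2020, 1, 1, tzinfo=timezone.utc)
stale = [u for u in units if u["created"] < cutoff]
```

With metadata in this shape, a project manager could list older segments per project or per author before deciding what to keep.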

One final word on TMs and termbases: set up a maintenance schedule to clean and update them periodically. Take a look at Strategies to KonMari Your Translation Memories for best practices.

Translation vendor database 

Vendors are at the core of all the processes and activities of a translation business/localization department. A well-managed, up-to-date vendor database is essential for an effective, optimized, and transparent translation/localization workflow. For more on how to develop and optimize your vendor database, take a look at A Data-Driven Approach to Translation Vendor Management.

Here are a few suggestions on which information to include in your translation vendor database, besides the usual contact details and language pairs:

  • Specialization domains (e.g. legal, software, mechanical)
  • Productivity rate (max. number of words per day vs. assignments) 
  • Pricelists and discounts 
  • Billing information 
  • Access permission to your company’s TMS 
  • Custom tags and five-star ratings 
  • NDA and other agreements signed 
  • GDPR-related clearance/disclosure 

These pieces of information are a good starting point for developing a business analytics strategy and gaining insight into the performance of your vendors. With this information you can determine the volumes vendors are turning out, evaluate response times, estimate scheduling or delivery times, and manage satisfaction ratings. For more on this, we invite you to read Transforming Translation Data into Business Analytics.
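As an illustration, the vendor fields suggested above could be modeled roughly like this. It is only a sketch: the `Vendor` class, its field names, the sample vendors, and the selection rule in `eligible_vendors` are all assumptions made for this example, not the data model of any particular TMS.

```python
from dataclasses import dataclass, field


@dataclass
class Vendor:
    name: str
    vendor_type: str          # "SLV", "MLV" or "freelancer"
    language_pairs: list      # e.g. [("en", "fr")]
    domains: list             # specialization domains
    max_words_per_day: int    # productivity rate
    rate_per_word: float      # simplified pricelist
    rating: int               # five-star rating, 1 to 5
    nda_signed: bool
    gdpr_cleared: bool
    tags: list = field(default_factory=list)


def eligible_vendors(vendors, language_pair, domain, words, days):
    """Vendors able to take the job, best rated first, then cheapest."""
    matches = [
        v
        for v in vendors
        if language_pair in v.language_pairs
        and domain in v.domains
        and v.nda_signed
        and v.gdpr_cleared
        and v.max_words_per_day * days >= words
    ]
    return sorted(matches, key=lambda v: (-v.rating, v.rate_per_word))


# Hypothetical sample records.
vendors = [
    Vendor("Vendor A", "freelancer", [("en", "fr")], ["legal"],
           2500, 0.10, 5, True, True),
    Vendor("Vendor B", "SLV", [("en", "fr")], ["legal", "software"],
           8000, 0.12, 4, True, True),
    Vendor("Vendor C", "freelancer", [("en", "fr")], ["legal"],
           3000, 0.08, 5, True, False),
]

# A 10,000-word legal job: who can deliver in 3 days, and who in 5?
three_days = eligible_vendors(vendors, ("en", "fr"), "legal", 10000, 3)
five_days = eligible_vendors(vendors, ("en", "fr"), "legal", 10000, 5)
```

In this sample, only Vendor B has the capacity for the three-day deadline, while Vendor C is excluded in both cases because the GDPR clearance is missing.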

Translation customer database 

A customer database is much more than a simple organizational support tool. What makes the difference between a good and a not so good customer database is how quickly you can find the right information. In addition to contact and financial details, a customer database should contain information about: 

  • Specific linguistic resources (TMs, termbases and other reference materials)  
  • Assigned vendors
  • Assigned account/project manager
  • Language combinations
  • Translation volumes (monthly/weekly)
  • Ratings (critical points, timely payments etc.)
  • NDA and other agreements signed
  • GDPR-related clearance/disclosure 

As with the translation vendor database, this customer data can help you gain insights into your processes and sales. Again, take a look at Transforming Translation Data into Business Analytics for more on this topic.
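Since finding the right information quickly is what matters most here, the sketch below shows one possible way to store and query customer data with SQLite from Python. The table layout, column names, and sample rows are assumptions invented for this example. A cloud TMS will have its own schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (
        id              INTEGER PRIMARY KEY,
        name            TEXT NOT NULL,
        account_manager TEXT,
        nda_signed      INTEGER NOT NULL DEFAULT 0,
        gdpr_cleared    INTEGER NOT NULL DEFAULT 0
    );
    CREATE TABLE customer_languages (
        customer_id   INTEGER NOT NULL REFERENCES customers(id),
        source_lang   TEXT NOT NULL,
        target_lang   TEXT NOT NULL,
        monthly_words INTEGER NOT NULL DEFAULT 0
    );
    -- The index keeps lookups by language combination fast as data grows.
    CREATE INDEX idx_languages
        ON customer_languages (source_lang, target_lang);
    """
)

# Hypothetical sample rows.
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?, ?, ?)",
    [(1, "Customer A", "pm_anna", 1, 1), (2, "Customer B", "pm_ben", 1, 0)],
)
conn.executemany(
    "INSERT INTO customer_languages VALUES (?, ?, ?, ?)",
    [(1, "en", "fr", 40000), (1, "en", "de", 15000), (2, "en", "fr", 5000)],
)


def customers_for_pair(conn, source, target):
    """Customers using a language combination, largest volume first."""
    rows = conn.execute(
        """
        SELECT c.name, c.account_manager, l.monthly_words
        FROM customers c
        JOIN customer_languages l ON l.customer_id = c.id
        WHERE l.source_lang = ? AND l.target_lang = ?
        ORDER BY l.monthly_words DESC
        """,
        (source, target),
    )
    return rows.fetchall()


en_fr = customers_for_pair(conn, "en", "fr")
```

Keeping language combinations in their own table means a customer can have any number of them, each with its own volume figures.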

Translation project data 

Translation project data is often neglected, but analyzing it can yield many insights. Among other things, a translation project manager’s work should include generating reports based on translation project data to monitor and improve the translation/localization processes.

During project start-up, a translation project manager should enter as much data as possible. Here are a few suggestions:

  • Start date and end date of a project 
  • Early start/late start and early finish/late finish dates of the project 
  • Project milestones (expected deliverables, the batch completion date, the batch delivery date, etc.) 
  • Total number of words 
  • Number of words for each deliverable/batch 
  • Number and type of linguistic resources working on a translation project

The more project data you define, the more aggregate data you’ll be able to collect, and the better the information you’ll have to analyze the structure and progress of your translation business/localization department. For more ideas, check An Intro to KPIs for the Translation Industry.
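To show how the project data listed above can turn into aggregate figures, here is a small sketch. The `Project` and `Batch` classes, the sample dates, and the three metrics in `project_report` are assumptions chosen for this example. Which KPIs you actually track depends on your own processes.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Batch:
    name: str
    words: int
    due: date        # planned batch delivery date
    delivered: date  # actual batch delivery date


@dataclass
class Project:
    name: str
    start: date
    end: date
    batches: list


def project_report(project):
    """Aggregate a few simple figures from the project data."""
    if not project.batches:
        raise ValueError("project has no batches to report on")
    total_words = sum(b.words for b in project.batches)
    on_time = sum(1 for b in project.batches if b.delivered <= b.due)
    # Count both the start day and the end day.
    duration_days = (project.end - project.start).days + 1
    return {
        "total_words": total_words,
        "on_time_rate": on_time / len(project.batches),
        "words_per_day": total_words / duration_days,
    }


# Hypothetical sample project.
project = Project(
    name="Manual update",
    start=date(2024, 3, 1),
    end=date(2024, 3, 10),
    batches=[
        Batch("Batch 1", 6000, date(2024, 3, 5), date(2024, 3, 5)),
        Batch("Batch 2", 4000, date(2024, 3, 10), date(2024, 3, 11)),
    ],
)

report = project_report(project)
```

For this sample, the report shows 10,000 words in total, half of the batches delivered on time, and an average of 1,000 words per calendar day.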

