MateCat: a Flavor of Italian Renaissance or Simply Sliced Bread?


This article was originally published in the TAUS Blog, on November 26, 2014.

The word Renaissance evokes different images in each of us. To me, as an Italian, it evokes great geniuses and personalities, like Leonardo Da Vinci and many other polymaths. Renaissance also conjures up the transition from a gloomy and stale culture to the resurgence of learning and the awakening of intellectual and artistic pursuits. Maybe polymaths are the reason why many English native speakers use Renaissance as an attribute for an eclectic persona with different skills and talents.

Although many claim that translation is the second oldest profession in the history of the world, the so-called translation industry is quite young: we can trace its birth to the beginning of the 20th century and its major expansion to the years immediately after WWII.

However, unlike most practitioners and academics who seem to be stuck in the past, the translation industry has been evolving hand in hand with technological innovation. It is a necessary survival strategy: in a world of increasing competition and ongoing changes, industry players have turned to technology to meet the ever-changing demands of translation buyers and to exploit unprecedented opportunities for growth.

There is still a lot of talk and expectations in some corners of the translation community about when and where the long-awaited technological disruption is going to come from. The reality is that the technological disruption took place 10 years ago: its name is Google Translate.

Some translation automation companies recognized the early signs of this disruption and acted accordingly by jumping on the translation automation wagon. Their first step was the integration of machine translation technologies in their workflows and environments. The second step was beaming the translation environments up to the cloud.

Nowadays machine translation is a must-have.

One of the latest (but not least) examples of this trend is MateCat, whose latest beta version was officially launched on October 28, 2014, in Vancouver, on the eve of Localization World. The launch rounded off the 3-year research project partially funded by the European Union’s Seventh Framework Programme (for some 2.7 million euros). FBK (Fondazione Bruno Kessler) led the project consortium, which included the University of Edinburgh and Université du Maine.

We interviewed Marco Trombetti, CEO and cofounder of, who gave us a vivid sketch of his ‘big dream’. You can read the interview here.

The company and the project

Founded in 1999 and with some 48,000 clients, the company has always combined translation services and technological development.

Trombetti himself seems to be a Renaissance man. With a degree in physics, he leads not only one of the first internet-based translation companies but also Memopal, a provider of online backup services, and he invests in Wanderio – a web application that helps you plan your travel from your doorstep to your final destination – and in a number of other startups. And in his LinkedIn profile we read that he can solve a Rubik’s cube in less than 3 minutes.

The MateCat project – which won the EU’s approval on the very first evaluation – is meant to provide the much-needed integration of machine translation and human translation into a single tool. “We started the project not because of the funding that was offered but because we wanted to build a web-based CAT tool around MyMemory and Moses,” Trombetti explains.

One of the company’s major projects, MyMemory is the largest publicly available translation memory, and it is 100% free. Today the repository counts around 8-10 billion words (without pivoting).

According to Trombetti, the MyMemory plugin is the most downloaded by Trados users. Nevertheless, translators’ complaints about the sometimes insufficient quality of machine translation suggestions had to be taken into account. “We tried to integrate collaborative TM and MT in an existing CAT tool with an API, but the results were not satisfactory.”

How clean and reliable is the data in MyMemory? “As in every collaborative TM platform there are errors, but the error rate is getting smaller. Like in the case of Wikipedia, the more contributions, the higher the quality. Five years ago we had a precision percentage of 93%, today it’s 96-97%.”

The technology

Based on two main technologies — SMT and collaborative TMs — MateCat runs as a web-server connected to the MyMemory TM server, to the commercial Google Translate (GT) and Microsoft MT servers, and to a number of Moses-based MT engines.

MateCat is available for free as SaaS under an LGPL license. There is also a server version that does not include the filters for file conversion and supports only the XLIFF format.

The work of the technological partners (FBK, University of Edinburgh and Université du Maine) was focused on improving the tag management and leveraging the adaptive MT technology: the SMT engine is quickly retrained based on the corrections and feedback of users/post-editors. “MateCat gives the translators only the positive parts of MT. As soon as you translate, MateCat collects the edit distance and the time-to-edit score. It statistically analyzes the data to decide how much MT is helping. If the MT engine gives good results, it will be ranked very high. But if you’re editing more than 45% of a sentence, MT is clearly adding no value, so it won’t be suggested as a primary option.”
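The 45% rule of thumb in the quote can be illustrated with a small sketch. The article does not say how MateCat actually computes edit distance, so everything here is an assumption: a word-level Levenshtein distance, normalized by the longer of the two sentences, and the hypothetical names `edit_ratio` and `mt_adds_value`.

```python
# Illustrative sketch only: not MateCat's actual algorithm.
# Assumption: edit effort = word-level Levenshtein distance divided by
# the length (in words) of the longer sentence.

EDIT_THRESHOLD = 0.45  # the "45% of a sentence" figure quoted in the article


def levenshtein(a, b):
    """Edit distance between two sequences (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def edit_ratio(suggestion, post_edited):
    """Fraction of the sentence that was changed, between 0.0 and 1.0."""
    s, p = suggestion.split(), post_edited.split()
    longest = max(len(s), len(p))
    if longest == 0:
        return 0.0
    return levenshtein(s, p) / longest


def mt_adds_value(suggestion, post_edited, threshold=EDIT_THRESHOLD):
    """True if the translator changed no more than `threshold` of the suggestion."""
    return edit_ratio(suggestion, post_edited) <= threshold
```

Under these assumptions, a suggestion that needs one word changed out of six counts as helpful, while one that is rewritten entirely does not. A real system would presumably average such scores over many segments, and combine them with time-to-edit, before ranking an engine.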

Simplicity was also essential to the design. A zero learning curve is guaranteed: “Anyone can drag and drop a document and start translating.”

Within the company, MateCat has replaced the legacy systems for quite some time now, and some 3000 users have already registered with approx. 50 million words translated.

The post-editing tool

As a post-editing tool, MateCat collects timing information for each segment about the generated suggestions and the ones that have actually been post-edited, as well as the average translation speed, the post-editing effort and the percentage of suggestions coming from MT or the TM.
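As a rough illustration of how such per-segment data could be rolled up into the figures listed above, here is a minimal sketch. The record layout (`SegmentLog`) and the `summarize` function are hypothetical and are not taken from MateCat.

```python
# Illustrative sketch only: field names and aggregation are assumptions.
from dataclasses import dataclass


@dataclass
class SegmentLog:
    words: int         # words in the segment
    seconds: float     # time spent editing the segment
    origin: str        # where the suggestion came from: "MT" or "TM"
    edit_ratio: float  # share of the suggestion that was changed (0.0 to 1.0)


def summarize(logs):
    """Aggregate per-segment logs into job-level post-editing statistics."""
    if not logs:
        return {"words_per_hour": 0.0, "avg_edit_ratio": 0.0, "mt_share": 0.0}
    total_words = sum(s.words for s in logs)
    total_seconds = sum(s.seconds for s in logs)
    speed = total_words * 3600 / total_seconds if total_seconds else 0.0
    return {
        "words_per_hour": speed,
        "avg_edit_ratio": sum(s.edit_ratio for s in logs) / len(logs),
        "mt_share": sum(1 for s in logs if s.origin == "MT") / len(logs),
    }
```

For example, two segments of 10 and 20 words edited in 30 and 60 seconds give a speed of 1,200 words per hour; if one suggestion came from MT and the other from the TM, the MT share is 50%.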

The business model

In addition to simplicity and user-friendliness, the real distinguishing factor seems to hide in the proposed business model.

If a translator with a specific language pair is not at hand, the translation project can be outsourced with one click to one of the translators registered in the database. At that point, the company will charge a commission of 15-25% on the rate of the translation vendor offering the outsourcing service.

How translators are chosen remains an issue. The choice could be left to the LSPs or it could be based on T-Rank™, the automatic ranking system developed by the company. T-Rank™ analyzes various criteria (like quality, timeliness, and daily turnaround) related to translators and the translation job at hand. The value obtained from this analysis indicates how closely each translator matches the perfect translator for the job.

Trombetti states that “MateCat acts like a lead generation tool, or better yet, a disintermediation tool that links translation service providers to the translation demand.”

Monetizing software licenses is not an interesting option. In Trombetti’s words, “according to our estimates, within the industry 250 million US dollars are spent in MT licenses and approximately the same amount for CAT tool licenses. Peanuts. Something so small is not worth pursuing.”

Even more interesting is the comparison he makes with Google and advertising. “The advertising industry was extremely fragmented. Through its search engine, Google was able to funnel the traffic and build a marketplace for ads.”

So where is the opportunity behind MateCat for the company? “After a growth of 30% in 2 years, we want to create the next big thing, the billion dollar thing. Direct customers are difficult to reach. Thirty-seven billion in translation orders go to LSPs. With a CAT tool free for translators and LSPs, we can funnel the translation demand. We make it easy for LSPs to outsource the parts of a translation project that they for some reason cannot handle, which is approximately 10-20% of their total orders.”

The ‘big dream’

“Our dream: We would like to have 10,000 of the 30,000 translation companies using MateCat in the next 3 years. These are the companies that need to outsource peaks and non-supported languages. The result could be 1 billion dollars in translation services delivered by MateCat.”

In any case, Trombetti says, MateCat is already financially viable. With its integration in the company’s production workflow and thanks to the increased leverage, the tool has brought an increase of half a million euros in gross margins.

What has sliced bread got to do with all this?

Although apparently audacious, the MateCat business model brings to mind the movie Working Girl, in which a character named Jack Trainer (played by Harrison Ford) says to Tess McGill (Melanie Griffith): “Oren Trask? The man who said, ‘What if we sliced the bread before we sold it?’”

Or, if you prefer another comparison, think of Gillette’s razor and blade marketing strategy, with MateCat being the razor and the commission-based outsourcing capability and filters for file formats being the blade and the shaving cream.

It’s not really clear how and why this tool should help disintermediation either. Is the company willing to take care of the marketing? After the interview, Alessandro Cattelan (MateCat Product Manager) clarified: “MateCat is a CAT tool and outsourcing platform. LSPs will still need to market their services to attract customers, just like they are doing today.”

And are freelance translators (as SLVs) expected to pay a commission fee for the projects assigned to them? Cattelan again adjusts the aim: “I think it is important to clarify that translators and SLVs are not expected to pay any commission fees. MateCat is just a CAT tool and does not interfere with the way LSPs and translators/SLVs co-operate nowadays.”

As it stands now, MateCat could end up being just another marketplace like, but with a much sharper technological edge. In any case, like it or not, the one factor that can make a real difference within a business plan is always the same: the people running it.

