Month: September 2018

International Translation Day: In Praise of Data Expertise for Translation Scholars

By Katie King

Seattle, September 30, 2018

What percentage of humanists love data? It’s probably unmeasurable. But there is growing interest among scholars in translation technology, which has transformed the practice and theorization of translation (Munday 275). That is because we live in a digital world and number-crunching is encroaching on translation scholarship. You’ll recognize the terms: crowd sourcing, fan subbing, audio/visual or multimedia translation, machine translation, corpus-based translation studies. Through a number of platforms and tools, technology has become both more sophisticated and more accessible to the “lay” user, and more humanists are embracing it as an enhancement to their research. You don’t have to be a computer scientist to do it.

For International Translation Day, I’d like to praise the use of data analysis and technology as part of every translation scholar’s toolkit. My argument is that translators, along with all humanist scholars, need training, guidance and expert support in using data and technology in research. This is a multidisciplinary approach to translator training that I support, and one that I will continue to argue universities should provide.

The data display on this website is a prime example. The data is made available by the University of Rochester’s (UR) Three Percent Database, a unique repository of information on global literature translated into English in the U.S. over the last decade. From that database, I extract the data on Spain, review and correct any problems with it, then make it easily searchable, with a visual display that is easy to understand and use, for others who are interested in Spain’s literature in translation. I don’t have the skills to create a public-facing searchable database, which is key to my research. Hiring an expert is costly. Fortunately, a family friend is helping out: the brilliant Scottish programmer Tom Cranstoun, who donated a bit of his time to write what he called “simple” code on WordPress to create the clean and beautiful data display you see. Even with Tom’s help, I still needed to hone my skills in manipulating spreadsheets, sorting dirty or missing data, and tweaking code in WordPress and Google Sheets. Chad Post, the creator and curator of the Three Percent Database, tells me that as far as he knows, I’m the only researcher making use of the data in this way. Both Chad and I are crowd-sourcing feedback on the data: we want users to comment on, correct, and add to the database. This is an exercise in public scholarship, a contribution to everyone’s understanding of how literary translation affects how a nation is perceived.
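For readers curious what the "extract, review and correct" step can look like in practice, here is a minimal Python sketch. It is not the code behind this site's display. The column names (Title, Author, Publisher, Country, Genre) and the sample rows are assumptions invented for illustration, standing in for a spreadsheet exported as CSV.

```python
import csv
import io

# Hypothetical sample standing in for a spreadsheet export. The column
# names and rows are invented for illustration, not taken from the database.
SAMPLE_CSV = """Title,Author,Publisher,Country,Genre
Example Novel A,Author One,Press X,Spain,Fiction
Example Poems B,,Press Y,Spain,Poetry
Example Novel C,Author Three,Press Z,France,Fiction
Example Novel D,Author Four,,spain ,Fiction
"""

def load_rows(text):
    """Parse CSV text into a list of dicts, trimming stray whitespace."""
    reader = csv.DictReader(io.StringIO(text))
    return [{k: (v or "").strip() for k, v in row.items()} for row in reader]

def filter_country(rows, country):
    """Keep rows whose Country matches, ignoring upper/lower case."""
    return [r for r in rows if r["Country"].lower() == country.lower()]

def rows_missing_data(rows, required):
    """Return (row, missing_fields) pairs for rows with empty required fields."""
    flagged = []
    for r in rows:
        missing = [f for f in required if not r.get(f)]
        if missing:
            flagged.append((r, missing))
    return flagged

rows = load_rows(SAMPLE_CSV)
spain = filter_country(rows, "Spain")
flagged = rows_missing_data(spain, ["Title", "Author", "Publisher"])

# List the entries that need a manual check before they are published.
for row, missing in flagged:
    print(row["Title"], "is missing:", ", ".join(missing))
```

The sketch only flags incomplete rows rather than filling them in, since deciding what the missing author or publisher should be is a research judgment, not something code can settle.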

As I described in my article “Why Literary Translation Data Matters,” posted earlier this year, the Three Percent Database, created and nurtured over ten years by publisher Chad Post, is unique. No one else is compiling literary translation data in this consistent, high-quality way, albeit with limited parameters: only fiction and poetry titles published after January 2008, only works that have never before been translated into English, and only titles that are available for sale through traditional means in the U.S. Non-fiction and children’s titles have recently been added.

Students of the UR’s translation studies program (the MA in Literary Translation, or MALT), as well as undergraduate students working toward a certificate in Literary Translation, are given the opportunity to work on the database and learn by doing. The MA emphasizes translation “as an artistic, technical and commercial enterprise,” and requires an internship with a publisher that can include working with the Three Percent Database.

In addition to training on the basics of data use and analysis, students of translation scholarship should be exposed to, and learn, tools that are increasingly in use by many humanists. The data visualization tool Tableau is a good example, and the University of Washington offers some training and support for students who want to learn how to use it, including coaching by UW digital librarians.

Some translation studies programs offer courses in becoming a Wikipedia contributing translator to develop students’ understanding of crowd sourcing (Dolmaya). Other translation scholars are partnering with programming experts to develop their own machine translation programs for corpora research (Toral). Google itself offers a platform for developers who wish to create their own machine translation program (Li).

These practices are not “instead of” traditional research but “in addition to” traditional research. They are an important part of what Munday calls the “interdiscipline” of translation.

Highlights of Three Percent data on Spain’s literature in translation:

  • Last month, in honor of Women in Translation Month, I used the data to write about women authors in Spain being under-represented in translation to English. This is important to know, but the numbers don’t explain why. This data will hopefully spur additional research.
  • One of the most prolific publishers of titles from Spain (34 in the last 10 years) is Small Stations Press, which is run by two people out of Bulgaria and translates almost exclusively from the Galician language. This is extraordinary on the face of it. In the world of modern publishing, two people can be as productive in niche areas as large publishers. But does output equal impact? More research is needed.
  • The second most prolific publisher of Spain’s literature in translation, Hispabooks (33 titles), has closed operations. After five years of publishing, and much praise for the quality of their work, financial constraints shut them down. I will be tracking overall Spain title numbers to understand how much the success, or failure, of one publisher can affect the output of an entire nation on the world literary scene in English.
  • AmazonCrossing, the online retail giant Amazon’s fiction-in-translation imprint, has had a huge impact on Spain’s access to the English-language market, producing 35 Spanish fiction titles translated into English in the last decade, the most of any publisher listed in the Database. Amazon’s aim is to select potential best sellers, but they also work to publish “worthy” titles in the mix (Page-Fort). Their production from Spain ranges from the literary (Juan Valera’s 19th century novel Doña Luz) to the very popular, such as the “Apocalypse Z” series of zombie novels by Manel Loureiro. AmazonCrossing also achieves gender parity in their selection of authors, choosing as many women as men. The question here is whether the promotional platform Amazon provides results in greater sales, and whether greater sales result in greater impact. In other words, do young readers of Spain’s fiction in English think first of zombies or Don Quixote? What are the implications?

Context and interrogation are important in data analysis. Here are some cautions about the Three Percent numbers to be considered when evaluating the data. The updates are irregular. Chad Post updates when he can. He’s a one-man operation. In 2016, he added 727 titles to the database across 12 data points provided by the publishers. Sometimes the publishers don’t provide all the data. Currently, Chad has a backlog of 300 children’s books to input (Post). Mistakes and delays happen, and Chad calls on the users themselves for help in identifying and correcting them. The Publishers Weekly display of the data has a form to report errors and to add missing data. On this website, I ask readers to submit corrections as comments. But researchers should always double check the data as they use it. Non-fiction and children’s books are new additions to Three Percent, so overall numbers from individual countries over time could look artificially inflated if those two categories aren’t removed for a comparison by genre.
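The last caution, about newly added categories inflating totals, can be illustrated with a short Python sketch. The records, years, and genre labels below are invented for illustration and are not figures from the Three Percent Database. The point is only that counting titles per year with and without the new categories gives different trends.

```python
from collections import Counter

# Hypothetical records as (year, genre) pairs. Values are invented for
# illustration and do not come from the database.
RECORDS = [
    (2015, "Fiction"), (2015, "Poetry"),
    (2017, "Fiction"), (2017, "Nonfiction"),
    (2017, "Children's"), (2017, "Poetry"),
]

# Categories that were only recently added, so earlier years have none.
NEW_CATEGORIES = {"Nonfiction", "Children's"}

def titles_per_year(records, exclude=frozenset()):
    """Count titles per year, skipping any genre listed in `exclude`."""
    counts = Counter(year for year, genre in records if genre not in exclude)
    return dict(sorted(counts.items()))

raw = titles_per_year(RECORDS)
comparable = titles_per_year(RECORDS, exclude=NEW_CATEGORIES)

print(raw)         # {2015: 2, 2017: 4}
print(comparable)  # {2015: 2, 2017: 2}
```

In this made-up sample, the raw count suggests output doubled between the two years, while the like-for-like count shows no change at all.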

Despite these challenges, the Three Percent Database is a unique and praiseworthy enterprise, both as a research project and a training platform for translators and translation scholars at the University of Rochester. It is cited by European scholars who are crying out for an EU-based version (Donahaye). Perhaps the answer is that data doesn’t necessarily provide answers, but breaks ground for new questions and research opportunities across many areas of scholarship. Translation scholar José Lambert “defines translation as a key and methodological tool for interdisciplinary research on society” (Bartrina xiii).

Works Cited

Bartrina, Francesca and Carmen Millán. The Routledge Handbook of Translation Studies. Taylor and Francis, 2013.

Dolmaya, Julie McDonough M. “Analyzing the Crowdsourcing Model and Its Impact on Public Perceptions of Translation.” Translator 18.2 (2012): pp. 167-91. Web.

Donahaye, Jasmine. Three percent? Publishing Data and Statistics on Translated Literature in the United Kingdom and Ireland. Making Literature Travel. Swansea, Wales. 2012. Mercator Institute Media, Languages, and Culture. Aberystwyth University, Wales UK.

Li, Fei-Fei. “Empowering Businesses and Developers to Do More with AI.” The Keyword, Google. Web. Accessed Sept. 30, 2018.

Munday, Jeremy. Introducing Translation Studies: Theories and Applications. Fifth ed., Routledge, 2016.

Page-Fort, Gabriella. Lecture at the UW Simpson Center for the Humanities. May 23, 2016.

Post, Chad. Phone interview with the author, September 26, 2018.

Toral, Antonio, and Andy Way. What Level of Quality Can Neural Machine Translation Attain on Literary Text? 2018.

