Transcription: Simple Task or Mad Challenge?

One of the services that localization agencies offer is transcription, which consists of converting audio into written text. The source can be a simple audio file, such as WAV or MP3, or a video file, such as AVI or MP4. The transcriptions can then be used for translation, voice recordings, or as evidence in a trial, just to name a few.

In the best case, one or more people are heard speaking clearly and in the same language, one at a time and in a relaxed manner, which makes the task easier for the transcriptionist. That would be ideal. Transcriptions generally include time stamps that indicate the precise moment at which each thing is said. They also usually include notes on what is happening or can be heard in the background, such as laughter, coughing, or background noise.

A typical transcription looks like this:


Interviewer: Good morning, my name is Robert and I will be directing this interview.


Interviewee: Hi Robert, my name is Carlos, it’s a pleasure to meet you. I’m a little nervous, I admit [laughing].


Interviewer: [smiling] Don’t worry, this conversation is a simple formality, we have already seen your CV and we are aware of your experience and skills.

Up to this point, this transcription is an example of audio that is easy to understand and follow. But, as in life, complications tend to appear, as outlined below:

  1. Background noise: This can be the sound of traffic, sirens, other people talking or screaming in the background, strong wind, or any other sound that competes with the leading voices in the audio. It slows down the transcription. In extreme cases, when a part of the audio is impossible to understand, it is marked with [inaudible] or [unintelligible].
  2. Voices in other languages: Often, people who speak other languages take part in the conversation. For example, an interview or report may be conducted in one language with someone who speaks another, with an interpreter added to interpret for the interviewee. These types of transcriptions require transcriptionists who speak those languages.
  3. Strong accents or slang: Even when the audio is in a single language, the speaker's accent or use of slang can be a real challenge for the transcriptionist. The more neutral the accent and vocabulary, the better.
  4. Speed of conversation or voice volume: It's simple: the faster the speaker talks, the more difficult the transcription. The same happens when the speaker's voice is very quiet, which makes it difficult to understand.

At Trusted Translations, we are aware of the problems that may arise with a transcription. That is why we work with highly experienced transcriptionists with a great ear, who complete these projects in a highly professional manner.