Transcription Tools

Dozens of hours of interviews were recorded. Analysis begins with tagging phrases in the transcripts. The preferred tool for tagging (‘coding’) is NVivo, and at first it seemed sensible to do the transcription within NVivo. But I soon changed my mind.

NVivo 8

http://www.qsrinternational.com/products_nvivo.aspx
The obvious advantage of using NVivo is that the transcript will be created within the project file, and structured suitably. Playback can be controlled with Function keys: I just hit the F7 key to pause or resume playback and other keys to skip, accelerate or slow it. The three typing/playback modes deserve some attention: I found it easiest to use Transcribe mode for the first draft, using F7 to pause/resume and F8 at the end of each row, then to audit the whole recording in a playback mode to find errors, then to use the looping synchronised mode to concentrate on specific lines.

However, at 50% speed, extra noises appear that make the audio difficult to understand. There is no internal spell-check, so every transcript must be exported, checked, then reimported. And the export does not go smoothly for me. First, the export launches Word, but fails to open the document. Then, once the document is manually opened, the notorious Word 2007 page-orientation bug strikes: despite changing every margin and page layout setting in the document, the Word Normal template, and the printer settings, the document has page breaks suited to Portrait layout but prints only the top ⅔ of each page. Further, the text is presented as a table using a non-standard text Style. To remove the ‘coding’ highlight, I found it necessary to change the Style of the whole document.

The final blow is its dependency on Microsoft SQL Server 2005. After NVivo crashes (about once a day), the residual SQL working files prevent NVivo from launching until I manually stop the SQL service, delete the residual files, and restart the SQL service (and sometimes Windows, too).

For these reasons it was clearly worth trying another transcription environment.

ExpressScribe

http://www.nch.com.au/scribe/index.html
The most widely advertised product has a free version that is adequate, and fits in a family of related, useful tools. It works, and it implements a worthwhile workflow concept, but it is badly behaved adware, installing a raft of other NCH applications which cannot be uninstalled.

By silently installing unwanted, hard-to-remove components, it behaves like a trojan. This should not be tolerated.

F5

http://www.audiotranskription.de/english/f5 for Mac,
http://www.audiotranskription.de/english/f4-v42 for Windows.

This tool is focused on one task, and does it superbly. Playback even at 50% speed is the clearest I’ve heard. As-you-type spell-check uses the familiar red squiggle. Press F6 to pause, F5 to play, resuming a few seconds before the pause point, or Command+Arrow to skip. In Preferences I defined a few hotkeys for common phrases such as [incomprehensible].

Finishing is easy enough. F5 saves in RTF format by default, or in Word formats. I open the document in Word, set the Title in Document Properties, convert the text to a table split on colons, delete the columns containing time marks, resize the columns, hide the borders, insert page numbers and insert the Title in a page header.
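The colon-splitting and time-mark clean-up could also be done before the text reaches Word. What follows is only a minimal sketch in Python, not part of the workflow I actually used. It assumes each exported line looks like "Speaker: text #00:00:05-2#"; that time-mark shape, the function name transcript_rows and the sample text are all assumptions for illustration, so check them against a real export first.

```python
import re

# Assumed time-mark shape, e.g. "#00:01:23-4#"; adjust to match the real export.
TIME_MARK = re.compile(r"#\d{2}:\d{2}:\d{2}-\d#")


def transcript_rows(text):
    """Return (speaker, utterance) pairs with time marks removed."""
    rows = []
    for line in text.splitlines():
        cleaned = TIME_MARK.sub("", line).strip()
        if not cleaned:
            continue  # skip blank lines
        # Split on the first colon only, so colons inside the speech survive.
        speaker, sep, utterance = cleaned.partition(":")
        if sep:
            rows.append((speaker.strip(), utterance.strip()))
        else:
            # No speaker label found: keep the text with an empty speaker cell.
            rows.append(("", cleaned))
    return rows


# Hypothetical sample in the assumed format.
sample = (
    "Interviewer: How did it go? #00:00:05-2#\n"
    "\n"
    "Participant: Well: mostly fine. #00:00:09-7#\n"
)
rows = transcript_rows(sample)
for speaker, utterance in rows:
    # Tab-separated output converts cleanly to a two-column table in Word.
    print(f"{speaker}\t{utterance}")
```

Splitting on the first colon only avoids the extra columns that appear when a time mark or the speech itself contains colons, which is why the manual route needs the "delete the columns" step.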

Voice Recognition

http://www.macspeech.com/pages.php?pID=181 for Mac,
http://www.voicerecognition.com.au/dragon-naturally-speaking-11/dragon-home.htm for Windows
These products are advertised as suitable for a single voice only at this stage.

Audio recordings

http://uk.yamaha.com/en/products/music-production/portable_recorders/pocketrak_2g/
A Yamaha Pocketrak 2G delivered beautiful recordings of live interviews (as MP3 files), but Telstra-recorded conference calls (as WAV files) were a surprisingly good second-best. iPhone recordings of face-to-face conversation in good conditions were adequate; recordings of speakerphone calls suffered very uneven quality.

At the time of the interviews, iPhone, Skype and GoogleVoice did not themselves record calls. SkypeRecorder was successful in tests but not easy or convenient for the interviewer. Hardware adapters were not tested.

Current Solution

Transcription process
