British Universities Film & Video Council

moving image and sound, knowledge and access

Intelligent Speech

As part of its commitment to delivering a multimedia research environment, the British Library is exploring the potential of speech-to-text technologies for finding non-textual content. Luke McKernan, Lead Curator, News and Moving Image at the British Library, brings us up to date with its efforts.

About the author: Luke McKernan is Lead Curator, News and Moving Image at the British Library. He is the author of: Charles Urban: Pioneering the Non-Fiction Film in Britain and America, 1897-1925 (University of Exeter Press, 2013); Shakespeare on Film, Television and Radio: The Researcher’s Guide (BUFVC, 2009), co-editor with Eve-Marie Oesterlen and Olwen Terris; Moving Image Knowledge and Access: The BUFVC Handbook (BUFVC, 2007), co-editor with Cathy Grant; Yesterday’s News: The British Cinema Newsreel Reader (BUFVC, 2002), editor; A Yank in Britain: The Lost Memoirs of Charles Urban, Film Pioneer (The Projection Box, 1999), editor; Who’s Who of Victorian Cinema: A Worldwide Survey (BFI, 1996), co-editor with Stephen Herbert; Walking Shadows: Shakespeare in the National Film and Television Archive (BFI, 1994), co-editor with Olwen Terris.

The British Library has over one million speech-based recordings. These include radio broadcasts, oral history recordings, interviews, speeches and television news programmes. Making these discoverable by researchers has traditionally depended on catalogue records, sometimes supplemented by content summaries and searchable transcriptions. Catalogue records can be relatively quick to create but usually provide only rudimentary information about the content of the recording; summaries and transcriptions are enormously time-consuming to produce, though of course hugely valuable.

… speech-to-text technologies have reached an exciting stage where they are close to becoming adopted widely for large-scale operations

This is an increasing problem for research in a digital age. Full-text searching of electronic and digitised print has revolutionised what we can discover, but the digital research environment is not limited to the printed word. Television programmes, films, radio broadcasts, music recordings, images, maps, data sets and anything else that holds knowledge and can be expressed in digital form are ripe for discovery, and of course we do discover these on our various databases. But how level is the playing field? We search with words, and receive results largely determined by the word content of the digital object. An object that has fewer words attached to it has less chance of being found in an integrated, multimedia research environment.

At the British Library we are committed to delivering a multimedia research environment, where non-textual forms have the same chance of being discovered as content in books, journals and newspapers. Fortunately, much exciting and innovative work is going on that aims to uncover the rich intelligence embedded in digital objects through image searching, face recognition, video analysis and much more. Focussing on those one million speech recordings whose detailed content currently lies hidden from researchers, we have become particularly interested in the potential of speech-to-text technologies.

