Tools for searching, annotation and analysis of speech, music, film and video - A survey

Alan Marsden*, Adrian Mackenzie, Adam Lindsay, Harriet Nock, John Coleman, Greg Kochanski

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

14 Citations (Scopus)

Abstract

This article examines the actual and potential use of software tools in research in the arts and humanities focusing on audiovisual (AV) materials such as recorded speech, music, video and film. The quantity of such materials available to researchers is massive and rapidly expanding. Researchers need to locate the material of interest in the vast quantity available, and to organize and process the material once collected. Locating and organizing often depend on metadata and tags to describe the actual content, but standards for metadata for AV materials are not widely adopted. Content-based search is becoming possible for speech, but is still beyond the horizon for music, and even more distant for video. Copyright protection hampers research with AV materials, and Digital Rights Management (DRM) systems threaten to prevent research altogether. Once material has been located and accessed, much research proceeds by annotation, for which many tools exist. Many researchers make some kind of transcription of materials, and would value tools to automate this process. Such tools exist for speech, though with important limits to their accuracy and applicability. For music and video, researchers can make use of visualizations. A better understanding (in general terms) by researchers of the processes carried out by computer software and of the limitations of its results would lead to more effective use of Information and Communications Technology (ICT).

Original language: English
Pages (from-to): 469-488
Number of pages: 20
Journal: Literary and Linguistic Computing
Volume: 22
Issue number: 4
Publication status: Published - Nov 2007
Externally published: Yes
