Ended · Translational Research

Automatic Segmentation, Labelling, and Characterisation of Audio Streams

German title: Automatische Segmentierung und Charakterisierung von Audio Streams

View on FWF Research Radar

Principal Investigator

Name: Gerhard Widmer
Role: Project leader
ORCID: 0000-0003-3531-1282
Institution: ÖFAI - Österreichisches Forschungsinstitut für Artificial Intelligence

Grant Details

Approval Date: 1 Oct 2012
Start Date: 1 Feb 2013
End Date: 30 Jun 2017
Approved Amount: €447,716

Keywords & Classification

Keywords

Music Information Retrieval (MIR); Machine Learning; Audio and Music Classification

Research Disciplines

Computer Sciences; Electrical Engineering, Electronics, Information Engineering; Arts

Research Fields

Electrical Engineering, Electronics, Information Engineering

Project Summary

The goal of this project is to develop technologies for the automatic segmentation and interpretation of audio files and audio streams from different media: music repositories, Web and terrestrial radio streams, TV broadcasts, etc. A specific focus is on streams in which music plays an important role. The technologies to be developed should address the following tasks:

(1) automatic segmentation (with or without meta-information) of audio streams into coherent or otherwise meaningful units or segments, based on general sound or rhythm similarity or homogeneity, on specific types of content and characteristics, on repeated occurrences of subsections, etc.;
(2) automatic categorisation of such audio segments into classes, and the association of segments and classes with meta-data derived from various sources (including the Web);
(3) automatic characterisation of audio segments and sound objects in terms of concepts intuitively understandable to humans.

To this end, we plan to develop, improve, and optimise computational methods that analyse audio streams, identify specific kinds of audio content (e.g., music, singing, speech, applause, commercials), detect boundaries and transitions between songs, and classify musical and other segments into appropriate categories; that combine information from various sources (the audio signal itself, databases, the Internet) in order to refine the segmentation and gain meta-information; that automatically discover and optimise audio features that improve segmentation and classification; and that learn, via machine learning, to derive comprehensible descriptions of audio content from such features.

The research is motivated by a large class of challenging applications in the media world that require efficient and robust audio segmentation and classification. Application scenarios include audio streaming services and Web stream analysis, automatic media monitoring, content- and descriptor-based search in large multimedia (audio) databases, and artistic applications. The strong and concrete demand for such methods is documented, among other things, by the pledges of several media companies to support this project with large amounts of real-world data and valuable meta-information.

Research Outputs (0)

No outputs linked

Further Funding (0)