Automatic Segmentation and Characterisation of Audio Streams
The goal of this project is to develop technologies for the automatic segmentation and interpretation of audio files and audio streams deriving from different media worlds: music repositories, (Web and terrestrial) radio streams, TV broadcasts, etc. A specific focus is on streams in which music plays an important role. Specifically, the technologies to be developed should address the following tasks: (1) automatic segmentation (with or without meta-information) of audio streams into coherent or otherwise meaningful units or segments (based on general sound or rhythm similarity or homogeneity, on specific types of content and characteristics, on repeated occurrences of subsections, etc.); (2) the automatic categorisation of such audio segments into classes, and the association of segments and classes with meta-data derived from various sources (including the Web); (3) the automatic characterisation of audio segments and sound objects in terms of concepts intuitively understandable to humans.
To this end, we plan to develop and/or improve and optimise computational methods that analyse audio streams, identify specific kinds of audio content (e.g., music, singing, speech, applause, commercials, ...), detect boundaries and transitions between songs, and classify musical and other segments into appropriate categories; that combine information from various sources (the audio signal itself, databases, the Internet) in order to refine the segmentation and gain meta-information; that automatically discover and optimise audio features that improve segmentation and classification; and that learn to derive comprehensible descriptions of audio contents from such audio features (via machine learning).
The research is motivated by a large class of challenging applications in the media world that require efficient and robust audio segmentation and classification.
Application scenarios include audio streaming services and Web stream analysis, automatic media monitoring, content- and descriptor-based search in large multimedia (audio) databases, and artistic applications. The strong and concrete demand for such methods is documented, among other things, by pledges from several media companies to support this project with large amounts of real-world data and valuable meta-information.
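As an illustration only, task (1), segmentation based on sound homogeneity, can be sketched with a common textbook approach: a checkerboard kernel is slid along the diagonal of a frame-wise self-similarity matrix, and the strongest response is taken as a segment boundary. This is not the project's method, which the description does not specify. The function names, the frame and kernel sizes, and the synthetic two-tone test signal are all assumptions made for this sketch.

```python
import numpy as np

def frame_spectra(signal, frame_len=512, hop=256):
    """Magnitude spectra of Hann-windowed frames, one row per frame."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    window = np.hanning(frame_len)
    frames = np.stack([signal[i * hop : i * hop + frame_len] * window
                       for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1))

def novelty_curve(features, half_width=8):
    """Checkerboard-kernel response along the diagonal of the cosine
    self-similarity matrix; high values suggest a change in content."""
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.maximum(norms, 1e-12)
    sim = unit @ unit.T
    sign = np.r_[-np.ones(half_width), np.ones(half_width)]
    kernel = np.outer(sign, sign)  # +1 on within-segment blocks, -1 on cross blocks
    novelty = np.zeros(len(features))
    for t in range(half_width, len(features) - half_width):
        block = sim[t - half_width : t + half_width,
                    t - half_width : t + half_width]
        novelty[t] = np.sum(kernel * block)
    return novelty

def strongest_boundary(signal, sample_rate, frame_len=512, hop=256):
    """Time in seconds of the frame with the highest novelty value."""
    features = frame_spectra(signal, frame_len, hop)
    novelty = novelty_curve(features)
    return int(np.argmax(novelty)) * hop / sample_rate

# Hypothetical test signal: one second at 440 Hz followed by one second at 1760 Hz.
sr = 8000
t = np.arange(sr) / sr
signal = np.concatenate([np.sin(2 * np.pi * 440 * t),
                         np.sin(2 * np.pi * 1760 * t)])
boundary = strongest_boundary(signal, sr)
print(f"estimated boundary at {boundary:.2f} s")
```

On this synthetic signal the estimate should land close to the one-second mark where the tone changes. Real broadcast material would need more robust features and peak picking than this sketch provides.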