Mid-level features for audio chord recognition using a deep neural network. (English) Zbl 1353.68232
Summary: Deep neural networks composed of several pre-trained layers have been successfully applied to various audio processing tasks. This paper proposes and examines several deep neural network configurations (including deep recurrent networks) that can be pre-trained with stacked denoising autoencoders, and applies them to feature extraction for the audio chord recognition task. The features that such a network extracts from an audio spectrogram can be used in place of conventional chroma features to recognize the chords in an audio recording. The chord recognition quality achieved with the proposed features is compared with that achieved with conventional chroma features, which do not rely on any machine learning technique.
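
For orientation, the conventional chroma baseline mentioned in the summary can be illustrated roughly as follows. This is a minimal sketch and not the paper's implementation: the summary does not say which chroma variant was used, so the function name `chroma_from_spectrum`, the nearest-semitone bin mapping, the A4 = 440 Hz reference, and the sum-to-one normalization are all assumptions made for illustration. The sketch folds the magnitude spectrum of one audio frame into 12 pitch-class bins without any learned parameters.

```python
import numpy as np

def chroma_from_spectrum(magnitudes, sample_rate, n_fft, ref_a4=440.0):
    """Fold one magnitude-spectrum frame into a 12-bin pitch-class vector.

    Hypothetical helper for illustration; the mapping and normalization
    are assumptions, not taken from the paper.
    """
    magnitudes = np.asarray(magnitudes, dtype=float)
    # Centre frequency of each rFFT bin; bin 0 (DC) is skipped, it has no pitch.
    freqs = np.arange(1, len(magnitudes)) * sample_rate / n_fft
    # MIDI note number of each bin (A4 = 69), rounded to the nearest semitone.
    midi = np.rint(69 + 12 * np.log2(freqs / ref_a4)).astype(int)
    pitch_class = midi % 12  # 0 = C, 1 = C#, ..., 9 = A
    # Accumulate the magnitude of every bin into its pitch class.
    chroma = np.zeros(12)
    np.add.at(chroma, pitch_class, magnitudes[1:])
    total = chroma.sum()
    return chroma / total if total > 0 else chroma

# Usage: a pure 440 Hz tone, with the frame length chosen so that
# 440 Hz falls exactly on an FFT bin (bin width is 1 Hz here).
sr = n_fft = 8192
t = np.arange(n_fft) / sr
frame = np.sin(2 * np.pi * 440.0 * t)
chroma = chroma_from_spectrum(np.abs(np.fft.rfft(frame)), sr, n_fft)
print(int(np.argmax(chroma)))  # index 9 corresponds to pitch class A
```

A learned front end of the kind the summary proposes would replace this fixed folding step with the output of a pre-trained network applied to the same spectrogram frames.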
68T05 Learning and adaptive systems in artificial intelligence
68T10 Pattern recognition, speech recognition
coin; R
Full Text: MNR