Classification in music research. (English) Zbl 1183.62109

Summary: For some years now, classification in music research has been a very broad and quickly growing field. Most important for adequate classification is knowledge of suitable observables or derived features on the basis of which meaningful groups or classes can be distinguished. Unsupervised classification additionally needs a suitable similarity or distance measure on which the grouping is to be based. Evaluation of supervised learning is typically based on the error rates of the classification rules. We first discuss typical problems and possibly influential features derived from signal analysis, mental mechanisms or concepts, and compositional structures. Then, we present typical solutions of such tasks related to music research, namely for the organization of music collections, transcription of music signals, cognitive psychology of music, and compositional structure analysis.

MSC:

62H30 Classification and discrimination; cluster analysis (statistical aspects)
00A65 Mathematics and music

References:

[1] Adak S (1998). Time-dependent spectral analysis of nonstationary time series. J Am Stat Assoc 93: 1488–1501 · Zbl 1064.62565
[2] Ahrendt P (2006) Music genre classification systems–a computational approach. PhD thesis, Technical University of Denmark, DTU
[3] Alonso M, David B, Richard G (2003) A study of tempo tracking algorithms from polyphonic music signals. In: Proceedings of the 4th COST 276 workshop, information and knowledge management for integrated media communication, Bordeaux, France, pp 1–5
[4] Amatriain X, Arumi P, Ramirez M (2002) CLAM, yet another library for audio and music processing? In: Proceedings of the 17th annual acm conference on Object-Oriented Programming, Systems, Languages and Applications, ACM press, Seattle, WA, USA, pp 46–47
[5] von Ameln F (2001) Blind source separation in der Praxis. Diplomarbeit, Fachbereich Statistik, Universität Dortmund, Dortmund, Germany
[6] Arenas-García J, Larsen J, Hansen LK, Meng A (2006) Optimal filtering of dynamics in short-time features for music organization. In: Proceedings of the 7th international conference on music information retrieval, Victoria, Canada, pp 290–295
[7] Aucouturier JJ, Pachet F (2002) Finding songs that sound the same. In: Proceedings of the IEEE Benelux workshop on model based processing and coding of audio, Leuven, Belgium, pp 1–8
[8] Aucouturier JJ and Pachet F (2004). Improving timbre similarity: how high is the sky. J Neg Results Speech Audio Sci 1(1): 1–13
[9] Bainbridge D, Cunningham SJ, Downie JS (2004) Visual collaging of music in a digital library. In: Proceedings of the 5th international conference on music information retrieval, pp 397–402
[10] Baumann S (2003) Music similarity analysis in a P2P environment. In: Proceedings of the 4th European workshop on image analysis for multimedia interactive services, London, UK, pp 314–319
[11] Beran J (2004). Statistics in musicology. Chapman & Hall/CRC, Boca Raton · Zbl 1033.62109
[12] Berenzweig A, Ellis D, Lawrence S (2002) Using voice segments to improve artist classification of music. In: Proceedings of the 22nd international AES conference, Espoo, Finland, pp 119–122
[13] Berenzweig A, Ellis D, Lawrence S (2003) Anchor space for classification and similarity measurement of music. In: Proceedings of the IEEE international conference on multimedia and expo, pp I–29–32
[14] Berenzweig A, Logan B, Ellis D and Whitman B (2004). A large-scale evaluation of acoustic and subjective music-similarity measures. Comput Music J 28(2): 63–76
[15] Bloomfield P (2000). Fourier analysis of time series–an introduction, 2nd edn. Wiley, New York · Zbl 0994.62093
[16] Brandenburg K, Popp H (2000) An introduction to MPEG Layer 3. EBU Technical review
[17] Breiman L, Friedman J, Olshen R and Stone C (1984). Classification and regression trees. Wadsworth, Belmont · Zbl 0541.62042
[18] Brillinger D (1975). Time series: data analysis and theory. Holt, Rinehart & Winston Inc., New York · Zbl 0321.62004
[19] Brown H, Butler D and Jones M (1994). Musical and temporal influences on key discovery. Music Percept 11(4): 371–407
[20] Bruderer M (2003) Automatic recognition of musical instruments. Master thesis, Ecole Polytechnique Fédérale de Lausanne
[21] Cano P, Loscos A, Bonada J (1999) Score-performance matching using HMMs. In: Proceedings of the international computer music conference, Beijing, China, pp 441–444
[22] Cano P, Kaltenbrunner M, Gouyon F, Battle E (2002) On the use of FastMap for audio retrieval and browsing. In: Proceedings of the 3rd international conference on music information retrieval, Paris, France, pp 275–276
[23] Cemgil A and Kappen B (2003). Monte Carlo methods for tempo tracking and rhythm quantization. J Artif Intell Res 18: 45–81 · Zbl 1045.68144
[24] Cemgil A, Kappen B, Desain P and Honing H (2001). On tempo tracking: tempogram representation and Kalman filtering. J New Music Res 29(4): 259–273
[25] Cemgil T, Desain P and Kappen B (2000). Rhythm quantization for transcription. Comput Music J 24(2): 60–76
[26] Chew E (2000) Towards a mathematical model of tonality. PhD thesis, Department of Operations Research, MIT, Cambridge
[27] Chuan CH, Chew E (2005) Audio key finding: considerations in system design, and the selecting and evaluating of solutions. In: International conference on multimedia and expo (ICME), pp 21–24
[28] Costa M, Fine P and Ricci Bitti PE (2004). Interval distribution, mode, and tonal strength of melodies as predictors of perceived emotion. Music Percept 22(1): 1–14
[29] Dahlhaus R (1997). Fitting time series models to nonstationary processes. Ann Stat 25: 1–37 · Zbl 0871.62080
[30] Davies M, Plumbley M (2004) Causal tempo tracking of audio. In: Proceedings of the 5th international conference on music information retrieval, Audiovisual Institute, Universitat Pompeu Fabra, Barcelona, Spain, pp 164–169
[31] Davy M, Godsill S (2002) Bayesian harmonic models for musical pitch estimation and analysis. Technical Report 431, Cambridge University Engineering Department, Cambridge
[32] Dixon S (1996). Multiphonic note identification. Aust Comput Sci Commun 17(1): 318–323
[33] Dixon S, Goebl W and Cambouropoulos E (2006). Perceptual smoothness of tempo in expressively performed music. Music Percept 23(3): 195–214
[34] Downie JS (1999) Evaluating a simple approach to music information retrieval: Conceiving melodic n-grams as text. PhD thesis, Faculty of Information and Media Studies, University of Western Ontario, London (Ontario), Canada, http://people.lis.uiuc.edu/~jdownie/mir_papers/thesis_missing_some_music_figs.pdf
[35] Eerola T, Järvinen T, Louhivuori J and Toiviainen P (2002). Statistical features and perceived similarity of folk melodies. Music Percept 18(3): 275–296
[36] Ellis D, Whitman B, Berenzweig A, Lawrence S (2002) The quest for ground truth in musical artist similarity. In: Proceedings of the 3rd international conference on music information retrieval, pp 170–177
[37] Evangelista G (2001). Flexible wavelets for music signal processing. J New Music Res 30(1): 13–22
[38] Faloutsos C, Lin KI (1995) FastMap: A fast algorithm for indexing, data mining and visualization of traditional and multimedia datasets. In: Carey MJ, Schneider DA (eds) Proceedings of the 1995 ACM SIGMOD international conference on management of data, San Jose, pp 163–174
[39] Flexer A, Pampalk E, Widmer G (2005) Hidden Markov models for spectral similarity of songs. In: Proceedings of the 8th international conference on digital audio effects, Madrid, Spain
[40] Foote J (2002) Audio retrieval by rhythmic similarity. In: Proceedings of the 3rd international conference on music information retrieval
[41] Foote J, Uchihashi S (2001) The beat spectrum: a new approach to rhythm analysis. In: Proceedings of the IEEE international conference on multimedia and expo, Tokyo, Japan, pp 224–228
[42] Friedman J (1989). Regularized discriminant analysis. J Am Stat Assoc 84: 165–175
[43] Fucks W (1962). Mathematical analysis of formal structure of music. IEEE Trans Inform Theory 8(5): 225–228
[44] Fucks W (1963) Mathematische Analyse von Formalstrukturen von Werken der Musik (mit Diskussion). In: Arbeitsgemeinschaft für Forschung des Landes Nordrhein-Westfalen, Westdeutscher Verlag, Köln und Opladen, pp 39–114
[45] Fucks W (1964) Gibt es mathematische Gesetze in Sprache und Musik? In: Frank H (ed) Kybernetik – Brücke zwischen den Wissenschaften, Umschau Verlag, Frankfurt am Main, pp 171–183
[46] Fucks W (1968). Nach allen Regeln der Kunst. DVA, Stuttgart
[47] Fucks W and Lauter J (1965). Exaktwissenschaftliche Musikanalyse. Westdeutscher Verlag, Köln und Opladen
[48] Godsill S, Davy M (2003) Bayesian modelling of music audio signals. In: Bulletin of the International Statistical Institute, 54th Session, Berlin, vol LX, book 2, pp 504–506
[49] Godsill S, Davy M (2005) Bayesian computational models for inharmonicity in musical instruments. In: IEEE workshop on applications of signal processing to audio and acoustics, New Paltz, NY, pp 283–286
[50] Gomez E (2004). Tonal description of polyphonic audio for music content processing. INFORMS J Comput Spec Clust Comput Music 18(3): 294–304
[51] Gomez E (2006) Tonal description of music audio signals: harmonic pitch class profiles, tonality and tonal similarity of polyphonic audio signals. PhD thesis, Departament de Tecnologia, Universitat Pompeu Fabra, Barcelona, Spain
[52] Goto M (2003) A chorus-section detecting method for musical audio signals. In: Proceedings of the IEEE international conference on acoustics, speech, and signal processing, pp 437–440
[53] Goto M (2004) A predominant-F0 estimation method for polyphonic musical audio signals. In: Proceedings of the 18th international congress on acoustics (ICA’04), Acoustical Society of Japan, Kyoto, Japan, pp 1085–1088
[54] Gouyon F (2005) A computational approach to rhythm description: Audio features for the computation of rhythm periodicity functions and their use in tempo induction and music content processing. PhD thesis, Universitat Pompeu Fabra, Departament de Tecnologia, Barcelona, Spain
[55] Gouyon F and Dixon S (2005). A review of automatic rhythm description systems. Comput Music J 29(1): 34–54
[56] Gouyon F, Klapuri A, Dixon S, Alonso M, Tzanetakis G, Uhle C and Cano P (2006). An experimental comparison of audio tempo induction algorithms. IEEE Trans Speech Audio Process 14(5): 1832–1844
[57] Gromko JE (1993). Perceptual differences between expert and novice music listeners: a multidimensional scaling analysis. Psychol Music 21: 34–47
[58] Hastie T, Tibshirani R, Friedman J (2001) The elements of statistical learning. Springer, New York, http://www-stat.stanford.edu/~tibs/ElemStatLearn/ · Zbl 0973.62007
[59] Herre J, Allamanche E, Ertel C (2003) How similar do songs sound? Towards modeling human perception of musical similarity. In: Proceedings of the IEEE workshop on applications of signal processing to audio and acoustics, pp 83–86
[60] Herrera P, Sandvold V, Gouyon F (2004) Percussion-related semantic descriptors of music audio files. In: Proceedings of the 25th international AES conference, London, United Kingdom
[61] Hyvärinen A, Karhunen J and Oja E (2001). Independent component analysis. Wiley, New York
[62] Jürgensen F, Knopke I (2004) A comparison of automated methods for the analysis of style in fifteenth-century song intabulations. In: Parncutt R, Kessler A, Zimmer F (eds) Proceedings of the conference on interdisciplinary musicology (CIM04), http://www-gewi.uni-graz.at/staff/parncutt/cim04/CIM04_paper_pdf/JurgensenKnopke.pdf
[63] Kantz H and Schreiber T (1997). Nonlinear time series analysis. Cambridge University Press, Cambridge · Zbl 0873.62085
[64] Klapuri A (2001) Multipitch estimation and sound separation by the spectral smoothness principle. In: IEEE international conference on acoustics, speech and signal processing (ICASSP), vol 5, pp 3381–3384
[65] Klapuri A (2004). Automatic music transcription as we know it today. J New Music Res 33(3): 269–282
[66] Klapuri A, Davy M (eds) (2006). Signal processing methods for music transcription. Springer, New York
[67] Kleber B (2002) Evaluation von Stimmqualität in westlichem, klassischen Gesang. Diploma Thesis, Fachbereich Psychologie, Universität Konstanz, Germany
[68] Knees P, Pampalk E, Widmer G (2004) Artist classification with web-based data. In: Proceedings of the 5th international conference on music information retrieval. Barcelona, Spain, pp 517–524
[69] Knuth D (1984). The TeXbook. Addison-Wesley, Reading
[70] Kohonen T (1995). Self-organizing maps. Springer, Berlin · Zbl 0957.68097
[71] Kopiez R, Weihs C, Ligges U and Lee JI (2006). Classification of high and low achievers in a music sight-reading task. Psychol Music 34(1): 5–26
[72] Koza JR (1992). Genetic programming: on the programming of computers by means of natural selection. MIT, Cambridge · Zbl 0850.68161
[73] Kranenburg Pv, Backer E (2004) Musical style recognition–a quantitative approach. In: Parncutt R, Kessler A, Zimmer F (eds) Proceedings of the conference on interdisciplinary musicology (CIM04), http://www-gewi.uni-graz.at/staff/parncutt/cim04/CIM04_paper_pdf/Kranenburg_Backer_CIM04_proceedings.pdf
[74] Krumhansl CL (1990). Cognitive foundations of musical pitch. Oxford Psychology Series 17. Oxford University Press, Oxford
[75] Kulesh V, Sethi I, Petrushin V (2003) Indexing and retrieval of music via Gaussian mixture models. In: Proceedings of the 3rd international workshop on content based multimedia indexing, Rennes, France, pp 201–205
[76] Kullback S and Leibler RA (1951). On information and sufficiency. Ann Math Stat 22: 79–86 · Zbl 0042.38403
[77] Kurth F, Gehrmann T, Müller M (2006) The cyclic-beat spectrum: Tempo-related audio features for time-scale invariant audio identification. In: Proceedings of the 7th international conference on music information retrieval, pp 35–40
[78] Lambrou T, Kudumakis P, Speller R, Sandler M, Linney A (1998) Classification of audio signals using statistical features on time and wavelet transform domains. In: Proceedings of the IEEE international conference on acoustics, speech, and signal processing, vol 6, pp 3621–3624
[79] Lamport L (1994). LaTeX: a document preparation system, 2nd edn. Addison-Wesley, Reading · Zbl 0824.68121
[80] Lehwark P, Risi S, Ultsch A (2007) Visualization and clustering of tagged music data. In: Proceedings GfKl 2007, Freiburg, Germany (to appear)
[81] Lesaffre M, Tanghe K, Martens G, Moelants D, Leman M, De Baets B, De Meyer H, Martens JP (2003) The MAMI query-by-voice experiment: collecting and annotating vocal queries for music information retrieval. In: Proceedings of the 4th international conference on music information retrieval, Baltimore, Maryland, USA and Library of Congress, Washington, DC, USA, pp 65–71
[82] Levy M, Sandler M (2006) Lightweight measures for timbral similarity of musical audio. In: Proceedings of the first ACM workshop on audio and music computing multimedia (AMCMM). ACM, New York, pp 27–36
[83] Li D, Sethi I, Dimitrova N and McGee T (2001). Classification of general audio data for content-based retrieval. Pattern Recogn Lett 22: 533–544 · Zbl 1010.68859
[84] Li T, Ogihara M, Li Q (2003) A comparative study on content-based music genre classification. In: Proceedings of the 26th international ACM SIGIR conference on research and development in information retrieval. ACM, New York, pp 282–289
[85] Lidy T, Rauber A (2005) Evaluation of feature extractors and psycho-acoustic transformations for music genre classification. In: Proceedings of the 6th international conference on music information retrieval, pp 34–41
[86] Ligges U (2006) Transkription monophoner Gesangszeitreihen. Dissertation, Fachbereich Statistik, Universität Dortmund, Dortmund, Germany, http://hdl.handle.net/2003/22521
[87] Logan B (2000) Mel frequency cepstral coefficients for music modeling. In: Proceedings of the first international conference on music information retrieval, pp 23–25
[88] Logan B, Salomon A (2001) A music similarity function based on signal analysis. In: Proceedings of the IEEE international conference on multimedia and expo, pp 745–748
[89] Mandel M, Ellis D (2005) Song-level features and SVMs for music classification. In: Proceedings of the 6th international conference on music information retrieval, pp 594–599
[90] Markuse B and Schneider A (1996). Ähnlichkeit, Nähe, Distanz: zur Anwendung multidimensionaler Skalierung in musikwissenschaftlichen Untersuchungen. Systematische Musikwissenschaft / Systematic Musicology / Musicologie systématique 4: 53–89
[91] McEnnis D, McKay C, Fujinaga I, Depalle P (2005) jAudio: a feature extraction library. In: Proceedings of the 6th international conference on music information retrieval, pp 600–603
[92] McKinney M, Breebaart J (2003) Features for audio and music classification. In: Proceedings of the 4th international conference on music information retrieval, pp 151–158
[93] Meng A (2006) Temporal feature integration for music organisation. PhD thesis, Informatics and Mathematical Modelling, Technical University of Denmark, DTU
[94] Meng A, Ahrendt P, Larsen J (2005) Improving music genre classification by short-time feature integration. In: Proceedings of the IEEE international conference on acoustics, speech, and signal processing, vol V, pp 497–500
[95] Meng A, Ahrendt P, Larsen J and Hansen LK (2006). Temporal feature integration for music genre classification. IEEE Trans Signal Process 15: 1654–1664
[96] Meyer J (1995) Akustik und musikalische Aufführungspraxis. Bochinsky, Frankfurt am Main
[97] Meyer LB (1957). Meaning in music and information theory. J Aesthet Art Criticism 15: 412–424
[98] Microsoft Corporation (1991) Multimedia programming interface and data specification, 1.0. Joint design by IBM Corporation and Microsoft Corporation
[99] MIDI Manufacturers Association (2001) Complete MIDI 1.0 Detailed Specification, 2nd edn, http://www.midi.org
[100] Mierswa I and Morik K (2005). Automatic feature extraction for classifying audio data. Mach Learn J 58: 127–149 · Zbl 1073.68779
[101] Mierswa I, Wurst M, Klinkenberg R, Scholz M, Euler T (2006) YALE: Rapid prototyping for complex data mining tasks. In: Proceedings of the 12th ACM SIGKDD international conference on knowledge discovery and data mining. ACM, New York, NY, USA, pp 935–940
[102] Moles A (1958). Théorie de l’information et perception esthétique. Flammarion, Paris
[103] Moles A (1971). Informationstheorie und ästhetische Wahrnehmung. DuMont Schauberg, Köln
[104] Moore BCJ and Glasberg BR (1996). A revision of Zwicker’s loudness model. Acta Acustica 82: 335–345
[105] Mörchen F, Ultsch A, Nöcker M, Stamm C (2005a) Databionic visualization of music collections according to perceptual distance. In: Proceedings of the 6th international conference on music information retrieval, pp 396–403
[106] Mörchen F, Ultsch A, Thies M, Löhken I, Nöcker M, Stamm C, Efthymiou N, Kümmerer M (2005b) MusicMiner: visualizing timbre distances of music as topographical maps. Tech. rep., Department of Mathematics and Computer Science, University of Marburg, Germany
[107] Mörchen F, Mierswa I, Ultsch A (2006a) Understandable models of music collections based on exhaustive feature generation with temporal statistics. In: Proceedings of the 12th ACM SIGKDD international conference on knowledge discovery and data mining. ACM, Philadelphia, PA, USA, pp 882–891
[108] Mörchen F, Ultsch A, Thies M and Löhken I (2006b). Modelling timbre distance with temporal statistics from polyphonic music. IEEE Trans Speech Audio Process 14(1): 81–90
[109] Müllensiefen D and Frieler K (2004a). Cognitive adequacy in the measurement of melodic similarity: Algorithmic vs. human judgments. Comput Musicol 13: 147–176
[110] Müllensiefen D, Frieler K (2004b) Optimizing measures of melodic similarity for the exploration of a large folk song database. In: 5th international conference on music information retrieval, Audiovisual Institute, Universitat Pompeu Fabra, Barcelona, Spain, pp 274–280
[111] Müllensiefen D, Hennig C (2006) Modeling memory for melodies. In: Spiliopoulou M, Kruse R, Borgelt C, Nürnberger A, Gaul W (eds) From data and information analysis to knowledge engineering, Springer, Berlin, pp 732–739
[112] Narmour E (1990) The Analysis and Cognition of Basic Melodic Structures: The Implication-Realization Model. University of Chicago Press, Chicago
[113] Nienhuys HW, Nieuwenhuizen J, et al (2005) GNU LilyPond–the music typesetter. Free Software Foundation, http://www.lilypond.org/, version 2.6.5
[114] Ombao H, Raz J, Malow B and Sachs R (2001). Automatic statistical analysis of bivariate nonstationary time series. J Am Stat Assoc 96(454): 543–560 · Zbl 1018.62080
[115] Oppenheim A, Schafer R and Buck J (1999). Discrete-time signal processing, 2nd edn. Prentice-Hall, New Jersey
[116] Pachet F, Zils A (2003) Evolving automatically high-level music descriptors from acoustic signals. In: Proceedings of the international symposium on computer music modeling and retrieval, pp 42–53
[117] Pampalk E (2004) A MATLAB toolbox to compute music similarity from audio. In: Proceedings of the 5th international conference on music information retrieval, Barcelona, Spain, pp 254–257
[118] Pampalk E (2006a) Audio-based music similarity and retrieval: Combining a spectral similarity model with information extracted from fluctuation patterns. In: 3rd Annual Music Information Retrieval eXchange (MIREX’06), http://pampalk.at/publications/
[119] Pampalk E (2006b) Computational models of music similarity and their application in music information retrieval. PhD thesis, Computer Science Department, Technical University Vienna, Austria
[120] Pampalk E, Goto M (2006) MusicRainbow: a new user interface to discover artists using audio-based similarity and web-based labeling. In: Proceedings of the 7th international conference on music information retrieval, pp 367–370
[121] Pampalk E, Rauber A, Merkl D (2002) Content-based organization and visualization of music archives. In: Proceedings of the 10th ACM international conference on multimedia, pp 570–579 · Zbl 1013.68874
[122] Pampalk E, Dixon S, Widmer G (2003a) Exploring music collections by browsing different views. In: Proceedings of the 4th international conference on music information retrieval, pp 201–208
[123] Pampalk E, Dixon S, Widmer G (2003b) On the evaluation of perceptual similarity measures for music. In: Proceedings of the international conference on digital audio effects, pp 6–12
[124] Pampalk E, Flexer A, Widmer G (2005) Hierarchical organization and description of music collections at the artist level. In: Proceedings of the 9th European conference on research and advanced technology for digital libraries, pp 37–48
[125] Pang H and Yoon D (2005). Automatic detection of vibrato in monophonic music. Pattern Recogn 38(7): 1135–1138 · Zbl 02173549
[126] Pearce MT and Wiggins GA (2004). Improved methods for statistical modelling of monophonic music. J New Music Res 33(4): 367–385
[127] Pearce MT and Wiggins GA (2006). Expectation in melody: the influence of context and learning. Music Percept 23(5): 377–405
[128] Pierce JR (1992). The science of musical sound, 2nd edn. W.H. Freeman and Co., New York
[129] Plumbley M (2003). Algorithms for nonnegative independent component analysis. IEEE Trans Neural Netw 14(3): 534–543
[130] Plumbley M (2004) Optimization using Fourier expansion over a geodesic for non-negative ICA. In: Proceedings of the international conference on independent component analysis and blind signal separation (ICA 2004), Granada, Spain, pp 49–56
[131] Plumbley M, Abdallah S, Blumensath T, Jafari M, Nesbit A, Vincent E, Wang B (2006) Musical audio analysis using sparse representations. In: COMPSTAT 2006–proceedings in computational statistics, Physica Verlag, Heidelberg, pp 104–117
[132] Pohle T (2006) Post processing music similarity computations. In: The second annual music information retrieval evaluation eXchange (MIREX 2006), pp 16–18, http://www.music-ir.org/evaluation/MIREX/2006_abstracts/AS_pohle.pdf
[133] Pohle T, Pampalk E, Widmer G (2005) Evaluation of frequently used audio features for classification of music into perceptual categories. In: Proceedings of the 4th international workshop on content-based multimedia indexing (CBMI), Riga, Latvia
[134] Polotti P, Evangelista G (2000) Harmonic-band wavelet coefficient modeling for pseudo-periodic sound processing. In: Proceedings of the COST G-6 conference on digital audio effects (DAFX-00), Verona, Italy, pp 103–108
[135] Polotti P, Evangelista G (2001) Multiresolution sinusoidal/stochastic model for voiced-sounds. In: Proceedings of the COST G-6 conference on digital audio effects (DAFX-01), Limerick, Ireland, pp 120–124
[136] Pressing J, Lawrence P (1993) Transcribe: a comprehensive autotranscription program. In: Proceedings of the international computer music conference, Tokyo, Japan, pp 343–345
[137] Pye D (2000) Content-based methods for managing electronic music. In: Proceedings of the international conference on acoustics, speech, and signal processing, pp 2437–2440
[138] R Development Core Team (2007) R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria, http://www.R-project.org, ISBN 3-900051-07-0
[139] Rabiner L and Juang BH (1993). Fundamentals of speech recognition. Prentice-Hall, New York
[140] Rabiner LR (1989). A tutorial on Hidden Markov Models and selected applications in speech recognition. Proc IEEE 77(2): 257–286
[141] Raphael C (2001). A probabilistic expert system for automatic musical accompaniment. J Comput Graph Stat 10(3): 487–512 · Zbl 04568636
[142] Risi S, Mörchen F, Ultsch A, Lehwark P (2007) Visual mining in music collections with emergent SOM. In: Proceedings workshop on self-organizing maps (WSOM) (to appear)
[143] Rossignol S, Depalle P, Soumagne J, Rodet X, Collette JL (1999a) Vibrato: detection, estimation, extraction, modification. In: Proceedings of the COST-G6 workshop on digital audio effects (DAFx-99)
[144] Rossignol S, Rodet X, Soumagne J, Collette JL and Depalle P (1999). Automatic characterisation of musical signals: feature extraction and temporal segmentation. J New Music Res 28(4): 281–295
[145] Röver C, Klefenz F, Weihs C (2005) Identification of musical instruments by means of the Hough-transformation. In: Weihs C, Gaul W (eds) Classification–the ubiquitous challenge. Springer, Berlin, pp 608–615
[146] Rubner Y, Tomasi C, Guibas LJ (1998) A metric for distributions with applications to image databases. In: Proceedings of the IEEE international conference on computer vision, Bombay, India, pp 59–66
[147] Salton G and Buckley C (1988). Term-weighting approaches in automatic text retrieval. Inform Process Manage 24(5): 513–523
[148] Sandvold V, Herrera P (2005) Towards a semantic descriptor of subjective intensity in music. In: Proceedings of the international computer music conference
[149] Schedl M, Pohle TP, Knees P, Widmer G (2006) Assigning and visualizing music genres by web-based co-occurrence analysis. In: Proceedings of the 7th international conference on music information retrieval, pp 260–265
[150] Scheirer ED (1998). Tempo and beat analysis of acoustic musical signals. J Acoust Soc Am 103(1): 588–601
[151] Schellenberg EG (1997). Simplifying the implication-realization model of melodic expectancy. Music Percept 14: 295–318
[152] Seidner W and Wendler J (1997). Die Sängerstimme. Henschel, Berlin
[153] Shao X, Xu C, Kankanhalli MS (2004) Unsupervised classification of music genre using Hidden Markov Model. In: Proceedings of the IEEE international conference on multimedia and expo, pp 2023–2026
[154] Shapiro S (1978). Feature space transforms for curve detection. Pattern Recogn 10: 129–143 · Zbl 0379.68065
[155] Smaragdis P, Brown J (2003) Non-negative matrix factorization for polyphonic music transcription. In: IEEE workshop on applications of signal processing to audio and acoustics, pp 177–180
[156] Steinbeck W (1982). Struktur und Ähnlichkeit: Methoden automatisierter Melodieanalyse. Bärenreiter, Kassel
[157] Stenzel R, Kamps T (2005) Improving content-based similarity measures by training a collaborative model. In: Proceedings of the 6th international conference on music information retrieval, pp 264–271
[158] Stevens S and Volkmann J (1940). The relation of pitch to frequency. Am J Psychol 53(3): 329–353
[159] Streich S, Herrera P (2004) Toward describing perceived complexity of songs: computational methods and implementation. In: Proceedings of the 25th international AES conference
[160] Streich S, Herrera P (2005) Detrended fluctuation analysis of music signals: Danceability estimation and further semantic characterization. In: Proceedings of the 118th AES convention
[161] Temperley D (2001). The cognition of basic musical structures. MIT, Cambridge
[162] Temperley D (2004). Bayesian models of musical structure and cognition. Music Sci 8(2): 175–205
[163] Temperley D (2006) A probabilistic model of melody perception. In: Proceedings of the 7th international conference on music information retrieval, pp 276–279, http://ismir2006.ismir.net/PAPERS/ISMIR0630_Paper.pdf
[164] Temperley D (2007). Music and probability. MIT, Cambridge · Zbl 1136.00303
[165] Thomassen J (1982). Melodic accent: experiments and a tentative model. J Acoust Soc Am 71: 1596–1605
[166] Torrens M, Hertzog P, Arcos JL (2004) Visualizing and exploring personal music libraries. In: Proceedings of the 5th international conference on music information retrieval, pp 421–424
[167] Tzanetakis G and Cook P (2000). MARSYAS: a framework for audio analysis. Organ Sound 4(3): 169–175
[168] Tzanetakis G and Cook P (2002). Musical genre classification of audio signals. IEEE Trans Speech Audio Process 10(5): 293–302
[169] Tzanetakis G, Ermolinskyi A, Cook P (2002a) Beyond the query-by-example paradigm: New query interfaces for music. In: Proceedings of the international computer music conference, pp 177–183
[170] Tzanetakis G, Ermolinskyi A, Cook P (2002b) Pitch histograms in audio and symbolic music information retrieval. In: Proceedings of the 3rd international conference on music information retrieval, pp 31–38
[171] Tzanetakis G, Essl G, Cook P (2002c) Human perception and computer extraction of beat strength. In: Proceedings of the international conference on digital audio effects (DAFx-02), pp 257–261
[172] Ultsch A (1993) Self-organizing neural networks for visualization and classification. In: Opitz O, Lausen B, Klar R (eds) Information and classification–concepts, methods, and applications, Springer, Berlin, pp 307–313
[173] Ultsch A (1996) Self organizing neural networks perform different from statistical k-means clustering. In: BMBF Statusseminar Künstliche Intelligenz, Neuroinformatik und Intelligente Systeme, München, pp 433–443
[174] Ultsch A, Mörchen F (2005) ESOM-Maps: Tools for clustering, visualization, and classification with emergent SOM. Tech. Rep. 46, Department of Mathematics and Computer Science, University of Marburg, Germany
[175] Van Trees H (2001) Detection, estimation, and modulation theory, Part I, reprint edn. Wiley-Interscience, Melbourne
[176] Vembu S, Baumann S (2005) A self-organizing map based knowledge discovery for music recommendation systems. In: Computer music modeling and retrieval, pp 119–229
[177] Vignoli F, Pauws S (2005) A music retrieval system based on user driven similarity and its evaluation. In: Proceedings of the 6th international conference on music information retrieval, pp 272–279
[178] Vignoli F, van Gulik R, van de Wetering H (2004) Mapping music in the palm of your hand, explore and discover your collection. In: Proceedings of the 5th International Conference on Music Information Retrieval, pp 409–414
[179] Viste H, Evangelista G (2001) Sounds source separation: Preprocessing for hearing aids and structured audio coding. In: Proceedings of the COST G-6 conference on digital audio effects (DAFX-01), Limerick, pp 67–70
[180] Viste H, Evangelista G (2002) An extension for source separation techniques avoiding beats. In: Proceedings of the 5th international conference on digital audio effects (DAFx-02), Hamburg, Germany, pp 71–75
[181] Wakefield G (1999) Mathematical representation of joint time-chroma distributions. In: Proceedings of the SPIE international symposium on optical science, engineering and instrumentation, Denver, Colorado, pp 637–645
[182] Walmsley P, Godsill S, Rayner P (1999) Polyphonic pitch tracking using joint Bayesian estimation of multiple frame parameters. In: IEEE workshop on applications of signal processing to audio and acoustics, New Paltz, pp 119–122
[183] Wapnick J and Ekholm E (1997). Expert consensus in solo voice performance evaluation. J Voice 11(4): 429–436
[184] Weihs C, Ligges U (2005) From local to global analysis of music time series. In: Morik K, Boulicaut JF, Siebes A (eds) Local pattern detection. Springer, Berlin, Lecture Notes in Artificial Intelligence 3539, pp 217–231
[185] Weihs C, Ligges U (2006) Parameter optimization in automatic transcription of music. In: Spiliopoulou M, Kruse R, Nürnberger A, Borgelt C, Gaul W (eds) From data and information analysis to knowledge engineering. Springer, Berlin, pp 740–747
[186] Weihs C, Berghoff S, Hasse-Becker P, Ligges U (2001) Assessment of purity of intonation in singing presentations by discriminant analysis. In: Kunert J, Trenkler G (eds) Mathematical statistics and biometrical applications. Josef Eul, Bergisch-Gladbach, pp 395–410
[187] Weihs C, Ligges U, Sommer K (2006a) Analysis of music time series. In: Rizzi A, Vichi M (eds) COMPSTAT 2006–proceedings in computational statistics. Physica Verlag, Heidelberg, pp 147–159
[188] Weihs C, Szepannek G, Ligges U, Luebke K, Raabe N (2006b) Local models in register classification by timbre. In: Batagelj V, Bock HH, Ferligoj A, Ziberna A (eds) Data science and classification, Springer, Berlin, pp 315–322
[189] West K, Cox S, Lamere P (2006) Incorporating machine-learning into music similarity estimation. In: Proceedings of the first ACM workshop on Audio and music computing multimedia (AMCMM). ACM, New York, pp 89–96
[190] Whiteley N, Cemgil A, Godsill S (2006) Bayesian modelling of temporal structure in musical audio. In: 7th international conference on music information retrieval, Victoria, Canada, pp 29–34
[191] Whittaker J (1990). Graphical models in applied multivariate statistics. Wiley, New York · Zbl 0732.62056
[192] Wolfe P, Godsill S and Ng WJ (2004). Bayesian variable selection and regularization for time-frequency surface estimation. J R Stat Soc: Ser B (Stat Methodol) 66(3): 575–589 · Zbl 1046.62028
[193] Zils A, Pachet F (2004) Automatic extraction of music descriptors from acoustic signals using EDS. In: Proceedings of the 116th AES Convention
[194] Zwicker E and Stevens S (1957). Critical bandwidths in loudness summation. J Acoust Soc Am 29(5): 548–557
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.