Zipf’s Law in Short-Time Timbral Codings of Speech,
Music, and Environmental Sound Signals
Martín Haro¹*, Joan Serrà¹,², Perfecto Herrera¹, Álvaro Corral³
1 Music Technology Group, Universitat Pompeu Fabra, Barcelona, Spain, 2 Artificial Intelligence Research Institute (IIIA-CSIC), Consejo Superior de Investigaciones
Científicas, Bellaterra, Barcelona, Spain, 3 Complex Systems Group, Centre de Recerca Matemàtica, Bellaterra, Barcelona, Spain
Abstract
Timbre is a key perceptual feature that allows discrimination between different sounds. Timbral sensations are highly
dependent on the temporal evolution of the power spectrum of an audio signal. In order to quantitatively characterize such
sensations, the shape of the power spectrum has to be encoded in a way that preserves certain physical and perceptual
properties. Therefore, it is common practice to encode short-time power spectra using psychoacoustical frequency scales. In
this paper, we study and characterize the statistical properties of such encodings, here called timbral code-words. In
particular, we report on rank-frequency distributions of timbral code-words extracted from 740 hours of audio coming from
disparate sources such as speech, music, and environmental sounds. Analogously to text corpora, we find a heavy-tailed
Zipfian distribution with exponent close to one. Importantly, this distribution is found independently of different encoding
decisions and regardless of the audio source. Further analysis on the intrinsic characteristics of most and least frequent
code-words reveals that the most frequent code-words tend to have a more homogeneou
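To make the rank-frequency idea in the abstract concrete, the following minimal sketch counts how often each code-word occurs, sorts the counts by rank, and estimates a Zipf exponent from the slope of log-frequency against log-rank. It is an illustration under stated assumptions, not the authors' procedure: the excerpt does not say how the exponent was fitted, so an ordinary least-squares fit on log-log axes is used here as a simple stand-in, the function names `rank_frequency` and `zipf_exponent` are hypothetical, and the toy data (integer labels in place of real timbral code-words) is synthetic.

```python
from collections import Counter
import math


def rank_frequency(codewords):
    """Return code-word counts sorted in descending order (rank 1 first)."""
    counts = Counter(codewords)
    return sorted(counts.values(), reverse=True)


def zipf_exponent(freqs):
    """Estimate alpha in f(r) ~ r**(-alpha) by least squares on log-log axes.

    Assumption: this simple regression is a stand-in for illustration only;
    it is not claimed to be the fitting method used in the paper.
    """
    xs = [math.log(r) for r in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return -sxy / sxx  # the slope is negative, so flip the sign


# Synthetic toy data: hypothetical code-word k occurs about 1000 / k times.
toy_codewords = [k for k in range(1, 51) for _ in range(round(1000 / k))]

freqs = rank_frequency(toy_codewords)
alpha = zipf_exponent(freqs)
print(f"distinct code-words: {len(freqs)}, estimated exponent: {alpha:.2f}")
```

Because the toy frequencies are constructed to fall off as 1/rank, the estimated exponent lands near one, which is the behaviour the abstract reports for real audio code-words.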