Spectrogram
Timbre Lingo | Timbre and Orchestration Writings
Published: October 30, 2023
Just as a frequency spectrum makes it possible to analyze the partials and noise components that make up a sound, a spectrogram can be used to visualize how these components evolve over time. The first modern device able to produce such a representation of sound was the “sound spectrograph,” developed at Bell Telephone Laboratories in the 1940s [1]. A spectrogram displays its parameters in an intuitive way. Just as in a musical score, time is represented on the x-axis and frequency on the y-axis, while energy (amplitude) is indicated by the intensity or hue of the colors.
The frequency axis (y-axis) of a spectrogram can be set to linear or logarithmic view.
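As a rough illustration of the difference between the two views, the sketch below computes where a frequency would sit on the y-axis under each scale. The function name `axis_position` and the axis limits (55 Hz to 7040 Hz, a span of seven octaves) are assumptions chosen for this example, not settings taken from any particular program.

```python
import math


def axis_position(freq_hz, f_min, f_max, scale="linear"):
    """Relative height of a frequency on the y-axis (0.0 = bottom, 1.0 = top)."""
    if scale == "linear":
        # Equal distances correspond to equal differences in Hz.
        return (freq_hz - f_min) / (f_max - f_min)
    if scale == "log":
        # Equal distances correspond to equal frequency ratios (e.g., octaves).
        return math.log(freq_hz / f_min) / math.log(f_max / f_min)
    raise ValueError("scale must be 'linear' or 'log'")


F_MIN, F_MAX = 55.0, 7040.0  # assumed axis limits for this example

for freq in (110.0, 220.0, 440.0, 880.0):  # successive octaves of A
    lin = axis_position(freq, F_MIN, F_MAX, "linear")
    log = axis_position(freq, F_MIN, F_MAX, "log")
    print(f"{freq:6.0f} Hz  linear: {lin:.3f}  log: {log:.3f}")
```

On the logarithmic axis the octaves are evenly spaced, much as they are on a keyboard, whereas on the linear axis each octave takes up twice as much room as the one below it.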
Software that produces a spectrogram typically uses Fast Fourier Transforms (FFTs) to analyze the audio signal it renders visually (thus generating the strata of yellow hues seen in the spectrograms above). Because audio signals usually have high sampling rates (typically 44,100 samples per second or higher), the software divides these samples into windows and analyzes each window in turn. The resulting spectrogram thus represents the software’s analysis of these windows. Window size refers to the number of samples analyzed within a given window.

A smaller window size includes fewer samples, so the temporal resolution of the spectrogram increases (because more windows are analyzed per unit of time). By contrast, a larger window size includes more samples in each window, decreasing the temporal resolution of the spectrogram. The opposite holds for frequency resolution: a smaller window, with its comparatively fewer samples, produces lower frequency resolution than a larger window. For a window of N samples at a sampling rate of fs, the analyzed frequencies are spaced fs/N Hz apart, so a 1024-sample window at 44,100 samples per second places them about 43 Hz apart. An FFT analysis returns values for the frequency components that are effectively averaged over each window. An analysis with larger windows might therefore conceal details of the audio signal that evolve in time, but it has more data (i.e., samples) with which to measure frequency values precisely.
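The trade-off described above can be tried out with a minimal sketch that uses only the Python standard library. Everything here is an assumption made for illustration: the small recursive FFT, the function names, the low sampling rate of 8000 samples per second (chosen so the pure-Python code runs quickly), and the synthetic test signal that switches from 500 Hz to 1000 Hz halfway through. Real spectrogram software is considerably more elaborate.

```python
import cmath
import math


def fft(samples):
    """Recursive radix-2 FFT; len(samples) must be a power of two."""
    n = len(samples)
    if n == 1:
        return [complex(samples[0])]
    even = fft(samples[0::2])
    odd = fft(samples[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def spectrogram(signal, window_size):
    """One magnitude spectrum per non-overlapping window of the signal."""
    frames = []
    for start in range(0, len(signal) - window_size + 1, window_size):
        spectrum = fft(signal[start:start + window_size])
        # Keep bins 0..N/2 only; the rest mirror them for a real signal.
        frames.append([abs(c) for c in spectrum[:window_size // 2 + 1]])
    return frames


def resolution(sample_rate, window_size):
    """Return (window duration in seconds, bin spacing in Hz)."""
    return window_size / sample_rate, sample_rate / window_size


def peak_bin(frame):
    """Index of the strongest frequency bin in one frame."""
    return max(range(len(frame)), key=frame.__getitem__)


SAMPLE_RATE = 8000  # assumed for this example, not a typical audio rate
# Hypothetical test signal: 500 Hz in the first half, 1000 Hz in the second.
signal = [
    math.sin(2 * math.pi * (500 if n < 4096 else 1000) * n / SAMPLE_RATE)
    for n in range(8192)
]

for size in (64, 1024):
    frames = spectrogram(signal, size)
    duration, spacing = resolution(SAMPLE_RATE, size)
    print(f"window={size}: {len(frames)} frames, "
          f"{duration * 1000:.0f} ms each, bins {spacing:.2f} Hz apart, "
          f"first peak ~{peak_bin(frames[0]) * spacing:.0f} Hz, "
          f"last peak ~{peak_bin(frames[-1]) * spacing:.0f} Hz")
```

The short window yields many frames, each covering only a few milliseconds, but with widely spaced frequency bins. The long window yields few frames with finely spaced bins. Both locate the two tones in this example because the test frequencies were chosen to fall exactly on a bin in either case.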
The following spectrograms show the same common loon calls as above, but with shorter and longer window sizes:
Windows can be manipulated through other procedures, including overlapping (when adjacent windows overlap with one another, so that each analyzes a portion of its neighbor’s samples), modifications to window type (the “shape” of the window, as expressed by a mathematical function), and zero padding (appending zero-valued samples to the window content before the transform, which makes the spectrum look smoother by increasing the length of the analysis but doesn’t actually improve the frequency resolution) [2].
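The three procedures above can be sketched in a few lines of standard-library Python. The helper names (`frame_starts`, `hann`, `zero_pad`), the Hann window as the example of a window type, the 50% overlap, and the 440 Hz test tone are all assumptions made for illustration and are not taken from [2].

```python
import cmath
import math


def fft(samples):
    """Recursive radix-2 FFT; len(samples) must be a power of two."""
    n = len(samples)
    if n == 1:
        return [complex(samples[0])]
    even = fft(samples[0::2])
    odd = fft(samples[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def frame_starts(num_samples, window_size, overlap):
    """Start indices of windows that share `overlap` samples with a neighbor."""
    if not 0 <= overlap < window_size:
        raise ValueError("overlap must be in [0, window_size)")
    hop = window_size - overlap
    return list(range(0, num_samples - window_size + 1, hop))


def hann(size):
    """Periodic Hann window, one common window 'shape'."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * n / size) for n in range(size)]


def zero_pad(frame, fft_size):
    """Append zeros so the frame has fft_size samples."""
    return list(frame) + [0.0] * (fft_size - len(frame))


SAMPLE_RATE = 8000  # assumed for this example
signal = [math.sin(2 * math.pi * 440 * n / SAMPLE_RATE) for n in range(1024)]

# Overlapping: 50% overlap produces more frames from the same signal.
print(len(frame_starts(len(signal), 256, 0)), "frames without overlap")
print(len(frame_starts(len(signal), 256, 128)), "frames with 50% overlap")

# Window type: taper the first frame with a Hann window before the FFT.
frame = [s * w for s, w in zip(signal[:256], hann(256))]

# Zero padding: four times as many bins, but no new information.
plain = fft(frame)
padded = fft(zero_pad(frame, 1024))
print(len(plain), "bins unpadded;", len(padded), "bins zero-padded")
```

In this sketch, every fourth bin of the zero-padded spectrum coincides with a bin of the unpadded one. The extra bins are interpolated from the same 256 samples, which is why the padded spectrum looks smoother without resolving anything new.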
Many studies on timbre and its properties rely on spectrograms. There are even several free software programs that will let you make one yourself!
[1] Koenig, W., Dunn, H. K. and Lacy, L. Y. (1946). The sound spectrograph. The Journal of the Acoustical Society of America, 18, 244. https://doi.org/10.1121/1.1902419
[2] Hill, P. R. (2019). Audio and Speech Processing with MATLAB. CRC Press.