
Spectrogram

by Amélie Bernier-Robert and Ben Duinker
Timbre Lingo | Timbre and Orchestration Writings

Published: October 30, 2023

Just as a frequency spectrum makes it possible to analyze the partials and noise components that make up a sound, a spectrogram makes it possible to visualize how these components evolve over time. The first modern device able to produce such a representation of sound was the “sound spectrograph,” developed at Bell Telephone Laboratories in the 1940s [1]. A spectrogram displays its parameters intuitively: as in a musical score, time runs along the x-axis and frequency along the y-axis, while energy (amplitude) is shown by the intensity or hue of the colors.

 

Figure 1: Spectrogram of common loon calls. We can see how the partials (horizontal strata created by the changes in intensity of the yellow hues) are equally spaced, indicating that these calls are harmonic. The lowest stratum (in the most intense yellow) roughly corresponds to the fundamental frequencies of these calls—the pitches at which we perceive them sounding.

 

The frequency axis (y-axis) of a spectrogram can be set to linear or logarithmic view.

 

Figure 2: Spectrogram of the same loon calls, but in logarithmic view. Notice the unequal spacing of the harmonics (horizontal yellow strata) when shown on a logarithmic scale. This display makes it possible to see more harmonics simultaneously without sacrificing detail in how the lowest partials are represented.
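The difference between the two views can be checked with a little arithmetic. The sketch below assumes a hypothetical fundamental of 700 Hz (a value chosen for illustration, not measured from the loon recording) and compares the gaps between neighboring harmonics in hertz, as on a linear axis, and in octaves, as on a logarithmic axis.

```python
import math

F0 = 700.0  # hypothetical fundamental in Hz, not taken from the loon recording
harmonics = [F0 * n for n in range(1, 7)]

# Linear axis: the gap between neighboring harmonics, in Hz, is constant.
linear_gaps = [b - a for a, b in zip(harmonics, harmonics[1:])]

# Logarithmic axis: position follows log2(frequency), so gaps are in octaves.
log_gaps = [math.log2(b / a) for a, b in zip(harmonics, harmonics[1:])]

for n, (lin, log) in enumerate(zip(linear_gaps, log_gaps), start=1):
    print(f"harmonic {n} -> {n + 1}: {lin:.0f} Hz, {log:.3f} octaves")
```

Every gap is 700 Hz on the linear axis, whereas on the logarithmic axis the gaps shrink from a full octave toward ever smaller fractions of one. This is why the upper harmonics crowd together in Figure 2 while the lowest partials keep their detail.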

 

Software that produces a spectrogram typically uses the Fast Fourier Transform (FFT) to analyze the audio signal it renders visually (thus generating the strata of yellow hues seen in the spectrograms above). Because audio signals usually have high sampling rates (typically 44,100 samples per second or higher), the software parcels these samples into windows so that it can analyze the signal more efficiently. The resulting spectrogram thus represents the software’s analysis of these windows.

Window size refers to the number of samples analyzed within a given window. A smaller window includes fewer samples, so the temporal resolution of the spectrogram increases (because more windows are analyzed per unit of time). By contrast, a larger window includes more samples, which decreases the temporal resolution. The relationship is reversed for frequency resolution: a smaller window, with its fewer samples, produces lower frequency resolution than a larger window. An FFT analysis returns values for the frequency components it measures that are averaged over each window. An analysis with larger windows may therefore conceal details of the audio signal that evolve in time, but it has more data (i.e., samples) with which to ensure that the detected and averaged frequency values are precise, accurate, and representative.
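The windowed analysis described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a description of how any particular program works. It uses a direct discrete Fourier transform in place of an optimized FFT (the two yield the same values, the FFT only much faster), windows that sit back to back, and a hypothetical test tone (a 500 Hz fundamental with two weaker harmonics, at an assumed 8,000 samples per second) rather than the loon recording. All function and variable names are chosen for illustration.

```python
import cmath
import math

def dft_magnitudes(frame):
    """Magnitudes of the non-negative-frequency bins of one window (direct DFT)."""
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(frame))
        mags.append(abs(acc))
    return mags

def spectrogram(signal, window_size):
    """Analyze back-to-back windows of the signal.

    Returns a list of windows (time axis); each entry is a list of
    bin magnitudes (frequency axis) that a display would map to color.
    """
    starts = range(0, len(signal) - window_size + 1, window_size)
    return [dft_magnitudes(signal[s:s + window_size]) for s in starts]

# Hypothetical harmonic test tone (assumed values, kept small so this runs quickly).
SAMPLE_RATE = 8000  # samples per second
WINDOW_SIZE = 128   # bins are 8000 / 128 = 62.5 Hz apart
F0 = 500.0          # fundamental in Hz, falls exactly on bin 8
signal = [sum(math.sin(2 * math.pi * F0 * h * t / SAMPLE_RATE) / h
              for h in (1, 2, 3))
          for t in range(4 * WINDOW_SIZE)]

frames = spectrogram(signal, WINDOW_SIZE)

# The three strongest bins of the first window, converted to Hz.
first = frames[0]
strongest = sorted(sorted(range(len(first)), key=first.__getitem__)[-3:])
partials_hz = [k * SAMPLE_RATE / WINDOW_SIZE for k in strongest]
print(partials_hz)
```

With these assumed values the strongest bins land at 500, 1,000, and 1,500 Hz. They are equally spaced, as expected for a harmonic sound, and correspond to the horizontal strata a spectrogram would draw for this tone.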

The following spectrograms show the same common loon calls as above, but with shorter and longer window sizes: 

 

Figure 3: With a smaller window size (512 samples), the oscillating pattern of the loon calls, which begins at six seconds in the figure, is rendered in higher temporal resolution. The frequency values of these calls, again seen in the yellow strata, are imprecise because each window contains relatively few samples over which they are averaged.

 
 

Figure 4: With a larger window size (8192 samples), the temporal resolution of the loon calls beginning at six seconds is greatly reduced; however, the frequencies of their components are rendered more precisely.
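The trade-off between the two window sizes in Figures 3 and 4 can be put into numbers. The sketch below assumes a sampling rate of 44,100 samples per second (the typical value mentioned above; the actual rate of the loon recording is not stated) and computes how long each window lasts and how far apart its analyzed frequencies are. The function name is chosen for illustration.

```python
def window_resolution(sample_rate, window_size):
    """Return (window duration in seconds, spacing between frequency bins in Hz)."""
    duration_s = window_size / sample_rate
    bin_spacing_hz = sample_rate / window_size
    return duration_s, bin_spacing_hz

SAMPLE_RATE = 44100  # assumed; not stated for the recording in the figures

results = {size: window_resolution(SAMPLE_RATE, size) for size in (512, 8192)}
for size, (duration_s, bin_spacing_hz) in results.items():
    print(f"{size:5d} samples: {duration_s * 1000:6.1f} ms per window, "
          f"{bin_spacing_hz:5.1f} Hz between bins")
```

At the assumed rate, a 512-sample window lasts about 11.6 ms with bins about 86.1 Hz apart, while an 8192-sample window lasts about 185.8 ms with bins about 5.4 Hz apart. The two quantities are reciprocals of one another, so any gain in one kind of resolution is paid for in the other.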

 

Windows can be manipulated through other procedures as well: overlapping (adjacent windows overlap, so that each analyzes a portion of its neighbor’s samples); modifications to window type (the “shape” of the window, as expressed by a mathematical function); and zero padding (adding zero values to the window content to increase the number of samples in the analysis, which makes the spectrum look smoother but does not actually improve the frequency resolution) [2].
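These three procedures can be illustrated together. The sketch below is one possible arrangement, with all parameter values assumed for the example: a Hann window (one common window type), a hop of 256 samples between 1024-sample windows (so that neighbors share 75% of their samples), and zero padding of each window to 2048 samples. The function names are hypothetical.

```python
import math

def hann(n):
    """Symmetric Hann window of length n (one common window type)."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]

def make_frames(signal, window_size, hop_size, padded_size):
    """Cut a signal into overlapping, Hann-shaped, zero-padded windows."""
    window = hann(window_size)
    padding = [0.0] * (padded_size - window_size)
    out = []
    # Overlap: the start advances by hop_size, which is less than window_size.
    for start in range(0, len(signal) - window_size + 1, hop_size):
        chunk = signal[start:start + window_size]
        # Window type: each sample is scaled by the window's shape.
        shaped = [x * w for x, w in zip(chunk, window)]
        # Zero padding: zeros are appended before the window is analyzed.
        out.append(shaped + padding)
    return out

signal = [1.0] * 4096  # placeholder signal, assumed for the example
frames = make_frames(signal, window_size=1024, hop_size=256, padded_size=2048)
print(len(frames), len(frames[0]))
```

Here 4096 samples yield 13 windows rather than the 4 that back-to-back windows would give, and each padded window holds 2048 values. The padding doubles the number of frequency values returned per window, but each still reflects only 1024 samples of actual signal, which is why the frequency resolution does not improve.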

Many studies on timbre and its properties rely on spectrograms. Several free software programs will even let you make one yourself.

[1] Koenig, W., Dunn, H. K. and Lacy, L. Y. (1946). The sound spectrograph. The Journal of the Acoustical Society of America, 18, 244. https://doi.org/10.1121/1.1902419

[2] Hill, P. R. (2019). Audio and Speech Processing with MATLAB. CRC Press.
