miniDSP
A small C library for audio DSP
A Shepard tone is the auditory equivalent of an M.C. Escher staircase: a sound that seems to rise (or fall) in pitch forever without ever actually leaving its frequency range. The effect was first described by cognitive scientist Roger Shepard in 1964.
miniDSP provides a single-call generator in src/minidsp_generators.c, demonstrated in examples/shepard_tone.c.
Build and run from the repository root:
The illusion rests on two ideas:
First, several sine waves are sounded simultaneously, each separated from its neighbour by one octave, and all of them glide upward (or downward) in pitch at the same rate. Second, a Gaussian envelope that stays fixed in log-frequency space sets the loudness of each tone: tones near the centre are loud, and tones near the edges are nearly silent. As a tone approaches the upper edge of the Gaussian, it fades to silence. Meanwhile, a new tone fades in at the lower edge. Because the loudest tones, the only ones the ear can latch onto, are always in the middle and always going up, the sound appears to ascend indefinitely.
This diagram shows the principle for a rising Shepard tone:
At time \(t = n / f_s\), the output sample is:
\[ x[n] \;=\; A_\text{norm}\,\sum_k\; \underbrace{ \exp\!\Bigl(-\frac{d_k(t)^2}{2\sigma^2}\Bigr) }_{\text{Gaussian envelope}} \;\sin\!\bigl(\varphi_k(n)\bigr) \]
where:
| Symbol | Meaning |
|---|---|
| \(k\) | Layer index (one per octave) |
| \(d_k(t) = k - c + R\,t\) | Octave distance from the Gaussian centre at time \(t\) |
| \(c = (L-1)/2\) | Centre of the layer range |
| \(\sigma = L/4\) | Gaussian width in octaves |
| \(R\) | Glissando rate (rate_octaves_per_sec): positive = rising |
| \(L\) | Number of audible octave layers (num_octaves) |
| \(f_k(t) = f_\text{base} \cdot 2^{d_k(t)}\) | Instantaneous frequency of layer \(k\) |
| \(\varphi_k(n)\) | Phase accumulated sample-by-sample from \(f_k\) |
| \(A_\text{norm}\) | Normalisation factor so peak amplitude equals the requested amplitude |
The Gaussian is fixed in log-frequency space while the tones glide through it. Only layers within \(\pm 5\sigma\) are computed; layers whose frequency exceeds the Nyquist frequency or falls below 20 Hz are silently skipped.
The core synthesis loop — what MD_shepard_tone() does internally:
rate_octaves_per_sec controls how fast the tones rise or fall.
| Rate | Effect |
|---|---|
| 0.0 | Static chord — no movement, just octave-spaced sines |
| 0.25 | Slow, dreamy ascent (4 seconds per octave) |
| 0.5 | Moderate rise (default) — 2 seconds per octave |
| 1.0 | Fast rise — 1 second per octave |
| −0.5 | Moderate descent — the "falling" Shepard tone |
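The "seconds per octave" figures in the table are simply the reciprocal of the rate's magnitude. Because the layers are one octave apart, that is also the time after which each layer reaches the position its neighbour started from. A small sketch of the arithmetic, using hypothetical helper names that are not part of miniDSP:

```c
#include <assert.h>
#include <math.h>

/* Hypothetical helpers, not part of miniDSP. */

/* Seconds a layer needs to glide one octave; INFINITY for a static chord. */
static double seconds_per_octave(double rate_octaves_per_sec)
{
    if (rate_octaves_per_sec == 0.0) return INFINITY;
    return 1.0 / fabs(rate_octaves_per_sec);
}

/* Signed number of octaves every layer travels during a clip. */
static double octaves_travelled(double rate_octaves_per_sec, double seconds)
{
    assert(seconds >= 0.0);
    return rate_octaves_per_sec * seconds;
}
```

For a 5-second clip at 0.5 oct/s, octaves_travelled returns 2.5, so every layer climbs two and a half octaves.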
Listen — rising at 0.5 oct/s (5 seconds):
Listen — falling at 0.5 oct/s (5 seconds):
Listen — static chord (5 seconds):
The spectrogram below shows the characteristic pattern of a rising Shepard tone: parallel diagonal lines (one per octave layer) sweeping upward through the Gaussian bell curve. Tones fade in at the bottom and fade out at the top.
The falling variant mirrors the rising pattern — diagonal lines sweep downward.
num_octaves controls how many simultaneous octave layers are present and sets the width of the Gaussian envelope (\(\sigma = L/4\)).
| Value | Typical use |
|---|---|
| 4–6 | Narrow bell — prominent entry/exit of tones; more "organ-like" |
| 8 | Default — smooth, balanced illusion |
| 10–12 | Wide bell — very gradual fading; ethereal, diffuse texture |
More octaves means more layers span the audible range at any instant, making the transitions smoother at the expense of a busier spectrum.
base_freq is the centre of the Gaussian bell curve. Tones above this frequency are weighted the same as tones an equal distance below it (the Gaussian is symmetric in log-frequency space). Typical values: 200–600 Hz.
Parameters:
| Parameter | Description |
|---|---|
| output | Caller-allocated buffer for the synthesised audio. |
| N | Number of samples to generate. Must be > 0. |
| amplitude | Peak amplitude of the output signal. |
| base_freq | Centre frequency of the Gaussian envelope in Hz (e.g. 440). |
| sample_rate | Sample rate in Hz. Must be > 0. |
| rate_octaves_per_sec | Glissando rate: positive = rising, negative = falling, 0 = static. |
| num_octaves | Number of audible octave layers (Gaussian width). Must be > 0. |
The example examples/shepard_tone.c generates a WAV file and an interactive HTML spectrogram.
Usage:
Default: rising at 0.5 oct/s, 440 Hz base, 8 octaves, 5 seconds.
Generate and listen:
Build and run:
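The Shepard tone exploits a fundamental ambiguity in pitch perception. Pitch has two dimensions: pitch height, the overall sense of how high or low a sound is, and pitch chroma, the position of a note within its octave (its note name, such as C or A).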
The Gaussian envelope removes pitch-height cues: there is no single "highest" or "lowest" tone to anchor the percept. All the listener hears is the chroma — and the chroma is always going up.
The effect is even more striking when a rising Shepard tone is followed by a falling one. Despite the falling version being physically the mirror image, many listeners perceive both as rising — a dramatic demonstration of how expectation shapes perception.
Jean-Claude Risset later extended the idea to continuous glissando (the Shepard–Risset glissando), which is exactly what MD_shepard_tone() implements: instead of discrete steps, the tones slide smoothly.