A Shepard tone is the auditory equivalent of an M.C. Escher staircase: a sound that seems to rise (or fall) in pitch forever without ever actually leaving its frequency range. The effect was first described by cognitive scientist Roger Shepard in 1964.

miniDSP provides a single-call generator in src/minidsp_generators.c, demonstrated in examples/shepard_tone.c.

Build and run from the repository root:

make -C examples shepard_tone

cd examples && ./shepard_tone

How it works

The illusion rests on two ideas:

Octave equivalence — the human ear perceives tones one octave apart as the "same note" at a different pitch height.
Spectral envelope — a fixed Gaussian bell curve in log-frequency space controls how loud each tone is. Tones near the centre are loud; tones at the edges are nearly silent.

Several sine waves sound simultaneously, each one octave from the next. All of them glide upward (or downward) in pitch at the same rate. As a tone approaches the upper edge of the Gaussian, it fades to silence. Meanwhile, a new tone enters at the lower edge, fading in. Because the loudest tones, the only ones the ear can latch onto, always sit in the middle of the envelope and always move upward, the sound appears to ascend indefinitely.

This diagram shows the principle for a rising Shepard tone:

Amplitude
  |                ╭───╮
  |              ╱       ╲          ← Gaussian spectral envelope
  |            ╱     ●     ╲          (fixed in log-frequency)
  |          ╱    ●     ●    ╲
  |        ╱   ●           ●   ╲
  |      ╱  ●                 ●  ╲
  |    ╱ ●                       ● ╲
  +--●─────────────────────────────●──→ log₂(frequency)
     ↑                               ↑
  low edge                       high edge
  (fading in)                  (fading out)
 
  ● = individual octave-spaced tones, all gliding →

Signal model

At time \(t = n / f_s\), the output sample is:

\[x[n] \;=\; A_\text{norm}\,\sum_k\; \underbrace{ \exp\!\Bigl(-\frac{d_k(t)^2}{2\sigma^2}\Bigr) }_{\text{Gaussian envelope}} \;\sin\!\bigl(\varphi_k(n)\bigr) \]

where:

Symbol	Meaning
\(k\)	Layer index (one per octave)
\(d_k(t) = k - c + R\,t\)	Octave distance from the Gaussian centre at time \(t\)
\(c = (L-1)/2\)	Centre of the layer range
\(\sigma = L/4\)	Gaussian width in octaves
\(R\)	Glissando rate (rate_octaves_per_sec): positive = rising
\(L\)	Number of audible octave layers (num_octaves)
\(f_k(t) = f_\text{base} \cdot 2^{d_k(t)}\)	Instantaneous frequency of layer \(k\)
\(\varphi_k(n)\)	Phase accumulated sample-by-sample from \(f_k\)
\(A_\text{norm}\)	Normalisation factor so peak amplitude equals the requested amplitude

The Gaussian is fixed in log-frequency space while the tones glide through it. Only layers within \(\pm 5\sigma\) of the centre are computed, and any layer whose frequency exceeds the Nyquist frequency (\(f_s/2\)) or falls below 20 Hz is silently skipped.

Reading the formula in C

The core synthesis loop — what MD_shepard_tone() does internally:

// R -> rate,  L -> num_octaves,  f_base -> base_freq,  fs -> sample_rate
// c = (L - 1) / 2,  sigma = L / 4
// d_k(t) -> d,  f_k(t) -> freq,  phi_k(n) -> phases[k]
// x[n] -> output[i]
 
double c     = (double)(num_octaves - 1) / 2.0;
double sigma = (double)num_octaves / 4.0;
 
for (unsigned i = 0; i < N; i++) {
    double t = (double)i / sample_rate;
    double sample = 0.0;
    for (int k = k_min; k <= k_max; k++) {
        double d    = (double)k - c + rate * t;       // octave distance from centre
        double freq = base_freq * pow(2.0, d);        // instantaneous frequency
        double gauss = exp(-0.5 * d * d / (sigma * sigma));  // Gaussian weight
        phases[k] += 2.0 * M_PI * freq / sample_rate; // accumulate phase
        sample += gauss * sin(phases[k]);              // add weighted sine
    }
    output[i] = sample;  // (later normalised to peak amplitude)
}
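The loop leaves output[] unnormalised. One common way to do the final step is peak normalisation, sketched below. The helper name normalise_peak is hypothetical, and whether MD_shepard_tone() normalises in exactly this way is an assumption.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Hypothetical helper (not part of miniDSP): scales buf in place so that
 * its largest absolute sample equals `amplitude`.
 * An all-zero buffer is left unchanged to avoid dividing by zero. */
void normalise_peak(double *buf, unsigned n, double amplitude)
{
    assert(buf != NULL);

    double peak = 0.0;
    for (unsigned i = 0; i < n; i++) {
        double a = fabs(buf[i]);
        if (a > peak)
            peak = a;
    }
    if (peak == 0.0)
        return;

    double scale = amplitude / peak;   /* plays the role of A_norm */
    for (unsigned i = 0; i < n; i++)
        buf[i] *= scale;
}
```

Scaling after the whole buffer has been synthesised means the peak is taken over all N samples, so the result depends on the length of the signal as well as on the parameters.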

Parameters and their effect

Glissando rate (rate_octaves_per_sec)

Controls how fast the tones rise or fall.

Rate	Effect
0.0	Static chord — no movement, just octave-spaced sines
0.25	Slow, dreamy ascent (4 seconds per octave)
0.5	Moderate rise (default) — 2 seconds per octave
1.0	Fast rise — 1 second per octave
−0.5	Moderate descent — the "falling" Shepard tone
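
The seconds-per-octave figures follow from the signal model. For a positive rate, advancing time by \(1/R\) moves every layer up exactly one octave, so layer \(k\) takes the place that layer \(k+1\) held before. A short derivation, using only the definitions of \(d_k\) and \(f_k\) given earlier:

\[ d_k\!\left(t + \tfrac{1}{R}\right) = k - c + R\,t + 1 = d_{k+1}(t), \qquad f_k\!\left(t + \tfrac{1}{R}\right) = f_\text{base}\cdot 2^{d_{k+1}(t)} = f_{k+1}(t) \]

Since the Gaussian weight depends only on \(d\), the set of frequencies and weights (though not necessarily the phases) repeats every \(1/|R|\) seconds: every 2 seconds at 0.5 oct/s and every 4 seconds at 0.25 oct/s. For a negative rate the same argument applies with the layers shifting down by one octave instead.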

Audio samples (5 seconds each): rising at 0.5 oct/s, falling at 0.5 oct/s, and a static chord.

Rising spectrogram

The spectrogram of a rising Shepard tone shows a characteristic pattern: parallel diagonal lines (one per octave layer) sweeping upward through the Gaussian bell curve. Tones fade in at the bottom and fade out at the top.

Falling spectrogram

The falling variant mirrors the rising pattern — diagonal lines sweep downward.

Number of octaves (num_octaves)

Controls how many simultaneous octave layers are present and the width of the Gaussian envelope (\(\sigma = L/4\)).

Value	Typical use
4–6	Narrow bell — prominent entry/exit of tones; more "organ-like"
8	Default — smooth, balanced illusion
10–12	Wide bell — very gradual fading; ethereal, diffuse texture

More octaves means more layers span the audible range at any instant, making the transitions smoother at the expense of a busier spectrum.

Base frequency (base_freq)

The centre of the Gaussian bell curve. Because the Gaussian is symmetric in log-frequency space, a tone a given number of octaves above this frequency gets the same weight as one the same number of octaves below it. Typical values: 200–600 Hz.

API

void MD_shepard_tone(double *output, unsigned N, double amplitude,
                     double base_freq, double sample_rate,
                     double rate_octaves_per_sec, unsigned num_octaves);

Parameters:

Parameter	Description
output	Caller-allocated buffer for the synthesised audio.
N	Number of samples to generate. Must be > 0.
amplitude	Peak amplitude of the output signal.
base_freq	Centre frequency of the Gaussian envelope in Hz (e.g. 440).
sample_rate	Sample rate in Hz. Must be > 0.
rate_octaves_per_sec	Glissando rate: positive = rising, negative = falling, 0 = static.
num_octaves	Number of audible octave layers (Gaussian width). Must be > 0.

Quick example

#include "minidsp.h"
#include <stdlib.h>
 
int main(void)
{
    // 5 seconds of endlessly rising Shepard tone at 44.1 kHz
    unsigned N = 5 * 44100;
    double *sig = malloc(N * sizeof(double));
    if (sig == NULL)
        return 1;  // allocation failed
    MD_shepard_tone(sig, N, 0.8, 440.0, 44100.0, 0.5, 8);
    // sig[] now sounds like it rises forever
    free(sig);
    return 0;
}

Example program

The example examples/shepard_tone.c generates a WAV file and an interactive HTML spectrogram.

Usage:

./shepard_tone [--rising | --falling | --static]
               [--rate OCTAVES_PER_SEC]
               [--octaves NUM]
               [--base FREQ_HZ]
               [--duration SEC]

Default: rising at 0.5 oct/s, 440 Hz base, 8 octaves, 5 seconds.

Generate and listen:

MD_shepard_tone(signal, N, 0.8, base_freq, sample_rate, rate, num_oct);

Build and run:

make -C examples shepard_tone
cd examples && ./shepard_tone
open shepard_tone.html     # interactive spectrogram

Why it works — the psychoacoustics

The Shepard tone exploits a fundamental ambiguity in pitch perception. Pitch has two dimensions:

Pitch chroma — which note it is (C, D, E, …), determined by the position within the octave.
Pitch height — how high or low it sounds overall.

The Gaussian envelope removes pitch-height cues: there is no single "highest" or "lowest" tone to anchor the percept. All the listener hears is the chroma — and the chroma is always going up.

The effect is even more striking when a rising Shepard tone is followed by a falling one. Despite the falling version being physically the mirror image, many listeners perceive both as rising — a dramatic demonstration of how expectation shapes perception.

Jean-Claude Risset later extended the idea to a continuous glissando (the Shepard–Risset glissando), which is what MD_shepard_tone() implements: instead of moving in discrete steps, the tones slide smoothly.