
A small C library of DSP (Digital Signal Processing) routines for audio applications. Builds and runs on Ubuntu and macOS. Windows is untested — pull requests welcome.
Read the full documentation – API reference, tutorials, and interactive examples.
AI agents: fetch llms-full.txt for the complete API reference and tutorials in a single markdown file.
What's in the box?
Spectral Analysis (minidsp.h)
- Magnitude spectrum – compute |X(k)| from a real signal using the FFT; the foundation of frequency-domain analysis.
- Power spectral density – compute |X(k)|^2 / N (periodogram); shows how signal power distributes across frequencies.
- Phase spectrum – compute arg(X(k)) in radians; reveals the timing of each frequency component and is a prerequisite for phase-vocoder effects.
- Spectrogram (STFT) – sliding-window FFT producing a time-frequency magnitude matrix; the standard tool for visualising time-varying audio.
- Mel filterbank – triangular filters spaced on the mel scale for perceptually motivated spectral features.
- MFCCs – mel-frequency cepstral coefficients from log mel energies via DCT-II (C0 included).
- Window functions – Hanning, Hamming, Blackman, rectangular, and Kaiser windows for FFT analysis trade-off studies.
Signal Generators (minidsp.h)
- Sine wave generator – pure tone at a given frequency and amplitude; the "hello world" of DSP.
- White noise generator – Gaussian random samples with configurable seed; used to test filters and measure impulse responses.
- Impulse generator – single unit-sample spike at a given position; reveals a system's impulse response directly.
- Chirp generators – linear and logarithmic frequency sweeps; great for testing filter magnitude response across a frequency range.
- Square wave generator – bipolar square wave at a given frequency; demonstrates odd harmonics and Gibbs phenomenon.
- Sawtooth wave generator – linear ramp waveform at a given frequency; contains both odd and even harmonics.
- Shepard tone generator – the auditory illusion of endlessly rising or falling pitch, using octave-spaced sine waves with a Gaussian spectral envelope.
Voice Activity Detection (minidsp.h)
- Frame-at-a-time VAD – combines five normalized audio features (energy, zero-crossing rate, spectral entropy, spectral flatness, band energy ratio) into a weighted score with adaptive EMA normalization, onset gating, and hangover smoothing.
Spectrogram Text Art (minidsp.h)
- Spectrogram text synthesis – render a text string as audio whose spectrogram displays the message; uses a built-in 5x7 bitmap font with sine-wave synthesis, raised-cosine crossfade, and peak normalisation.
DTMF (minidsp.h)
- DTMF tone generation – synthesise multi-digit dial-tone sequences (dual sine pairs at standard row/column frequencies) with configurable timing.
- DTMF tone detection – sliding-window FFT detector with ITU-T Q.24 timing constraints; returns decoded digits with onset/offset timestamps.
Audio Steganography (minidsp.h)
- LSB steganography – hide messages or binary data in the lowest bit of 16-bit PCM samples; high capacity, inaudible distortion (~-90 dB).
- Frequency-band steganography – hide data as near-ultrasonic BFSK tones (18.5/19.5 kHz); lower capacity but more robust to noise.
- Spectrogram-text steganography – hybrid: LSB data encoding + spectrogram text art in the 18–23.5 kHz band; message is machine-decodable and visually readable in any spectrogram viewer.
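The LSB idea can be sketched in a few lines: each message bit overwrites the lowest bit of one 16-bit sample, so no sample changes by more than 1 out of 32768, or about 20*log10(1/32768) = -90 dB relative to full scale. The function names and the bit order (most significant bit first) are assumptions for this example; the library's framing and header format are not shown.

```c
#include <stddef.h>
#include <stdint.h>

/* Hide msg_len bytes in the lowest bit of consecutive 16-bit samples,
 * most significant bit of each byte first (assumed bit order).
 * Returns 0 on success, -1 if the host signal is too short. */
int lsb_embed(int16_t *samples, size_t num_samples,
              const unsigned char *msg, size_t msg_len)
{
    if (num_samples / 8 < msg_len)
        return -1;
    for (size_t i = 0; i < msg_len * 8; i++) {
        int bit = (msg[i / 8] >> (7 - i % 8)) & 1;
        samples[i] = (int16_t)((samples[i] & ~1) | bit);
    }
    return 0;
}

/* Recover msg_len bytes from the lowest bits of the first 8*msg_len samples. */
void lsb_extract(const int16_t *samples, unsigned char *msg_out, size_t msg_len)
{
    for (size_t i = 0; i < msg_len; i++) {
        unsigned char byte = 0;
        for (size_t b = 0; b < 8; b++)
            byte = (unsigned char)((byte << 1) | (samples[i * 8 + b] & 1));
        msg_out[i] = byte;
    }
}
```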
Filters (biquad.h)
Seven classic audio filter types, all based on Robert Bristow-Johnson's Audio EQ Cookbook:
- Low-pass, High-pass, Band-pass, Notch
- Peaking EQ, Low shelf, High shelf
Delay Estimation (minidsp.h)
- GCC-PHAT – estimate the time delay between two microphone signals using Generalized Cross-Correlation with Phase Transform. This is the core of acoustic source localisation.
Signal Analysis (minidsp.h)
- RMS – root mean square, the standard signal loudness measure.
- Zero-crossing rate – fraction of adjacent samples that change sign; simple proxy for pitch and noisiness.
- Autocorrelation – normalised self-similarity at different lags; foundation of pitch detection.
- Peak detection – find local maxima above a threshold with minimum-distance suppression.
- F0 estimation (autocorrelation) – estimate pitch by finding the dominant autocorrelation lag in a frequency range.
- F0 estimation (FFT peak-pick) – estimate pitch from the dominant Hann-windowed spectral peak.
- Signal mixing – weighted sum of two signals; needed for any multi-source demo.
Simple Effects (minidsp.h)
- Delay line / echo – circular-buffer delay with feedback; the building block of many audio effects.
- Tremolo – amplitude modulation by a low-frequency oscillator.
- Comb-filter reverb – feedback comb filter introducing a reverb-like decay tail.
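The circular-buffer idea behind the echo and the comb-filter reverb fits in a short sketch. It implements the feedback comb y[n] = x[n] + feedback * y[n - delay]; the type, the names, and the fixed buffer capacity are assumptions for this example and do not reflect the library's interface.

```c
#include <stddef.h>

#define ECHO_MAX_DELAY 4096   /* hypothetical capacity in samples */

typedef struct {
    double buf[ECHO_MAX_DELAY];  /* circular buffer of past outputs */
    size_t delay;                /* delay in samples, 1 .. ECHO_MAX_DELAY */
    size_t pos;                  /* index of the oldest stored output */
    double feedback;             /* |feedback| < 1 gives a decaying tail */
} echo_line;

void echo_init(echo_line *e, size_t delay, double feedback)
{
    for (size_t i = 0; i < ECHO_MAX_DELAY; i++)
        e->buf[i] = 0.0;
    e->delay = (delay >= 1 && delay <= ECHO_MAX_DELAY) ? delay : 1;
    e->pos = 0;
    e->feedback = feedback;
}

/* Process one sample: y[n] = x[n] + feedback * y[n - delay]. */
double echo_step(echo_line *e, double x)
{
    double y = x + e->feedback * e->buf[e->pos];  /* buf[pos] is y[n - delay] */
    e->buf[e->pos] = y;                           /* overwrite oldest with newest */
    e->pos = (e->pos + 1) % e->delay;
    return y;
}
```

Feeding a single impulse through a line with a delay of 3 samples and a feedback of 0.5 yields 1, 0, 0, 0.5, 0, 0, 0.25, and so on: each echo is half as loud as the one before it.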
FIR Filters / Convolution (minidsp.h)
- Time-domain convolution – direct full linear convolution for teaching and validation.
- Moving-average filter – simplest causal FIR low-pass with zero-padded startup.
- General FIR filter – apply arbitrary tap coefficients to build custom FIR responses.
- FFT overlap-add convolution – fast full convolution for longer kernels.
- Lowpass FIR design – Kaiser-windowed sinc lowpass filter with configurable cutoff and stopband attenuation.
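To show what "full linear convolution" and "zero-padded startup" mean in practice, here is a minimal sketch of both. The names are hypothetical and the code assumes non-empty inputs and a window of at least one sample; it illustrates the definitions rather than the library's routines.

```c
/* Full linear convolution by direct sum of products.
 * out must hold xlen + hlen - 1 samples; xlen and hlen must be >= 1. */
void sketch_convolve(const double *x, unsigned xlen,
                     const double *h, unsigned hlen, double *out)
{
    for (unsigned n = 0; n < xlen + hlen - 1; n++)
        out[n] = 0.0;
    for (unsigned i = 0; i < xlen; i++)
        for (unsigned j = 0; j < hlen; j++)
            out[i + j] += x[i] * h[j];
}

/* Causal moving average over `window` samples (window >= 1).
 * Samples before the start of the signal count as zero, so the first
 * window - 1 outputs ramp up instead of jumping to the full average. */
void sketch_moving_average(const double *x, unsigned n,
                           unsigned window, double *out)
{
    double sum = 0.0;
    for (unsigned i = 0; i < n; i++) {
        sum += x[i];
        if (i >= window)
            sum -= x[i - window];   /* drop the sample leaving the window */
        out[i] = sum / (double)window;
    }
}
```

For example, convolving {1, 2, 3} with {1, 1} gives the four samples {1, 3, 5, 3}, and a 3-sample moving average of a constant 3 starts at 1, then 2, then settles at 3.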
Sample Rate Conversion (minidsp.h)
- Polyphase sinc resampler – high-quality offline resampling between arbitrary sample rates (e.g., 44100 to 48000 Hz) using a 512-phase Kaiser-windowed sinc interpolation filter with >100 dB stopband attenuation.
- Math utilities – zeroth-order modified Bessel function (I₀) and normalized sinc function, useful as standalone building blocks.
Signal Measurement (minidsp.h)
- Signal measurements – energy, power, power in dB, normalised entropy.
- Scaling & AGC – linear range mapping, automatic gain control.
File I/O (fileio.h)
- Read audio files in any of the 20+ formats supported by libsndfile (WAV, FLAC, AIFF, OGG, and more)
- Write audio to WAV (IEEE float for lossless DSP round-trips)
- Write feature vectors in NumPy .npy format (for Python interop)
- Write feature vectors in safetensors format (for ML pipelines)
- Write feature vectors in HTK binary format (deprecated)
Live Audio I/O (liveio.h)
- Record from the microphone and play back to speakers via PortAudio
- Non-blocking API with callback support
Build and Test
Dependencies
Install the following libraries before building:
| Library | Purpose | Debian/Ubuntu | macOS (Homebrew) |
| FFTW3 | Fast Fourier Transform | apt install libfftw3-dev | brew install fftw |
| PortAudio | Live audio I/O | apt install portaudio19-dev | brew install portaudio |
| libsndfile | Audio file reading | apt install libsndfile1-dev | brew install libsndfile |
| Doxygen | API docs generation (optional) | apt install doxygen | brew install doxygen |
| Apple container | Linux container testing (optional) | — | Install from GitHub |
The Makefiles auto-detect Homebrew paths on macOS (both Apple Silicon and Intel).
Compile the library
make # builds libminidsp.a
Run the test suite
make test # builds and runs all tests
Test inside an Ubuntu container
To verify the library builds and passes all tests on Linux (Ubuntu 24.04):
make container-test # builds image, then runs make test inside the container
This requires the Apple container CLI on macOS.
Generate API documentation
make docs # generates HTML docs in docs/html
Install git hooks
A pre-push hook is included that runs make test and make container-test before allowing pushes to main:
Use in your project
Install the dependencies listed above, then clone and build:
git clone https://github.com/wooters/miniDSP.git
cd miniDSP
make
This produces libminidsp.a in the repo root.
A minimal program – generate a sine wave and find its peak frequency bin:
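#include <stdio.h>
#include "minidsp.h"

int main(void) {
    double signal[1024];
    /* 16 kHz sample rate as in the printf below; 440 Hz and amplitude 1.0 are illustrative */
    MD_sine_wave(signal, 1024, 1.0, 440.0, 16000.0);
    double mag[1024 / 2 + 1];
    MD_magnitude_spectrum(signal, 1024, mag);
    unsigned peak = 0;
    for (unsigned k = 1; k < 1024 / 2 + 1; k++)
        if (mag[k] > mag[peak]) peak = k;
    printf("Peak bin: %u (%.1f Hz)\n", peak, peak * 16000.0 / 1024);
    MD_shutdown();   /* free cached FFT plans and buffers */
}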
Header and functions referenced by this example:
- minidsp.h – a mini library of DSP (Digital Signal Processing) routines.
- void MD_sine_wave(double *output, unsigned N, double amplitude, double freq, double sample_rate) – generate a sine wave.
- void MD_shutdown(void) – free all internally cached FFT plans and buffers.
- void MD_magnitude_spectrum(const double *signal, unsigned N, double *mag_out) – compute the magnitude spectrum of a real-valued signal.
Save the code above as my_program.c.
Compile it directly:
gcc -std=c17 -Ipath/to/miniDSP/include my_program.c \
-Lpath/to/miniDSP -lminidsp -lfftw3 -lm -o my_program
Or use a Makefile (adapts to Homebrew on macOS automatically):
MINIDSP_DIR = path/to/miniDSP
CC = gcc
CFLAGS = -std=c17 -Wall -Wextra -I$(MINIDSP_DIR)/include
LDFLAGS = -L$(MINIDSP_DIR)
LDLIBS = -lminidsp -lfftw3 -lm
BREW_PREFIX := $(shell brew --prefix 2>/dev/null)
ifneq ($(BREW_PREFIX),)
CFLAGS += -I$(BREW_PREFIX)/include
LDFLAGS += -L$(BREW_PREFIX)/lib
endif
my_program: my_program.c
	$(CC) $(CFLAGS) $(LDFLAGS) $< $(LDLIBS) -o $@
If you use fileio.h for reading or writing audio files, add -lsndfile to LDLIBS.
Quick examples
For step-by-step walkthroughs of these and other topics, see the Tutorials in the full documentation.
Detect the delay between two signals
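double mic_a[4096], mic_b[4096];   /* fill both with audio first */
double ent;                         /* output written through the ent parameter */
int delay = MD_get_delay(mic_a, mic_b, 4096, &ent, 50, PHAT);   /* margin of 50 is illustrative */
printf("Signal B is %d samples behind signal A\n", delay);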
Constant and function referenced by this example:
- PHAT – Phase Transform weighting (sharper peaks, more robust to noise).
- int MD_get_delay(const double *siga, const double *sigb, unsigned N, double *ent, unsigned margin, int weightfunc) – estimate the delay between two signals.
Compute the magnitude spectrum
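double signal[1024];   /* fill with audio samples first */
unsigned num_bins = 1024 / 2 + 1;
double *mag = malloc(num_bins * sizeof(double));
MD_magnitude_spectrum(signal, 1024, mag);   /* mag[k] = |X(k)| for k = 0 .. N/2 */
free(mag);
MD_shutdown();   /* free cached FFT plans and buffers */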
A full example with Hanning windowing is in examples/magnitude_spectrum.c. Run it to generate an interactive HTML plot (Plotly.js + D3.js):
make -C examples plot
open examples/magnitude_spectrum.html # interactive: zoom, pan, hover for values
For a step-by-step walkthrough of the DSP concepts, see the Magnitude Spectrum tutorial.
Compute the power spectral density
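double signal[1024];   /* fill with audio samples first */
unsigned num_bins = 1024 / 2 + 1;
double *psd = malloc(num_bins * sizeof(double));
MD_power_spectral_density(signal, 1024, psd);   /* psd[k] = |X(k)|^2 / N */
free(psd);
MD_shutdown();   /* free cached FFT plans and buffers */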
Function referenced by this example:
- void MD_power_spectral_density(const double *signal, unsigned N, double *psd_out) – compute the power spectral density (PSD) of a real-valued signal.
A full example with Hanning windowing and one-sided PSD conversion is in examples/power_spectral_density.c. See the PSD tutorial for a detailed explanation.
Compute a spectrogram (STFT)
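double signal[32000];   /* fill with audio samples first */
unsigned N = 512;       /* FFT window length in samples */
unsigned hop = 128;     /* samples between successive frames */
unsigned num_frames = MD_stft_num_frames(32000, N, hop);
unsigned num_bins = N / 2 + 1;
double *mag = malloc(num_frames * num_bins * sizeof(double));
MD_stft(signal, 32000, N, hop, mag);
free(mag);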
Functions referenced by this example:
- void MD_stft(const double *signal, unsigned signal_len, unsigned N, unsigned hop, double *mag_out) – compute the Short-Time Fourier Transform (STFT) of a real-valued signal.
- unsigned MD_stft_num_frames(unsigned signal_len, unsigned N, unsigned hop) – compute the number of STFT frames for the given signal length and parameters.
A full example generating an interactive HTML heatmap is in examples/spectrogram.c. See the Spectrogram tutorial for a step-by-step explanation.
Estimate fundamental frequency (F0)
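double frame[1024];   /* fill with one frame of audio first */
double f0_acf =
    MD_f0_autocorrelation(frame, 1024, 16000.0, 80.0, 400.0);
double f0_fft =
    MD_f0_fft(frame, 1024, 16000.0, 80.0, 400.0);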
Functions referenced by this example:
- double MD_f0_fft(const double *signal, unsigned N, double sample_rate, double min_freq_hz, double max_freq_hz) – estimate the fundamental frequency (F0) using FFT peak picking.
- double MD_f0_autocorrelation(const double *signal, unsigned N, double sample_rate, double min_freq_hz, double max_freq_hz) – estimate the fundamental frequency (F0) using autocorrelation.
A runnable frame-tracking example is in examples/pitch_detection.c. See the Pitch Detection tutorial for method comparison and visuals.
Generate and detect DTMF tones
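#include <stdio.h>
#include "minidsp.h"
const char *digits = "8675309";
unsigned len = MD_dtmf_signal_length(7, 8000.0, 70, 70);   /* 8 kHz, 70 ms tones, 70 ms pauses: illustrative */
double signal[len];
MD_dtmf_generate(signal, digits, 8000.0, 70, 70);
MD_DTMFTone tones[16];
unsigned n = MD_dtmf_detect(signal, len, 8000.0, tones, 16);
for (unsigned i = 0; i < n; i++)
    printf(" %c %.3f–%.3f s\n", tones[i].digit, tones[i].start_s, tones[i].end_s);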
Functions and types referenced by this example:
- unsigned MD_dtmf_detect(const double *signal, unsigned signal_len, double sample_rate, MD_DTMFTone *tones_out, unsigned max_tones) – detect DTMF tones in an audio signal.
- void MD_dtmf_generate(double *output, const char *digits, double sample_rate, unsigned tone_ms, unsigned pause_ms) – generate a DTMF tone sequence.
- unsigned MD_dtmf_signal_length(unsigned num_digits, double sample_rate, unsigned tone_ms, unsigned pause_ms) – calculate the number of samples needed for MD_dtmf_generate().
- MD_DTMFTone – a single detected DTMF tone with timing information.
A full example with WAV file I/O is in examples/dtmf_detector.c. See the DTMF tutorial.
Generate a Shepard tone (endlessly rising pitch)
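#include <stdlib.h>
#include "minidsp.h"
unsigned N = 5 * 44100;   /* 5 seconds at 44.1 kHz */
double *sig = malloc(N * sizeof(double));
/* illustrative arguments: amplitude 0.8, base 440 Hz, 0.5 octaves/s, 8 octaves */
MD_shepard_tone(sig, N, 0.8, 440.0, 44100.0, 0.5, 8);
free(sig);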
Function referenced by this example:
- void MD_shepard_tone(double *output, unsigned N, double amplitude, double base_freq, double sample_rate, double rate_octaves_per_sec, unsigned num_octaves) – generate a Shepard tone, the auditory illusion of endlessly rising or falling pitch.
A runnable example with spectrogram visualisation is in examples/shepard_tone.c. See the Shepard Tone tutorial.
Compute mel energies and MFCCs
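double frame[1024];   /* fill with one frame of audio first */
double mel[26];
double mfcc[13];
MD_mel_energies(frame, 1024, 16000.0, 26, 80.0, 7600.0, mel);
MD_mfcc(frame, 1024, 16000.0, 26, 13, 80.0, 7600.0, mfcc);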
Functions referenced by this example:
- void MD_mel_energies(const double *signal, unsigned N, double sample_rate, unsigned num_mels, double min_freq_hz, double max_freq_hz, double *mel_out) – compute mel-band energies from a single frame.
- void MD_mfcc(const double *signal, unsigned N, double sample_rate, unsigned num_mels, unsigned num_coeffs, double min_freq_hz, double max_freq_hz, double *mfcc_out) – compute mel-frequency cepstral coefficients (MFCCs) from a single frame.
A runnable example is in examples/mel_mfcc.c. See the Mel/MFCC tutorial.
FIR filtering and convolution
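double x[1024];   /* fill with audio samples first */
double h[] = {0.2, 0.6, 0.2};
unsigned ylen = MD_convolution_num_samples(1024, 3);
double *y_time = malloc(ylen * sizeof(double));
double *y_fft = malloc(ylen * sizeof(double));
double y_fir[1024];
double y_ma[1024];
MD_convolution_time(x, 1024, h, 3, y_time);
MD_convolution_fft_ola(x, 1024, h, 3, y_fft);
MD_fir_filter(x, 1024, h, 3, y_fir);
MD_moving_average(x, 1024, 5, y_ma);   /* window length of 5 is illustrative */
free(y_fft);
free(y_time);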
Functions referenced by this example:
- void MD_convolution_fft_ola(const double *signal, unsigned signal_len, const double *kernel, unsigned kernel_len, double *out) – full linear convolution using FFT overlap-add (offline).
- unsigned MD_convolution_num_samples(unsigned signal_len, unsigned kernel_len) – compute the output length of a full linear convolution.
- void MD_fir_filter(const double *signal, unsigned signal_len, const double *coeffs, unsigned num_taps, double *out) – apply a causal FIR filter with arbitrary coefficients.
- void MD_convolution_time(const double *signal, unsigned signal_len, const double *kernel, unsigned kernel_len, double *out) – time-domain full linear convolution (direct sum-of-products).
- void MD_moving_average(const double *signal, unsigned signal_len, unsigned window_len, double *out) – causal moving-average FIR filter with zero-padded startup.
A runnable example is in examples/fir_convolution.c. See the FIR/Convolution tutorial.
Filter audio with a low-pass biquad
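/* Assumptions: LPF is the low-pass type constant from biquad.h; input, output, and num_samples are declared by the caller. */
/* Illustrative settings: 0 dB gain, 1 kHz cutoff, 44.1 kHz sample rate, bandwidth 1.0. */
biquad *lpf = BiQuad_new(LPF, 0.0, 1000.0, 44100.0, 1.0);
for (int i = 0; i < num_samples; i++) {
    output[i] = BiQuad(input[i], lpf);
}
free(lpf);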
Header, functions, and types referenced by this example:
- biquad.h – biquad (second-order IIR) filter interface.
- smp_type BiQuad(smp_type sample, biquad *b) – process a single sample through the filter and return the result.
- biquad * BiQuad_new(int type, smp_type dbGain, smp_type freq, smp_type srate, smp_type bandwidth) – create and initialise a new biquad filter.
- biquad – state and coefficients for a single biquad filter section.
Tools
Standalone programs built on miniDSP.
mel_viz – Mel-Spectrum Audio Visualizer
Browser-based radial animation driven by mel-spectrum analysis. Supports WAV file playback and live microphone input, with four color palettes and real-time visual knobs.
make tools
./tools/mel_viz/mel_viz samples/punchy_slap_bass_30s.wav -o /tmp/viz
cd /tmp/viz && python3 -m http.server 8000
# open http://localhost:8000
The samples/ directory contains audio files ready to use with mel_viz. See the mel_viz documentation for details.
audio_steg – Audio Steganography Tool
Hide and recover secret messages or binary data (text, images) in WAV files. Supports LSB, frequency-band, and spectrogram-text steganography methods with auto-detection on decode.
make tools
./tools/audio_steg/audio_steg --encode lsb "secret message" -i host.wav -o stego.wav
./tools/audio_steg/audio_steg --decode stego.wav
resample – Sample Rate Converter
Convert a mono audio file to a different sample rate using polyphase sinc interpolation. Reads any format supported by libsndfile; writes WAV (IEEE float).
make tools
./tools/resample/resample input.wav 48000 output.wav
./tools/resample/resample -z 64 -b 14.0 input.wav 16000 output.wav # custom quality
Optional flags control the sinc interpolation filter quality:
- -z N — Number of sinc zero-crossings per side (default: 32). More zero-crossings produce a sharper cutoff and better stopband rejection at the cost of speed.
- -b F — Kaiser window beta parameter (default: 10.0). Higher values widen the mainlobe but deepen stopband attenuation (10.0 gives >100 dB rejection).
The defaults work well for most use cases.
Optimization
VAD Hyperparameter Tuning
The VAD default parameters shipped by MD_vad_default_params() were optimized via a 300-trial Optuna search on LibriVAD train-clean-100 (7 560 files, 9 noise types, 6 SNR levels), maximizing F2 (beta=2):
| Metric | Baseline | Optimized |
| F2 | 0.837 | 0.933 |
| Precision | 0.835 | 0.782 |
| Recall | 0.838 | 0.981 |
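The F2 row can be reproduced from the precision and recall rows with the standard F-beta formula, F = (1 + beta^2) * P * R / (beta^2 * P + R). The small helper below is a hypothetical illustration, not part of the optimizer code.

```c
/* F-beta score: a weighted harmonic mean of precision and recall.
 * beta > 1 weights recall more heavily than precision. */
double f_beta(double precision, double recall, double beta)
{
    double b2 = beta * beta;
    double denom = b2 * precision + recall;
    return (denom > 0.0) ? (1.0 + b2) * precision * recall / denom : 0.0;
}
```

With beta = 2, the baseline (P = 0.835, R = 0.838) comes out at about 0.837 and the optimized setting (P = 0.782, R = 0.981) at about 0.933, matching the table. The lower precision is outweighed by the much higher recall because F2 counts recall four times as heavily in the denominator.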
The optimizer uses a TPE (Tree-structured Parzen Estimator) sampler — a Bayesian algorithm that builds a probabilistic model of which parameters produce good results, making each successive trial smarter than random search.
See the VAD tutorial guide for full methodology, per-condition results, and parameter analysis. To re-tune on your own data, see optimize/VAD/README.md; re-tuning requires uv.
Python Bindings
Python bindings for miniDSP are available in the pyminidsp repository.