Spectral Analysis

FFT-based frequency-domain analysis: magnitude spectrum, power spectral density, phase spectrum, STFT, mel filterbanks, and MFCCs.

pyminidsp.lowpass_brickwall(signal, cutoff_hz, sample_rate)[source]

Apply an FFT-based ideal (brickwall) lowpass filter.

Zeroes all frequency bins above the cutoff frequency. Operates by copying the signal, applying the filter in-place, and returning the result.

Parameters:
  • signal (ArrayLike) – Input signal.

  • cutoff_hz (float) – Cutoff frequency in Hz.

  • sample_rate (float) – Sampling rate in Hz.

Returns:

numpy array of the same length as the input.

Return type:

ndarray[tuple[Any, …], dtype[float64]]

Apply an FFT-based ideal (brickwall) lowpass filter. Zeroes all frequency bins above the cutoff frequency. Unlike FIR filters, a brickwall filter has a perfectly sharp transition — no passband ripple, no transition band — but can introduce ringing (Gibbs phenomenon).

Parameters:
  • signal (ArrayLike) – Input signal.

  • cutoff_hz (float) – Cutoff frequency in Hz.

  • sample_rate (float) – Sampling rate in Hz.

Returns:

Filtered signal (same length as input).

Return type:

ndarray[tuple[Any, …], dtype[float64]]

signal = md.sine_wave(4096, freq=440.0, sample_rate=44100.0)
filtered = md.lowpass_brickwall(signal, cutoff_hz=1000.0, sample_rate=44100.0)
pyminidsp.magnitude_spectrum(signal)[source]

Compute the magnitude spectrum of a real-valued signal.

Returns:

numpy array of length N/2 + 1 containing magnitudes.

Parameters:

signal (ArrayLike)

Return type:

ndarray[tuple[Any, …], dtype[float64]]

Compute the magnitude spectrum of a real-valued signal.

Given a signal of length N, computes the FFT and returns |X(k)| for each frequency bin. Because the input is real-valued, only the first N/2 + 1 bins are unique (from DC through Nyquist).

To convert bin index k to a frequency in Hz:

freq_k = k * sample_rate / N

The output is not normalised by N. Divide each value by N for the standard DFT magnitude, or by N/2 (except DC and Nyquist) for single-sided amplitude.

Parameters:

signal (ArrayLike) – Input signal.

Returns:

Array of length N // 2 + 1.

Return type:

ndarray[tuple[Any, …], dtype[float64]]

signal = md.sine_wave(1024, freq=440.0, sample_rate=44100.0)
mag = md.magnitude_spectrum(signal)
# mag[0]   = DC component magnitude
# mag[k]   = magnitude at frequency k * 44100 / 1024
# mag[512] = Nyquist frequency magnitude
pyminidsp.power_spectral_density(signal)[source]

Compute the power spectral density (PSD) of a signal.

Returns:

numpy array of length N/2 + 1 containing power values.

Parameters:

signal (ArrayLike)

Return type:

ndarray[tuple[Any, …], dtype[float64]]

Compute the power spectral density (PSD) — the periodogram estimator:

\[\text{PSD}[k] = \frac{|X(k)|^2}{N}\]

While the magnitude spectrum tells you the amplitude at each frequency, the PSD tells you the power — useful for noise analysis, SNR estimation, and comparing signals of different lengths.

Parseval’s theorem (energy conservation): PSD[0] + 2 * sum(PSD[1:N//2]) + PSD[N//2] == sum(x[n]**2)

Parameters:

signal (ArrayLike) – Input signal.

Returns:

Array of length N // 2 + 1.

Return type:

ndarray[tuple[Any, …], dtype[float64]]

pyminidsp.phase_spectrum(signal)[source]

Compute the one-sided phase spectrum in radians.

Returns:

numpy array of length N/2 + 1 with phase in [-pi, pi].

Parameters:

signal (ArrayLike)

Return type:

ndarray[tuple[Any, …], dtype[float64]]

Compute the one-sided phase spectrum in radians.

Returns the instantaneous phase angle \(\phi(k) = \arg X(k)\) for each DFT bin, in the range \([-\pi, \pi]\).

Interpretation:

  • A pure cosine at an integer bin produces \(\phi = 0\).

  • A pure sine at the same bin produces \(\phi = -\pi/2\).

  • A time-delayed signal exhibits linear phase: \(\phi(k) = -2\pi k d / N\) where d is the delay in samples.

Note

Phase values at bins where the magnitude is near zero are numerically unreliable. Always examine magnitude_spectrum() alongside the phase spectrum to identify significant bins.

Parameters:

signal (ArrayLike) – Input signal.

Returns:

Array of length N // 2 + 1, values in \([-\pi, \pi]\).

Return type:

ndarray[tuple[Any, …], dtype[float64]]

pyminidsp.stft_num_frames(signal_len, n, hop)[source]

Compute the number of STFT frames.

Compute the number of STFT frames:

num_frames = (signal_len - n) // hop + 1   when signal_len >= n
num_frames = 0                              when signal_len < n

Use this to predict the shape of stft() output.

Parameters:
Return type:

int

pyminidsp.stft(signal, n, hop)[source]

Compute the Short-Time Fourier Transform (STFT).

Parameters:
  • signal (ArrayLike) – Input signal.

  • n (int) – FFT window size.

  • hop (int) – Hop size in samples.

Returns:

2D numpy array of shape (num_frames, n//2+1) containing magnitudes.

Return type:

ndarray[tuple[Any, …], dtype[float64]]

Compute the Short-Time Fourier Transform (STFT).

Slides a Hanning-windowed FFT over the signal in steps of hop samples, producing a time-frequency magnitude matrix. For each frame, the function applies a Hanning window, computes the FFT, and stores |X(k)| for bins 0..N//2.

The output is not normalised by N — divide each value by N for standard DFT magnitude.

Parameters:
  • signal (ArrayLike) – Input signal.

  • n (int) – FFT window size (must be >= 2).

  • hop (int) – Hop size in samples (must be >= 1).

Returns:

2D array of shape (num_frames, n // 2 + 1).

Return type:

ndarray[tuple[Any, …], dtype[float64]]

signal = md.sine_wave(16000, freq=440.0, sample_rate=16000.0)
spec = md.stft(signal, n=512, hop=128)
# spec.shape == (num_frames, 257)
# Convert bin k to Hz: freq_hz = k * sample_rate / n
# Convert frame f to seconds: time_s = f * hop / sample_rate
pyminidsp.mel_filterbank(n, sample_rate, num_mels=26, min_freq_hz=0.0, max_freq_hz=None)[source]

Build a mel-spaced triangular filterbank matrix.

Parameters:
  • n (int) – FFT size.

  • sample_rate (float) – Sampling rate in Hz.

  • num_mels (int) – Number of mel filters.

  • min_freq_hz (float) – Lower frequency bound.

  • max_freq_hz (float | None) – Upper frequency bound (defaults to sample_rate/2).

Returns:

2D numpy array of shape (num_mels, n//2+1).

Return type:

ndarray[tuple[Any, …], dtype[float64]]

Build a mel-spaced triangular filterbank matrix using the HTK mel mapping:

\[\text{mel}(f) = 2595 \cdot \log_{10}\!\left(1 + \frac{f}{700}\right)\]
Parameters:
  • n (int) – FFT size (must be >= 2).

  • sample_rate (float) – Sampling rate in Hz.

  • num_mels (int) – Number of mel filters.

  • min_freq_hz (float) – Lower frequency bound in Hz.

  • max_freq_hz (float | None) – Upper frequency bound (defaults to sample_rate / 2).

Returns:

2D array of shape (num_mels, n // 2 + 1).

Return type:

ndarray[tuple[Any, …], dtype[float64]]

pyminidsp.mel_energies(signal, sample_rate, num_mels=26, min_freq_hz=0.0, max_freq_hz=None)[source]

Compute mel-band energies from a single frame.

Returns:

numpy array of length num_mels.

Parameters:
  • signal (ArrayLike)

  • sample_rate (float)

  • num_mels (int)

  • min_freq_hz (float)

  • max_freq_hz (float | None)

Return type:

ndarray[tuple[Any, …], dtype[float64]]

Compute mel-band energies from a single frame.

Processing steps: (1) apply an internal Hann window, (2) compute one-sided PSD bins via FFT, (3) apply mel filterbank weights and sum per band.

Parameters:
  • signal (ArrayLike) – Input frame.

  • sample_rate (float) – Sampling rate in Hz.

  • num_mels (int) – Number of mel bands.

  • min_freq_hz (float) – Lower frequency bound.

  • max_freq_hz (float | None) – Upper frequency bound (defaults to sample_rate / 2).

Returns:

Array of length num_mels.

Return type:

ndarray[tuple[Any, …], dtype[float64]]

pyminidsp.mfcc(signal, sample_rate, num_mels=26, num_coeffs=13, min_freq_hz=0.0, max_freq_hz=None)[source]

Compute MFCCs from a single frame.

Parameters:
  • signal (ArrayLike) – Input frame.

  • sample_rate (float) – Sampling rate in Hz.

  • num_mels (int) – Number of mel bands.

  • num_coeffs (int) – Number of cepstral coefficients to output.

  • min_freq_hz (float) – Lower frequency bound.

  • max_freq_hz (float | None) – Upper frequency bound (defaults to sample_rate/2).

Returns:

numpy array of length num_coeffs.

Return type:

ndarray[tuple[Any, …], dtype[float64]]

Compute mel-frequency cepstral coefficients (MFCCs) from a single frame.

Conventions:

  • HTK mel mapping for filter placement.

  • Internal Hann windowing and one-sided PSD mel energies.

  • Natural-log compression: log(max(E_mel, 1e-12)).

  • DCT-II with HTK-C0 profile: c0 uses sqrt(1/M) normalisation, c_n (n > 0) uses sqrt(2/M), where M = num_mels.

  • Coefficient C0 is written to mfcc_out[0].

Parameters:
  • signal (ArrayLike) – Input frame.

  • sample_rate (float) – Sampling rate in Hz.

  • num_mels (int) – Number of mel bands.

  • num_coeffs (int) – Number of cepstral coefficients (must be in [1, num_mels]).

  • min_freq_hz (float) – Lower frequency bound.

  • max_freq_hz (float | None) – Upper frequency bound (defaults to sample_rate / 2).

Returns:

Array of length num_coeffs.

Return type:

ndarray[tuple[Any, …], dtype[float64]]

signal = md.sine_wave(512, freq=440.0, sample_rate=16000.0)
coeffs = md.mfcc(signal, sample_rate=16000.0, num_mels=26, num_coeffs=13)