Spectral Analysis¶
FFT-based frequency-domain analysis: magnitude spectrum, power spectral density, phase spectrum, STFT, mel filterbanks, and MFCCs.
- pyminidsp.lowpass_brickwall(signal, cutoff_hz, sample_rate)[source]¶
Apply an FFT-based ideal (brickwall) lowpass filter.
Zeroes all frequency bins above the cutoff frequency. Operates by copying the signal, applying the filter in-place, and returning the result.
- Parameters:
- Returns:
numpy array of the same length as the input.
- Return type:
Apply an FFT-based ideal (brickwall) lowpass filter. Zeroes all frequency bins above the cutoff frequency. Unlike FIR filters, a brickwall filter has a perfectly sharp transition — no passband ripple, no transition band — but can introduce ringing (Gibbs phenomenon).
- Parameters:
- Returns:
Filtered signal (same length as input).
- Return type:
signal = md.sine_wave(4096, freq=440.0, sample_rate=44100.0) filtered = md.lowpass_brickwall(signal, cutoff_hz=1000.0, sample_rate=44100.0)
- pyminidsp.magnitude_spectrum(signal)[source]¶
Compute the magnitude spectrum of a real-valued signal.
- Returns:
numpy array of length N/2 + 1 containing magnitudes.
- Parameters:
signal (ArrayLike)
- Return type:
Compute the magnitude spectrum of a real-valued signal.
Given a signal of length N, computes the FFT and returns
|X(k)|for each frequency bin. Because the input is real-valued, only the firstN/2 + 1bins are unique (from DC through Nyquist).To convert bin index k to a frequency in Hz:
freq_k = k * sample_rate / N
The output is not normalised by N. Divide each value by N for the standard DFT magnitude, or by
N/2(except DC and Nyquist) for single-sided amplitude.- Parameters:
signal (ArrayLike) – Input signal.
- Returns:
Array of length
N // 2 + 1.- Return type:
signal = md.sine_wave(1024, freq=440.0, sample_rate=44100.0) mag = md.magnitude_spectrum(signal) # mag[0] = DC component magnitude # mag[k] = magnitude at frequency k * 44100 / 1024 # mag[512] = Nyquist frequency magnitude
- pyminidsp.power_spectral_density(signal)[source]¶
Compute the power spectral density (PSD) of a signal.
- Returns:
numpy array of length N/2 + 1 containing power values.
- Parameters:
signal (ArrayLike)
- Return type:
Compute the power spectral density (PSD) — the periodogram estimator:
\[\text{PSD}[k] = \frac{|X(k)|^2}{N}\]While the magnitude spectrum tells you the amplitude at each frequency, the PSD tells you the power — useful for noise analysis, SNR estimation, and comparing signals of different lengths.
Parseval’s theorem (energy conservation):
PSD[0] + 2 * sum(PSD[1:N//2]) + PSD[N//2] == sum(x[n]**2)
- pyminidsp.phase_spectrum(signal)[source]¶
Compute the one-sided phase spectrum in radians.
- Returns:
numpy array of length N/2 + 1 with phase in [-pi, pi].
- Parameters:
signal (ArrayLike)
- Return type:
Compute the one-sided phase spectrum in radians.
Returns the instantaneous phase angle \(\phi(k) = \arg X(k)\) for each DFT bin, in the range \([-\pi, \pi]\).
Interpretation:
A pure cosine at an integer bin produces \(\phi = 0\).
A pure sine at the same bin produces \(\phi = -\pi/2\).
A time-delayed signal exhibits linear phase: \(\phi(k) = -2\pi k d / N\) where d is the delay in samples.
Note
Phase values at bins where the magnitude is near zero are numerically unreliable. Always examine
magnitude_spectrum()alongside the phase spectrum to identify significant bins.
- pyminidsp.stft_num_frames(signal_len, n, hop)[source]¶
Compute the number of STFT frames.
Compute the number of STFT frames:
num_frames = (signal_len - n) // hop + 1 when signal_len >= n num_frames = 0 when signal_len < n
Use this to predict the shape of
stft()output.
- pyminidsp.stft(signal, n, hop)[source]¶
Compute the Short-Time Fourier Transform (STFT).
- Parameters:
- Returns:
2D numpy array of shape (num_frames, n//2+1) containing magnitudes.
- Return type:
Compute the Short-Time Fourier Transform (STFT).
Slides a Hanning-windowed FFT over the signal in steps of hop samples, producing a time-frequency magnitude matrix. For each frame, the function applies a Hanning window, computes the FFT, and stores
|X(k)|for bins0..N//2.The output is not normalised by N — divide each value by N for standard DFT magnitude.
- Parameters:
- Returns:
2D array of shape
(num_frames, n // 2 + 1).- Return type:
signal = md.sine_wave(16000, freq=440.0, sample_rate=16000.0) spec = md.stft(signal, n=512, hop=128) # spec.shape == (num_frames, 257) # Convert bin k to Hz: freq_hz = k * sample_rate / n # Convert frame f to seconds: time_s = f * hop / sample_rate
- pyminidsp.mel_filterbank(n, sample_rate, num_mels=26, min_freq_hz=0.0, max_freq_hz=None)[source]¶
Build a mel-spaced triangular filterbank matrix.
- Parameters:
- Returns:
2D numpy array of shape (num_mels, n//2+1).
- Return type:
Build a mel-spaced triangular filterbank matrix using the HTK mel mapping:
\[\text{mel}(f) = 2595 \cdot \log_{10}\!\left(1 + \frac{f}{700}\right)\]- Parameters:
- Returns:
2D array of shape
(num_mels, n // 2 + 1).- Return type:
- pyminidsp.mel_energies(signal, sample_rate, num_mels=26, min_freq_hz=0.0, max_freq_hz=None)[source]¶
Compute mel-band energies from a single frame.
- Returns:
numpy array of length num_mels.
- Parameters:
- Return type:
Compute mel-band energies from a single frame.
Processing steps: (1) apply an internal Hann window, (2) compute one-sided PSD bins via FFT, (3) apply mel filterbank weights and sum per band.
- Parameters:
- Returns:
Array of length num_mels.
- Return type:
- pyminidsp.mfcc(signal, sample_rate, num_mels=26, num_coeffs=13, min_freq_hz=0.0, max_freq_hz=None)[source]¶
Compute MFCCs from a single frame.
- Parameters:
signal (ArrayLike) – Input frame.
sample_rate (float) – Sampling rate in Hz.
num_mels (int) – Number of mel bands.
num_coeffs (int) – Number of cepstral coefficients to output.
min_freq_hz (float) – Lower frequency bound.
max_freq_hz (float | None) – Upper frequency bound (defaults to sample_rate/2).
- Returns:
numpy array of length num_coeffs.
- Return type:
Compute mel-frequency cepstral coefficients (MFCCs) from a single frame.
Conventions:
HTK mel mapping for filter placement.
Internal Hann windowing and one-sided PSD mel energies.
Natural-log compression:
log(max(E_mel, 1e-12)).DCT-II with HTK-C0 profile:
c0usessqrt(1/M)normalisation,c_n(n > 0) usessqrt(2/M), where M = num_mels.Coefficient C0 is written to
mfcc_out[0].
- Parameters:
signal (ArrayLike) – Input frame.
sample_rate (float) – Sampling rate in Hz.
num_mels (int) – Number of mel bands.
num_coeffs (int) – Number of cepstral coefficients (must be in
[1, num_mels]).min_freq_hz (float) – Lower frequency bound.
max_freq_hz (float | None) – Upper frequency bound (defaults to
sample_rate / 2).
- Returns:
Array of length num_coeffs.
- Return type:
signal = md.sine_wave(512, freq=440.0, sample_rate=16000.0) coeffs = md.mfcc(signal, sample_rate=16000.0, num_mels=26, num_coeffs=13)