Spectral Analysis¶
FFT-based frequency-domain analysis: magnitude spectrum, power spectral density, phase spectrum, STFT, mel filterbanks, and MFCCs.
- pyminidsp.magnitude_spectrum(signal)[source]¶
Compute the magnitude spectrum of a real-valued signal.
- Returns:
numpy array of length N/2 + 1 containing |X(k)|.
Compute the magnitude spectrum of a real-valued signal.
Given a signal of length N, computes the FFT and returns
|X(k)|for each frequency bin. Because the input is real-valued, only the firstN/2 + 1bins are unique (from DC through Nyquist).To convert bin index k to a frequency in Hz:
freq_k = k * sample_rate / N
The output is not normalised by N. Divide each value by N for the standard DFT magnitude, or by
N/2(except DC and Nyquist) for single-sided amplitude.- Parameters:
signal – Input signal.
- Returns:
Array of length
N // 2 + 1.
signal = md.sine_wave(1024, freq=440.0, sample_rate=44100.0) mag = md.magnitude_spectrum(signal) # mag[0] = DC component magnitude # mag[k] = magnitude at frequency k * 44100 / 1024 # mag[512] = Nyquist frequency magnitude
- pyminidsp.power_spectral_density(signal)[source]¶
Compute the power spectral density (PSD) of a signal.
- Returns:
numpy array of length N/2 + 1 containing |X(k)|^2 / N.
Compute the power spectral density (PSD) — the periodogram estimator:
\[\text{PSD}[k] = \frac{|X(k)|^2}{N}\]While the magnitude spectrum tells you the amplitude at each frequency, the PSD tells you the power — useful for noise analysis, SNR estimation, and comparing signals of different lengths.
Parseval’s theorem (energy conservation):
PSD[0] + 2 * sum(PSD[1:N//2]) + PSD[N//2] == sum(x[n]**2)- Parameters:
signal – Input signal.
- Returns:
Array of length
N // 2 + 1.
- pyminidsp.phase_spectrum(signal)[source]¶
Compute the one-sided phase spectrum in radians.
- Returns:
numpy array of length N/2 + 1 with phase in [-pi, pi].
Compute the one-sided phase spectrum in radians.
Returns the instantaneous phase angle \(\phi(k) = \arg X(k)\) for each DFT bin, in the range \([-\pi, \pi]\).
Interpretation:
A pure cosine at an integer bin produces \(\phi = 0\).
A pure sine at the same bin produces \(\phi = -\pi/2\).
A time-delayed signal exhibits linear phase: \(\phi(k) = -2\pi k d / N\) where d is the delay in samples.
Note
Phase values at bins where the magnitude is near zero are numerically unreliable. Always examine
magnitude_spectrum()alongside the phase spectrum to identify significant bins.- Parameters:
signal – Input signal.
- Returns:
Array of length
N // 2 + 1, values in \([-\pi, \pi]\).
- pyminidsp.stft_num_frames(signal_len, n, hop)[source]¶
Compute the number of STFT frames.
Compute the number of STFT frames:
num_frames = (signal_len - n) // hop + 1 when signal_len >= n num_frames = 0 when signal_len < n
Use this to predict the shape of
stft()output.
- pyminidsp.stft(signal, n, hop)[source]¶
Compute the Short-Time Fourier Transform (STFT).
- Parameters:
signal – Input signal.
n – FFT window size.
hop – Hop size in samples.
- Returns:
2D numpy array of shape (num_frames, n//2+1) containing magnitudes.
Compute the Short-Time Fourier Transform (STFT).
Slides a Hanning-windowed FFT over the signal in steps of hop samples, producing a time-frequency magnitude matrix. For each frame, the function applies a Hanning window, computes the FFT, and stores
|X(k)|for bins0..N//2.The output is not normalised by N — divide each value by N for standard DFT magnitude.
- Parameters:
signal – Input signal.
n – FFT window size (must be >= 2).
hop – Hop size in samples (must be >= 1).
- Returns:
2D array of shape
(num_frames, n // 2 + 1).
signal = md.sine_wave(16000, freq=440.0, sample_rate=16000.0) spec = md.stft(signal, n=512, hop=128) # spec.shape == (num_frames, 257) # Convert bin k to Hz: freq_hz = k * sample_rate / n # Convert frame f to seconds: time_s = f * hop / sample_rate
- pyminidsp.mel_filterbank(n, sample_rate, num_mels=26, min_freq_hz=0.0, max_freq_hz=None)[source]¶
Build a mel-spaced triangular filterbank matrix.
- Parameters:
n – FFT size.
sample_rate – Sampling rate in Hz.
num_mels – Number of mel filters.
min_freq_hz – Lower frequency bound.
max_freq_hz – Upper frequency bound (defaults to sample_rate/2).
- Returns:
2D numpy array of shape (num_mels, n//2+1).
Build a mel-spaced triangular filterbank matrix using the HTK mel mapping:
\[\text{mel}(f) = 2595 \cdot \log_{10}\!\left(1 + \frac{f}{700}\right)\]- Parameters:
n – FFT size (must be >= 2).
sample_rate – Sampling rate in Hz.
num_mels – Number of mel filters.
min_freq_hz – Lower frequency bound in Hz.
max_freq_hz – Upper frequency bound (defaults to
sample_rate / 2).
- Returns:
2D array of shape
(num_mels, n // 2 + 1).
- pyminidsp.mel_energies(signal, sample_rate, num_mels=26, min_freq_hz=0.0, max_freq_hz=None)[source]¶
Compute mel-band energies from a single frame.
- Returns:
numpy array of length num_mels.
Compute mel-band energies from a single frame.
Processing steps: (1) apply an internal Hann window, (2) compute one-sided PSD bins via FFT, (3) apply mel filterbank weights and sum per band.
- Parameters:
signal – Input frame.
sample_rate – Sampling rate in Hz.
num_mels – Number of mel bands.
min_freq_hz – Lower frequency bound.
max_freq_hz – Upper frequency bound (defaults to
sample_rate / 2).
- Returns:
Array of length num_mels.
- pyminidsp.mfcc(signal, sample_rate, num_mels=26, num_coeffs=13, min_freq_hz=0.0, max_freq_hz=None)[source]¶
Compute MFCCs from a single frame.
- Parameters:
signal – Input frame.
sample_rate – Sampling rate in Hz.
num_mels – Number of mel bands.
num_coeffs – Number of cepstral coefficients to output.
min_freq_hz – Lower frequency bound.
max_freq_hz – Upper frequency bound (defaults to sample_rate/2).
- Returns:
numpy array of length num_coeffs.
Compute mel-frequency cepstral coefficients (MFCCs) from a single frame.
Conventions:
HTK mel mapping for filter placement.
Internal Hann windowing and one-sided PSD mel energies.
Natural-log compression:
log(max(E_mel, 1e-12)).DCT-II with HTK-C0 profile:
c0usessqrt(1/M)normalisation,c_n(n > 0) usessqrt(2/M), where M = num_mels.Coefficient C0 is written to
mfcc_out[0].
- Parameters:
signal – Input frame.
sample_rate – Sampling rate in Hz.
num_mels – Number of mel bands.
num_coeffs – Number of cepstral coefficients (must be in
[1, num_mels]).min_freq_hz – Lower frequency bound.
max_freq_hz – Upper frequency bound (defaults to
sample_rate / 2).
- Returns:
Array of length num_coeffs.
signal = md.sine_wave(512, freq=440.0, sample_rate=16000.0) coeffs = md.mfcc(signal, sample_rate=16000.0, num_mels=26, num_coeffs=13)