2022/03/15

Python: Short-Time Fourier Transform (STFT) and its inverse transform (ISTFT) using librosa

The example below uses the STFT to transform a speech signal into the frequency domain and the ISTFT to transform the spectral data back into a time-domain waveform. The librosa package is used.

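import librosa

import soundfile as sf

# Load the audio; sr=None keeps the file's native sampling rate
x, sr = librosa.load('1.wav', sr=None)


# Frame parameters: 256-sample frames with 50% overlap
n_fft = 256

hop_length = 128

win_length = 256


# Forward transform: X is a complex matrix (frequency bins x frames)
X = librosa.stft(x, n_fft=n_fft, hop_length=hop_length, win_length=win_length)


# Inverse transform back to a time-domain waveform
y = librosa.istft(X, n_fft=n_fft, hop_length=hop_length, win_length=win_length)


# Write with the sampling rate returned by librosa.load instead of a hard-coded 16000
sf.write('1_istft.wav', y, sr)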

Results for x and y:
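The agreement between x and y can also be checked numerically. The sketch below is one possible way to do that with NumPy alone; the helper name `reconstruction_report` and the synthetic test tone are assumptions made for illustration and are not part of librosa. It trims both signals to a common length first, because the ISTFT output may differ in length from the input by a few samples.

```python
import numpy as np

def reconstruction_report(x, y):
    """Return (max absolute error, SNR in dB) between x and its reconstruction y."""
    # Compare only the part that both signals have in common.
    n = min(len(x), len(y))
    x = np.asarray(x[:n], dtype=np.float64)
    y = np.asarray(y[:n], dtype=np.float64)
    err = x - y
    max_abs = float(np.max(np.abs(err)))
    noise = float(np.sum(err ** 2))
    signal = float(np.sum(x ** 2))
    # Identical signals have zero error energy, reported as infinite SNR.
    snr_db = float("inf") if noise == 0.0 else float(10.0 * np.log10(signal / noise))
    return max_abs, snr_db

# Stand-in data (hypothetical): a 440 Hz tone and a slightly perturbed, shorter copy.
sr = 16000
t = np.arange(sr) / sr
x_demo = 0.5 * np.sin(2 * np.pi * 440 * t)
y_demo = x_demo[:-64] + 1e-6

max_abs, snr_db = reconstruction_report(x_demo, y_demo)
print(f"max |x - y| = {max_abs:.2e}, SNR = {snr_db:.1f} dB")
```

With the real data, the call would be `reconstruction_report(x, y)` on the arrays from the example above.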



References:

Matlab: Short-Time Fourier Transform (STFT) and its inverse transform (ISTFT) (StudyEECC)

librosa.stft

librosa.istft

2022/03/10

Matlab: Short-Time Fourier Transform (STFT) and its inverse transform (ISTFT)

In real-time sound processing, it is not possible to wait for the complete sound file, so the signal has to be processed in short frames as the samples arrive. This is why short-time processing is important.
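To make the idea of processing frames as they arrive concrete, here is a minimal NumPy sketch in Python (not MATLAB). It is an illustration under stated assumptions, not the implementation behind any library function: the class name `StreamingSTFT`, the 256-sample frame, the periodic Hann window and the 50% overlap are all choices made for this example. Each call takes one hop of new samples and returns one hop of output, so the only delay is a single hop.

```python
import numpy as np

class StreamingSTFT:
    """Frame-by-frame STFT/ISTFT with 50% overlap (illustrative sketch)."""

    def __init__(self, frame_len=256):
        self.frame_len = frame_len
        self.hop = frame_len // 2
        n = np.arange(frame_len)
        # Periodic Hann window: copies shifted by half a frame sum to 1.
        self.window = 0.5 - 0.5 * np.cos(2.0 * np.pi * n / frame_len)
        self.in_buf = np.zeros(frame_len)   # most recent frame_len input samples
        self.tail = np.zeros(self.hop)      # second half of the previous output frame

    def process(self, block, modify=None):
        """Consume one hop of new samples and return one hop of output samples."""
        block = np.asarray(block, dtype=np.float64)
        if block.shape != (self.hop,):
            raise ValueError(f"expected a block of {self.hop} samples")
        # Slide the input buffer forward by one hop.
        self.in_buf = np.concatenate((self.in_buf[self.hop:], block))
        spectrum = np.fft.rfft(self.in_buf * self.window)
        if modify is not None:
            spectrum = modify(spectrum)     # hook for frequency-domain processing
        frame = np.fft.irfft(spectrum, n=self.frame_len)
        # Overlap-add: complete the previous frame, keep the rest for the next call.
        out = self.tail + frame[:self.hop]
        self.tail = frame[self.hop:].copy()
        return out

# Simulate a stream by feeding a test signal one 128-sample block at a time.
rng = np.random.default_rng(0)
signal = rng.standard_normal(128 * 20)
stream = StreamingSTFT(frame_len=256)
output = np.concatenate([stream.process(b) for b in signal.reshape(-1, 128)])

# With no modification, the output is the input delayed by one hop.
delay_error = np.max(np.abs(output[128:] - signal[:-128]))
print(delay_error)
```

The `modify` argument is where a real application would put its spectral processing, for example noise suppression. Modifying the spectrum can change the reconstruction properties, and that is outside the scope of this sketch.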

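Here is an example that uses the STFT to transform speech into the frequency domain and the ISTFT to transform the spectral data back into a time-domain waveform. Here x is the speech signal, fs is its sampling rate, and win2 is a 256-sample analysis window (for example hann(256,'periodic')); with an overlap of 128 samples, consecutive frames overlap by 50%.


% S: STFT matrix, F: frequency vector, T: time instants of the frames
[S,F,T] = stft(x,fs,'Window',win2,'OverlapLength',128,'FFTLength',256);

% y: reconstructed signal, ti: its time vector
[y,ti] = istft(S,fs,'Window',win2,'OverlapLength',128,'FFTLength',256);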
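Whether the round trip reproduces the input generally depends on the window and the overlap being compatible: overlapped copies of the window should add up to a constant (the constant overlap-add, or COLA, property). The following Python/NumPy sketch checks this for a 256-sample window with an overlap of 128. The function name `overlap_add_ripple` is hypothetical, and the two Hann variants are chosen here only as examples, since the window used in the call above is not specified.

```python
import numpy as np

def overlap_add_ripple(window, hop, n_frames=8):
    """Max minus min of the sum of hop-shifted window copies (0 means constant).

    Assumes the window length is a multiple of the hop size.
    """
    window = np.asarray(window, dtype=np.float64)
    n = len(window)
    total = np.zeros(hop * (n_frames - 1) + n)
    for k in range(n_frames):
        total[k * hop:k * hop + n] += window
    # Look only at the region where the full number of frames overlap.
    steady = total[n - hop:hop * n_frames]
    return float(steady.max() - steady.min())

N, overlap = 256, 128
hop = N - overlap

periodic_hann = 0.5 - 0.5 * np.cos(2.0 * np.pi * np.arange(N) / N)
symmetric_hann = np.hanning(N)

print("periodic Hann ripple: ", overlap_add_ripple(periodic_hann, hop))
print("symmetric Hann ripple:", overlap_add_ripple(symmetric_hann, hop))
```

At 50% overlap the periodic Hann window sums to a constant up to rounding, while the symmetric variant leaves a small ripple.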

References:

stft - Short-time Fourier transform (MathWorks)

istft - Inverse short-time Fourier transform (MathWorks)

2022/03/09

Matlab: Discrete Cosine Transform (DCT) for speech processing

The DCT converts a speech signal into a real-valued transform domain, and the inverse DCT converts the coefficients back to the speech waveform.

Example with Matlab:

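% Load the sample speech signal mtlb

load mtlb

x = mtlb;

% Forward DCT: real-valued coefficients
X = dct(x);

% Inverse DCT: back to the time-domain waveform
y = idct(X);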

Results:


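The values of x and y look the same, but MATLAB considers them different. The round trip through dct and idct introduces small floating-point rounding errors, and isequal tests for exact equality.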

>> isequal(x,y)

ans =

  logical

   0
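The same effect can be reproduced outside MATLAB. The Python/NumPy sketch below builds an orthonormal DCT-II matrix directly from its definition (the helper `dct_matrix` is written for this example and is not a library function), applies it to a random stand-in signal, and compares the round trip both exactly and within a tolerance. The tolerance-based comparison is usually the appropriate one for floating-point results.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix C, so that X = C @ x and x = C.T @ X."""
    k = np.arange(n)[:, None]       # frequency index (rows)
    i = np.arange(n)[None, :]       # time index (columns)
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    c[0, :] /= np.sqrt(2.0)         # rescale the DC row so that C is orthonormal
    return c

rng = np.random.default_rng(0)
x = rng.standard_normal(512)        # stand-in for a speech segment
C = dct_matrix(len(x))
X = C @ x                           # forward DCT
y = C.T @ X                         # inverse DCT

max_err = np.max(np.abs(x - y))
print(np.array_equal(x, y))         # exact comparison, like isequal
print(np.allclose(x, y))            # comparison within a tolerance
print(max_err)                      # size of the rounding error
```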

For real-time processing, the complete signal is not available in advance, so the DCT has to be applied to short frames (a short-time DCT).
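A minimal sketch of a short-time DCT in Python/NumPy is given below. It makes several simplifying assumptions: frames of 256 samples, no overlap, no windowing, and zero-padding of the last frame. The function names are hypothetical. Practical systems often use overlapping, windowed frames instead.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix C, so that X = C @ x and x = C.T @ X."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    c[0, :] /= np.sqrt(2.0)
    return c

def short_time_dct(x, frame_len=256):
    """Split x into non-overlapping frames and return one row of DCT coefficients per frame."""
    x = np.asarray(x, dtype=np.float64)
    n_frames = -(-len(x) // frame_len)          # ceiling division
    padded = np.zeros(n_frames * frame_len)     # zero-pad the last frame
    padded[:len(x)] = x
    frames = padded.reshape(n_frames, frame_len)
    return frames @ dct_matrix(frame_len).T     # row r holds the DCT of frame r

def inverse_short_time_dct(coeffs, length):
    """Invert short_time_dct and trim the zero-padding."""
    frame_len = coeffs.shape[1]
    frames = coeffs @ dct_matrix(frame_len)     # inverse DCT of every row
    return frames.reshape(-1)[:length]

rng = np.random.default_rng(0)
x = rng.standard_normal(1000)                   # stand-in for a speech signal
coeffs = short_time_dct(x, frame_len=256)
y = inverse_short_time_dct(coeffs, len(x))

print(coeffs.shape)
print(np.allclose(x, y))
```

Because each frame is transformed independently, a frame can be processed as soon as its 256 samples have arrived.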

References:

Discrete Cosine Transform (Wikipedia)

dct (MathWorks)