inverse transform from logscale to linear scale stft #7
Comments
I need to double-check first. Unfortunately, it seems the log-frequency scale is not mathematically invertible. Only the STFT with the original, linearly spaced frequency bins (k) is invertible; the moment you change the spacing of the frequency bins, the Fourier basis vectors are no longer orthogonal. That is why I think it would not be invertible.
Thank you for the discussion! By invertible, I meant going from STFT to STFT, across frequency scales. I had in mind some kind of transpose (as when using a mel filter bank), possibly with some approximation. But I am not an expert on that either. I'm interested to hear about anything you find on it!
It is not invertible, but the pseudo-inverse of the forward transform matrix is the way to go. See the back-propagable pseudo-inverse here.
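To make the pseudo-inverse suggestion concrete, here is a minimal sketch. It uses NumPy's `np.linalg.pinv` for brevity (`torch.pinverse` / `torch.linalg.pinv` play the same role in PyTorch). The matrix `W` and its sizes are made up for illustration and are not taken from nnAudio.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical forward transform: 8 output bands computed from 32 linear bins.
n_bands, n_bins = 8, 32
W = rng.random((n_bands, n_bins))

# Moore-Penrose pseudo-inverse, shape (n_bins, n_bands).
W_pinv = np.linalg.pinv(W)

x = rng.random(n_bins)   # one "linear-scale" frame
y = W @ x                # forward transform (lossy: 32 values -> 8 values)
x_hat = W_pinv @ y       # least-squares, minimum-norm estimate of x

# x_hat maps to the same bands as x, but x itself is not recovered.
reprojection_ok = np.allclose(W @ x_hat, y)
exact_recovery = np.allclose(x_hat, x)
print(reprojection_ok, exact_recovery)
```

The pseudo-inverse gives the estimate that is consistent with the observed bands, not the original frame, which matches the "not invertible" point above.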
Thank you for pointing out the torch.pinverse operator, which I didn't know! It seems straightforward for inverting the mel-spectrogram. However, since nnAudio computes the STFT through 1D convolution kernels, I am not sure whether that applies as well to inverting the log scale to the linear scale. Or would it mean computing the log scale through a matrix multiplication, similar to mels, and using the pseudo-inverse of that matrix?
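The second option in the comment above (a log-frequency filter bank applied as a matrix multiplication, then its pseudo-inverse) could be sketched roughly as follows. This is not how nnAudio computes its log-frequency STFT; `log_filterbank`, the triangular filter shape, and all sizes are assumptions made purely for illustration. It operates on magnitudes only.

```python
import numpy as np

def log_filterbank(n_bins, n_bands, f_min_bin=4.0):
    """Hypothetical triangular filters with log-spaced centres over linear bins."""
    # n_bands + 2 log-spaced edges between f_min_bin and the last linear bin.
    edges = np.geomspace(f_min_bin, n_bins - 1, n_bands + 2)
    k = np.arange(n_bins)[None, :]                  # linear bin index, (1, n_bins)
    lo, mid, hi = edges[:-2, None], edges[1:-1, None], edges[2:, None]
    rising = (k - lo) / (mid - lo)                  # left slope of each triangle
    falling = (hi - k) / (hi - mid)                 # right slope of each triangle
    return np.clip(np.minimum(rising, falling), 0.0, None)  # (n_bands, n_bins)

n_bins, n_bands, n_frames = 257, 24, 10
W = log_filterbank(n_bins, n_bands)
W_pinv = np.linalg.pinv(W)                          # (n_bins, n_bands)

rng = np.random.default_rng(0)
lin_mag = rng.random((n_bins, n_frames))            # stand-in for |STFT| frames

log_mag = W @ lin_mag                               # linear -> log-frequency
lin_hat = W_pinv @ log_mag                          # approximate way back

# The estimate is consistent with the log-frequency data it came from.
consistent = np.allclose(W @ lin_hat, log_mag)
print(log_mag.shape, lin_hat.shape, consistent)
```

Note that `lin_hat` can contain negative values, so in practice one would probably clip it before treating it as a magnitude.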
Ah, actually it's a bit different than for the mel-spectrogram, you're right. You can find an implementation of this in asteroid, where the pseudo-inverse can also be computed on the fly at each forward pass if you want learnable filters.
Hi!
Your repo is a pretty awesome find; I am especially interested in using the STFT in log frequency.
I was already doing mel operations myself using torch.stft and librosa filterbanks, but the more options there are to experiment with, the better.
May I ask, is there any way to transform an STFT computed on a log frequency scale back to a linear frequency scale, please?
The use case I have in mind is converting some waveforms into log-frequency spectrograms, filtering them, and then converting back to linear frequency so that the inverse STFT can take them back to the time domain.
Thanks!