Wavelet discussion, including normalization #165
@elybrand - since you've just been looking at this: the current wavelets have a couple of possible normalizations, and at some point we ended up with a discrepancy between the docs and the code. Without trying to dig out all the history of the code, I wondered if you had any quick thoughts on this, and on whether there is any reason normalizing by the amplitudes should be divided by 2 (which we don't currently do, but the docs used to say we did).
So I have a couple of remarks. First, I just realized there is no way the user can specify the normalization method when they call compute_wavelet_transform. It just defaults to 'sss'! Second, I don't immediately see why you would need to divide by 2. My understanding of the two normalizations is as follows:

- L2 normalization ('sss'): divide the wavelet by the square root of the sum of its squared magnitudes, so every wavelet has unit energy.
- L1 normalization ('amp'): divide the wavelet by the sum of its absolute values, so equal-amplitude oscillations at different frequencies come out with matched magnitudes.

```python
import numpy as np
import matplotlib.pyplot as plt

from neurodsp import timefrequency as tf
from neurodsp.utils import create_times
from neurodsp.sim import sim_combined

# Set the random seed, for consistency when simulating data
np.random.seed(0)

# General settings for the simulations
fs = 1000
n_seconds = 5

# Generate a times vector, for plotting
times = create_times(n_seconds, fs)

# Set up simulations for two signals, each an oscillation + noise
components = {'sim_powerlaw' : {'exponent' : 0},
              'sim_oscillation' : {'freq' : 5}}
components2 = {'sim_powerlaw' : {'exponent' : 0},
               'sim_oscillation' : {'freq' : 10}}
variances = [0, 1]

# Simulate our signal, with oscillations at 5 and 10 Hz
sig = sim_combined(n_seconds, fs, components, variances) + \
      sim_combined(n_seconds, fs, components2, variances)

fig, axes = plt.subplots(2, 1, figsize=(8, 7))

# L2 normalized wavelets. Notice that the TF plot at frequency 10 is lower than that at 5.
wt = tf.compute_wavelet_transform(sig, fs, [4, 11, 1], norm='sss')
axes[0].pcolormesh(times, np.arange(4, 12), np.abs(wt.T), cmap='viridis', shading='gouraud')
axes[0].set_title("L2 Normalized Wavelet Transform")

# L1 normalized wavelets. Now the two frequencies in the TF plot are roughly the same.
wt2 = tf.compute_wavelet_transform(sig, fs, [4, 11, 1], norm='amp')
axes[1].pcolormesh(times, np.arange(4, 12), np.abs(wt2.T), cmap='viridis', shading='gouraud')
axes[1].set_title("L1 Normalized Wavelet Transform")
```
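To make the two normalizations fully concrete, here is a minimal sketch of what 'sss' and 'amp' do to a single wavelet, assuming 'sss' divides by the L2 norm and 'amp' by the L1 norm. The Morlet construction below is hand-rolled for illustration, and not neurodsp's exact kernel:

```python
import numpy as np

# Hand-rolled complex Morlet wavelet (illustrative parameters)
fs, freq, n_cycles = 1000, 10, 7
dur = n_cycles / freq
t = np.arange(-dur / 2, dur / 2, 1 / fs)
sigma = n_cycles / (2 * np.pi * freq)  # Gaussian envelope width, in seconds
wavelet = np.exp(2j * np.pi * freq * t) * np.exp(-t**2 / (2 * sigma**2))

# 'sss': divide by the square root of the sum of squares (unit L2 norm)
wavelet_sss = wavelet / np.sqrt(np.sum(np.abs(wavelet)**2))

# 'amp': divide by the sum of absolute values (unit L1 norm)
wavelet_amp = wavelet / np.sum(np.abs(wavelet))

# After 'amp' normalization the L1 norm is 1 at every scale, which is why
# equal-amplitude oscillations at different frequencies come out matched
assert np.isclose(np.sum(np.abs(wavelet_amp)), 1)
```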
For findability, and for attaching this to the codebase, I'm copying in some notes from @elybrand:
And replies / notes by @rdgao:
The reply back from @elybrand is that the FIR interpretation of the DWT is reasonable: mathematicians use the intuition of thinking of the DWT as bandpass filters. Given our use case, the implementation should broadly be okay.
Thanks a lot to @elybrand for the detailed analysis and discussion here. I'm going to try to summarize, and see if we can work out if & what the ToDo items are. Notes & questions:
Perhaps relevant context: I would say it's much more common to use something like a wavelet time-frequency analysis to examine changes across time, within frequencies. I don't think explicit comparison of absolute power between frequencies is common, so while we want to represent these estimations as appropriately as possible, I can't say I'm totally worried about this exact quantitative comparison (and in real data, the whole 1/f thing gets in the way here anyways).
@TomDonoghue This is a great summary. For now, I agree it is best to not add functionality for setting this.

I do want to add a semantic point which adds to the discussion on normalization. Technically speaking, what is implemented here is a discretized continuous wavelet transform, not a discrete wavelet transform.

Discrete Wavelet Transform (DWT)

Discrete wavelet transforms discretize the scale and location parameters of the wavelets in a pre-determined way. They use dyadic scales (powers of 2) for the scale, and the translations of the wavelets are always integer multiples of the scale. There are certain benefits to this: namely, the DWT preserves energy and minimizes redundancy of the "information content" in the wavelet coefficients. That's because the wavelets it uses are orthogonal. This is a pretty stringent requirement, and not every continuous wavelet admits a DWT. Morlet wavelets, for example, don't admit a DWT. The downside to DWTs is that your discretization of time-frequency space is pretty coarse.

Discretized Continuous Wavelet Transform (dCWT)

On the other hand, a discretized continuous wavelet transform allows you to choose an arbitrary way of discretizing the time-frequency plane. The only thing that's discretized is the signal itself. You let the user take advantage of this when you let them feed in an array of frequencies. As I mentioned above, dCWTs don't preserve energy. Consequently, I believe the standard practice is to use norm='amp' so that all frequencies/scales are normalized equally.
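As a usage-level illustration of that flexibility, here is a hedged sketch of feeding an arbitrary, non-dyadic frequency grid to the dCWT (assuming norm is exposed as a keyword, as in the demo above; the signal here is just a stand-in):

```python
import numpy as np
from neurodsp.timefrequency import compute_wavelet_transform

fs = 1000
sig = np.random.randn(5 * fs)  # stand-in signal; any 1d array works

# Any frequency grid is fine for a dCWT -- only the signal is discretized
freqs = np.array([4.0, 5.5, 6.25, 9.0, 10.0])  # arbitrary, non-dyadic
mwt = compute_wavelet_transform(sig, fs, freqs, norm='amp')
```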
Okay, cool, thanks for the info! Looking into and figuring out wavelets has always been on my ToDo list, so this is a super helpful primer! It seems to me the ToDo items then are to:
So I actually had it in my head that we've been doing a DWT this whole time, with predetermined scales and preserving signal energy, but I just realized that makes no sense if we allow arbitrary specification of frequencies. herp derp berp. Is it accurate to say that the DWT is a subset of the dCWT with more stringent parameter configurations, such that one could use the dCWT with dyadic frequencies to get a DWT?
Definitely.
Not quite. The discrete wavelet transform goes beyond discretizing which frequencies you look at: it also discretizes the translations of the wavelets. The current implementation of compute_wavelet_transform computes a coefficient at every sample, so it keeps every translation.

I find the wikipedia article on the wavelet transform does a nice job of representing graphically what a DWT does. You can see that a DWT tiles the time-frequency plane in such a way that the tiles do not overlap but still cover the space. That's what I mean when I say a DWT "minimizes redundancy".

When I first learned about wavelets, I started with the discrete Haar wavelet transform. There, it's very obvious what's going on with the translations and dilations. Thinking about it in the language of vectors, a DWT is constructing an orthonormal basis of wavelets. The DWT then is just taking dot products of the signal with fixed wavelets that have been dilated and shifted to particular locations. A dCWT guarantees neither orthogonality nor that the wavelets form a basis.
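Since the Haar case makes the translations and dilations so concrete, here is a minimal hand-rolled sketch of one level of a Haar DWT (haar_dwt_level is illustrative, not a library function): the translations step by the scale, the basis is orthonormal, and energy is preserved exactly:

```python
import numpy as np

def haar_dwt_level(sig):
    """One level of the Haar DWT: translations step by the scale (2 samples)."""
    sig = np.asarray(sig, dtype=float)
    approx = (sig[0::2] + sig[1::2]) / np.sqrt(2)  # low-pass: pairwise averages
    detail = (sig[0::2] - sig[1::2]) / np.sqrt(2)  # high-pass: pairwise differences
    return approx, detail

sig = np.array([4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0])
approx, detail = haar_dwt_level(sig)

# Orthonormality means energy is preserved:
# ||sig||^2 == ||approx||^2 + ||detail||^2
assert np.isclose(np.sum(sig**2), np.sum(approx**2) + np.sum(detail**2))
```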
Right, my understanding is that the DWT uses a strictly orthogonal basis, but you have to do something funny with the sampling at each scale as well?
Precisely!
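On that "funny sampling": in a full DWT, each level filters and then downsamples by 2, so coarser scales get half as many translations. A quick sketch using PyWavelets (a separate library, used here purely for illustration):

```python
import numpy as np
import pywt  # PyWavelets

sig = np.random.randn(1024)

# 3-level Haar DWT; 'periodization' keeps the transform exactly orthonormal
coeffs = pywt.wavedec(sig, 'haar', mode='periodization', level=3)

# Coefficient counts halve at each coarser scale: [cA3, cD3, cD2, cD1]
print([len(c) for c in coeffs])  # [128, 128, 256, 512]

# And, being orthonormal, this DWT preserves energy
total = sum(np.sum(c**2) for c in coeffs)
assert np.isclose(total, np.sum(sig**2))
```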
Issue for discussion of our wavelets.
Normalization
For wavelets, right now we have two normalizations available:
- 'sss' : divide by the square root of the sum of squares
- 'amp' : divide by the sum of amplitudes
And these are implemented as written in the code. The issue is that, somewhere through the history, the docs and code fell out of sync: the docs for 'amp' used to say that this was all divided by two, even though the code did not do this.
What is somewhat unclear (to me, at least) is what the best wavelet normalizations are. Maybe it should be divided by two? So if anyone knows the secrets of wavelets, please throw out some info :).
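One hedged way to probe the divide-by-two question empirically, rather than from theory: feed each normalization a unit-amplitude sinusoid and see what magnitude comes back. The sketch below assumes the current API with norm passed through; the printed values, not the sketch itself, would indicate whether a factor of 2 belongs anywhere:

```python
import numpy as np
from neurodsp.sim import sim_oscillation
from neurodsp.timefrequency import compute_wavelet_transform

fs, freq = 1000, 10
sig = sim_oscillation(5, fs, freq)  # unit-amplitude sinusoid, 5 seconds

for norm in ('sss', 'amp'):
    mwt = compute_wavelet_transform(sig, fs, np.array([freq]), norm=norm)
    # Peak magnitude at the probe frequency, away from edge artifacts
    resp = np.abs(mwt).squeeze()[fs:-fs].max()
    print(norm, resp)
```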