resample_multipitch improvement #336

cwitkowitz · 2021-02-13T22:26:00Z

Using the current functionality, if I try to resample the following pitch list,

which has no empty pitch observations (taken directly from a JAMS file), I get the following:

With the changes, specifying a tolerance of 50 ms, I get the following:

A good rule of thumb would be setting the tolerance (if you want to use it) to twice the minimum timing difference in your pitch list.

…ly) fills large gap with empty pitch estimates

craffel · 2021-02-15T15:38:29Z

@rabitt can you take a look?

rabitt · 2021-02-15T16:25:56Z

@cwitkowitz good catch on this bug! I think the bug boils down to a false assumption the function makes - that the time series is uniformly spaced / time stamps with no pitches are reported with an empty frequency list.

@craffel meta-question - do we want to force multipitch time series to be uniform like we do for melody? @cwitkowitz's solution is more general and would allow non-uniform time scales, but the properties of the metrics are really different if we do/don't report silent frames.

craffel · 2021-02-15T16:59:34Z

The high-level rule is to try to stick to external conventions, i.e. if in all past datasets/papers/competitions the multipitch time series are uniform, then we can assume that. But if there is existing convention that time series can be non-uniform, we should support that. Either way, we should document the convention. Does that make sense in this case?

rabitt · 2021-02-15T17:03:42Z

The high-level rule is to try to stick to external conventions, i.e. if in all past datasets/papers/competitions the multipitch time series are uniform, then we can assume that. But if there is existing convention that time series can be non-uniform, we should support that. Either way, we should document the convention. Does that make sense in this case?

I've never seen non-uniform time series for multipitch. and the mirex data from when I implemented this were uniform. Paper-wise, it's one of those details we all tend to leave out. All that said, I'd lean strongly towards uniform.

craffel · 2021-02-15T17:17:14Z

Ok, in that case, if the code is broken for non-uniform timescales, we should throw an exception when a non-uniform timescale is passed in.

cwitkowitz · 2021-02-15T17:19:44Z

Here is some information on my use-case. I am using GuitarSet, which has note and pitch annotations in JAMS. When evaluating MF0E performance, I would like to use the pitch annotations, as I believe they are more correct than converting the notes into discrete pitch activity during discrete times.

The pitch annotations are only specified for positive samples (i.e. times where a pitch is active). If I want to use that data with mir_eval, it needs to be converted into uniform sampling first. I thought this was a good place to put that conversion, however, I am open to other suggestions. Maybe the conversion to uniform is appropriate as its own function or somewhere in the JAMS repo.

craffel · 2021-02-15T17:21:13Z

We could add a separate utility function that converts non-uniform time series to uniform.

cwitkowitz · 2021-02-16T14:58:38Z

@rabitt / @craffel How does the following sound for a solution? Scrap what I have so far and replace it with a warning similar to that in the melody resampling function. This warns the user if the multipitch times are not uniform.

Then, I can create a utility function for making time-series uniform. In this function, I can estimate the true spacing by ignoring outliers and taking the mean of the differentials (something to that effect). Then I can re-project the original data onto the new collection of times, which would start at 0 and be spaced by the estimated spacing.

Still, this would not always handle sporadic data very well, e.g. if my positive samples are not spaced uniformly or have no consistent sampling in the first place. However, in the case of JAMS data, where pitch contours should be aligned with some sampling period, I believe it would work just fine.

It might also make sense to put the utility function in the JAMS repo, since JAMS pitch_contour seems to be the only case where this has ever been encountered.

craffel · 2021-02-16T15:13:56Z

This seems like a reasonable solution to me at a high level.

cwitkowitz · 2021-02-16T16:06:39Z

This seems like a reasonable solution to me at a high level.

Great. I can get started on this then.

rabitt · 2021-02-18T18:26:56Z

@cwitkowitz If you haven't gotten too far - I have a old script that does exactly this (filling in missing time stamps) if you want to use it as a starting point.

cwitkowitz · 2021-02-20T14:52:32Z

@rabitt - The script you linked would certainly work, however it requires that the sampling rate and hop length are known. In my case these values are not explicitly defined.

I went a little down the path of implementing what I mentioned above, i.e. estimating the hop length by taking the mean of first-order differentials wherever the second-order differential was close to zero. I wasn't sure what to do at that point. I can step from times[0] to times[-1] using this estimated hop length, but it doesn't necessarily line up perfectly, and it seems we would need 1d interpolation again. There is also no knowledge of the duration, so we can fill in empties from t=0 until times[0], but we cannot fill in empties from times[-1] to t=duration.

Any thoughts?

rabitt · 2021-02-22T14:06:02Z

@cwitkowitz

I went a little down the path of implementing what I mentioned above, i.e. estimating the hop length by taking the mean of first-order differentials wherever the second-order differential was close to zero. I wasn't sure what to do at that point. I can step from times[0] to times[-1] using this estimated hop length, but it doesn't necessarily line up perfectly, and it seems we would need 1d interpolation again.
There is also no knowledge of the duration, so we can fill in empties from t=0 until times[0], but we cannot fill in empties from times[-1] to t=duration.

Maybe it's safer to have hop_length and duration as arguments to the script - presumably that information should be part of the datasets the data comes from, and then there's no room for error. And maybe you could create a separate helper function called estimate_hop_length that returns a number that the user can use/not?

… and converting a time series to uniform.

cwitkowitz · 2021-02-23T18:20:31Z

@rabitt - how does this look? I can add test cases once we are happy with the changes.

rabitt

Looks great! One small comment on the hop size estimation

rabitt · 2021-03-01T11:08:00Z

mir_eval/util.py

+    if not np.sum(non_gaps):
+        raise ValueError("Time observations are too irregular.")
+
+    return np.diff(times)[non_gaps].mean()


I think .median() would be more stable, since in theory the spacing should be uniform. Then one or two outliers won't skew the hop size, which will over time propagate the error

cwitkowitz · 2021-03-02T17:00:57Z

Good suggestion @rabitt

I made that change and added a few other small changes necessary to get the test cases working.

Here a a few final considerations before completing the PR:

NumPy diff() function was not always giving me exact values, which is why I needed the A_TOL comparison rather than strict equality.
The time_series_to_uniform() function works only for observations specified by a list of ndarrays. This is fine for my purposes, but I am not sure if we would like the function to be more general for mir_eval as a whole.
In my test cases, I needed to compare frequencies, so I used repeat code from test_multipitch.py. I did not see any current examples where code within tests/ was being reused, so I wasn't sure if we wanted to start doing that or just ignore it for now.

…on was provided, but realized that in this case a hop length cannot be estimated and thus it does not make sense to support this case in time_series_to_uniform().

cwitkowitz · 2022-06-28T08:30:26Z

@rabitt - I recently came back to this and ran into an error where sometimes the final (original) observation, while still occurring within the specified duration, extends in time beyond what we represent in the new times without including an extra frame that goes over the specified duration. I am still a little bit unsure about the best way to handle this issue. In order to reproduce the error, time_series_to_uniform() can simply be ran with the JAMS data from source 3 of the track 01_BN1-129-Eb_solo in GuitarSet.

My confusion led me to check the mirdata package, to see what was being done there for GuitarSet (see my use-case in the above comment). I then realized that nearly the exact same functionality introduced with this PR in time_series_to_uniform() is supported in _fill_pitch_contour() of the GuitarSet wrapper in the mirdata package.

I also noticed that in the load_pitch_contour() function of the wrapper, the duration provided in the JAMS data is not being passed into _fill_pitch_contour(), which means that potentially 1-2 frames of original observations could be discarded due to the check if time > max_time or t_idx >= n_stamps: in _fill_pitch_contour(). If I am grasping this correctly, one frame may get discarded because of the issue I described above, and another frame may get discarded because the new times won't extend past the final time in the original observations (as opposed to the true duration of the file, which may be larger than the maximum time).

Coming back to this with a fresh look also made me begin to question if we really have the best solution here (in this PR and in mirdata). We are essentially just shifting the times of all of the observations without modifying the values in any way. Wouldn't it be more correct to do some sort of interpolation? I guess that is the point of the resample_multipitch() function in mir_eval, which we can't immediately use here due to the non-uniform time series in the JAMS data (the whole point of this PR).

Sorry for the really long and detailed comment. I needed to write all of this out to organize my own thoughts 😁.

Since some of this discussion involves mirdata, would it be better to bring any of this up in that repository? Also, would it make more sense to support this functionality within the JAMS package, so that it can be used in multiple different contexts?

cwitkowitz · 2022-07-11T09:07:14Z

I think the best solution here is simply to add an extra frame in all cases, even if this means the nearest frame on the uniform time grid for the last observation may occur slightly after the specified duration.

Added a layer of functionality to resample_multipitch which (optional…

cb87b08

…ly) fills large gap with empty pitch estimates

cwitkowitz added 2 commits February 22, 2021 17:57

Removed previous solution and added a means for estimating hop length…

7df90c3

… and converting a time series to uniform.

Corrected warning message

00b3fb2

rabitt approved these changes Mar 1, 2021

View reviewed changes

Small fixes and test cases

d0d3bf3

Originally intended this to work with an empty time array if a durati…

057bd76

…on was provided, but realized that in this case a hop length cannot be estimated and thus it does not make sense to support this case in time_series_to_uniform().

Always include an extra frame.

ce4ecb4

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

resample_multipitch improvement #336

resample_multipitch improvement #336

cwitkowitz commented Feb 13, 2021

craffel commented Feb 15, 2021

rabitt commented Feb 15, 2021

craffel commented Feb 15, 2021

rabitt commented Feb 15, 2021

craffel commented Feb 15, 2021

cwitkowitz commented Feb 15, 2021

craffel commented Feb 15, 2021

cwitkowitz commented Feb 16, 2021

craffel commented Feb 16, 2021

cwitkowitz commented Feb 16, 2021

rabitt commented Feb 18, 2021

cwitkowitz commented Feb 20, 2021

rabitt commented Feb 22, 2021

cwitkowitz commented Feb 23, 2021

rabitt left a comment

rabitt Mar 1, 2021

cwitkowitz commented Mar 2, 2021 •

edited

Loading

cwitkowitz commented Jun 28, 2022

cwitkowitz commented Jul 11, 2022

resample_multipitch improvement #336

Are you sure you want to change the base?

resample_multipitch improvement #336

Conversation

cwitkowitz commented Feb 13, 2021

craffel commented Feb 15, 2021

rabitt commented Feb 15, 2021

craffel commented Feb 15, 2021

rabitt commented Feb 15, 2021

craffel commented Feb 15, 2021

cwitkowitz commented Feb 15, 2021

craffel commented Feb 15, 2021

cwitkowitz commented Feb 16, 2021

craffel commented Feb 16, 2021

cwitkowitz commented Feb 16, 2021

rabitt commented Feb 18, 2021

cwitkowitz commented Feb 20, 2021

rabitt commented Feb 22, 2021

cwitkowitz commented Feb 23, 2021

rabitt left a comment

Choose a reason for hiding this comment

rabitt Mar 1, 2021

Choose a reason for hiding this comment

cwitkowitz commented Mar 2, 2021 • edited Loading

cwitkowitz commented Jun 28, 2022

cwitkowitz commented Jul 11, 2022

cwitkowitz commented Mar 2, 2021 •

edited

Loading