
WIP: BSSEval v4 #272

Closed
wants to merge 39 commits into from

Conversation

@faroit commented Feb 2, 2018

Okay, here is some news regarding the source separation evaluation model:
@aliutkus and I made a spontaneous rewrite of the bsseval method to fix some shortcomings in the principle of bsseval (see #270 and #271).

The BSSEval metrics, as implemented in the MATLAB toolboxes and re-implemented here in mir_eval, are widely used in the audio separation literature. One particularity of BSSEval is that it computes the metrics after optimally matching the estimates to the true sources through linear distortion filters. This arguably makes the criteria robust to some linear mismatches. Apart from the optional computation of all possible permutations of the sources, this matching accounts for most of the computational cost of BSSEval, especially since it is done for each evaluation window when the metrics are computed framewise.
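To make that cost concrete, here is a rough single-source numpy sketch of the matching step (the actual implementation projects onto delayed copies of all sources jointly and solves the system with FFT-based methods, so this is only an illustration of the idea, not the mir_eval code):

```python
import numpy as np

def sdr_after_projection(reference, estimate, filter_len=512):
    # Hypothetical toy helper, not the mir_eval implementation: fit a linear
    # distortion filter h (least squares) so that "reference convolved with h"
    # best explains the estimate, then compare explained vs. leftover energy.
    n = len(estimate)
    delayed = np.zeros((n, filter_len))
    for tau in range(filter_len):
        delayed[tau:, tau] = reference[:n - tau]   # reference delayed by tau samples
    h, *_ = np.linalg.lstsq(delayed, estimate, rcond=None)
    s_target = delayed @ h                         # part explained by the reference
    e_residual = estimate - s_target               # interference + artifacts
    return 10 * np.log10(np.sum(s_target ** 2) / np.sum(e_residual ** 2))

rng = np.random.default_rng(0)
ref = rng.standard_normal(8192)
est = 0.8 * ref + 0.1 * rng.standard_normal(8192)
print(sdr_after_projection(ref, est))   # high SDR: the estimate is mostly the reference
```

In v3-style framewise evaluation this least-squares fit is redone for every window; in v4 it is done once for the full-length signals.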

In this year's SiSEC, we decided to drop the assumption that the distortion filters can vary over time and instead consider them fixed for the whole length of the track. First, this significantly reduces the computational cost of evaluation, because the matching needs to be done only once for the whole signal. Second, it introduces much more dynamics into the evaluation, because time-varying matching filters turn out to over-estimate performance. Third, it makes the matching more robust, because the true sources are not silent over the whole recording, whereas they often are within short windows.

Technically, we are offering a new evaluation function bss_eval() which covers all previous functionality (images/sources, framewise/global) and additionally allows using the new v4 mode. We maintained compatibility with the old functions by wrapping bss_eval(). The main benefit is that v4 is significantly faster: depending on the frame size, it can be up to 50 % faster.
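To give an idea of what we have in mind, a call could look roughly like this (argument and return names are a paraphrase of the proposal, based on the flags discussed further down, and may not match the final code):

```python
import numpy as np
import mir_eval  # the branch from this PR; released mir_eval has no bss_eval()

rng = np.random.default_rng(0)
reference_sources = rng.standard_normal((2, 10 * 44100))   # (nsrc, nsamples)
estimated_sources = reference_sources + 0.1 * rng.standard_normal((2, 10 * 44100))

# Hypothetical signature, subject to change during review:
sdr, isr, sir, sar, perm = mir_eval.separation.bss_eval(
    reference_sources, estimated_sources,
    window=44100, hop=44100,        # metrics still reported per one-second frame
    framewise_filters=False,        # v4: one distortion filter for the whole track
    bsseval_sources_version=False,  # images-style metrics rather than bss_eval_sources
)
```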

Long story short: we touched most of the code but remain 100% backwards compatible (the regression tests are passing), and we are now considering how to get this code out as soon as possible. The reason is that we want to use it in the evaluation kit we hand out to the SiSEC participants, possibly as early as next week.

Currently, we don't know whether it would be a good idea to move the new bsseval version directly into mir_eval. Here are my concerns:

  1. we cannot predict how often it will be used outside of SiSEC (even though this was often the case before).
  2. the code certainly needs to pass a bunch of code reviews very soon, @craffel, @bmcfee?

So maybe the best option is to have bsseval v4 in a separate Python package first and then merge it into mir_eval later this year. @craffel, are you okay with releasing a stripped-down version of mir_eval (all modules removed except separation) under a different package name (probably bsseval) on PyPI?

@craffel (Collaborator) commented Feb 2, 2018

Thanks for all your hard work. Provided the regression tests still pass, I am open to giving this some sort of fast-track, but I personally won't be able to do any kind of review until after Feb 9. It looks like this is still "WIP", though: I see commented-out code, early return statements, print statements, etc. in the tests. In principle I would be most happy to have there be a single reference implementation of bss_eval in Python, instead of having two separate versions (one in mir_eval, one outside).

@craffel, are you okay with releasing a stripped-down version of mir_eval (all modules removed except separation) on PyPI?

I'm not sure what you mean by this or why it's helpful.

@faroit (Author) commented Feb 2, 2018

In principle I would be most happy to have there be a single reference implementation of bss_eval in Python, instead of having two separate versions (one in mir_eval, one outside).

I agree, but for most other implementations in mir_eval there already were reference implementations outside of mir_eval. So this would be something shiny and new here ;-)

@craffel, are you okay with releasing a stripped-down version of mir_eval (all modules removed except separation) on PyPI?

I'm not sure what you mean by this or why it's helpful.

We would like to use bsseval v4 in the SiSEC evaluation as part of the Python musdb parser. And we are running out of time a bit, so we basically have to use the WIP version for the evaluation campaign. I would therefore either copy the separation code into the parser or make a separate bsseval package. Having participants install our fork directly from GitHub does not seem like a great option. Do you think there are better options?

@craffel (Collaborator) commented Feb 2, 2018

I would therefore either copy the separation code into the parser or make a separate bsseval package. Having participants install our fork directly from GitHub does not seem like a great option. Do you think there are better options?

I support you doing whatever is most convenient, but I do think it would be awkward if we ended up with two pip packages -- bss_eval_v4 and mir_eval, which contains bss_eval_v4. pip can install directly from a GitHub repo/fork; maybe that's a decent option?
http://codeinthehole.com/tips/using-pip-and-requirementstxt-to-install-from-the-head-of-a-github-branch/
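For example (the fork URL and branch name here are just placeholders to show the pattern from that article):

```
pip install git+https://github.com/<your-fork>/mir_eval.git@<branch>

# or pinned in requirements.txt:
git+https://github.com/<your-fork>/mir_eval.git@<branch>#egg=mir_eval
```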

@carlthome (Contributor) commented:
So maybe the best option is to have bsseval v4 in a separate Python package first and then merge it into mir_eval later this year. @craffel, are you okay with releasing a stripped-down version of mir_eval (all modules removed except separation) under a different package name (probably bsseval) on PyPI?

I personally don't mind installing mir_eval for running separation metrics as the full package is lightweight in terms of size and dependencies, and it's nice to not fragment the community too much.

@faroit (Author) commented Feb 6, 2018

@craffel okay, @aliutkus and I decided for now to copy bsseval v4 into our musdb18 evaluation package to reduce confusion for users (installing a mir_eval fork from GitHub is still a bit fragile for users without virtual environments). Before starting the actual code review, it would be great if we could agree on the API changes and function naming, so that we can easily switch to the new version of mir_eval in our evaluation tool as soon as this PR is merged and a new release is issued:

  • do you like the idea of the new bss_eval() core function and the wrapper functions [bss_eval_sources..., bss_eval_images...] that keep mir_eval backwards compatible? (see the sketch after this list)
  • do the new flags framewise_filters and bsseval_sources_version make sense to you?
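As a sketch of the layering behind these two questions (hypothetical names and signatures; the actual PR code may differ), the old entry points would become thin wrappers around the single core:

```python
import numpy as np

def bss_eval(reference_sources, estimated_sources,
             window=np.inf, hop=np.inf,
             framewise_filters=False, bsseval_sources_version=False):
    # single core implementation; v3/v4 behaviour is selected via the flags
    ...

def bss_eval_sources_framewise(reference_sources, estimated_sources,
                               window=30 * 44100, hop=15 * 44100):
    # old API, old behaviour: per-window distortion filters, "sources" variant
    return bss_eval(reference_sources, estimated_sources,
                    window=window, hop=hop,
                    framewise_filters=True,
                    bsseval_sources_version=True)
```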

@craffel (Collaborator) commented Feb 12, 2018

do you like the idea of the new bss_eval() core function and the wrapper functions [bss_eval_sources..., bss_eval_images... ] that keep mir_eval backwards compatible?

Seems reasonable to me. As I said, as long as things are backwards compatible and there is community consensus that this way is "correct", I am OK.

do the new flags framewise_filters and bsseval_sources_version make sense to you?

Yes. In some cases in mir_eval, when a previous evaluation criterion is generally recognized as wrong, we go for the "right" way and carefully document why. I am open to considering this one of those cases.

I would be interested to hear what the reaction of the larger SISEC community is to this.

@faroit (Author) commented Feb 12, 2018

I would be interested to hear what the reaction of the larger SISEC community is to this.

Okay, then I would suggest waiting for the LVA/ICA conference in the summer. By then we will have good feedback from the community. Also, we are in contact with Emmanuel Vincent, so there is a chance that bss_eval v4 will 'officially' replace the older version and be added to the bsseval website.

Until then v4 will live in https://github.com/sigsep/sigsep-mus-eval/blob/master/museval/metrics.py

I would suggest closing the PR for now; I will reopen it when we are ready.

@craffel (Collaborator) commented Feb 12, 2018

SG!
