
refactored and optimized segmentation metrics eval #395

Open · wants to merge 2 commits into base: main
Conversation

@Mintas Mintas commented Nov 26, 2024

Refactored and optimized segmentation metrics evaluation.

  1. Refactoring:
    The key purpose is to make some of the methods easier to extend in client libraries and to make the code more concise by reducing duplication, which should also make community support easier.

  2. Optimization:
    We note that code duplication caused some preparatory steps of the metric calculation to be recomputed. Reusing these common steps significantly improves performance: in our microbenchmark of segment.eval() we achieved a 1.58x speedup!
    This may be crucial in applications that require intensive metric calculation (such as hyperparameter optimization).

  3. Benchmarking details:
    We used symbolic music files (MIDIs) with known expert annotations of musical segments, 20 data points in total. We ran 300 iterations of the eval method over all of the data for better averaging.

Benchmarking results:
Before: total time 24.702, latency 0.0823, avg per data point 0.0041
After: total time 15.615, latency 0.0520, avg per data point 0.0026
Speedup: 1.582x (roughly 37% less time)
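For context, a harness along these lines would report the three quantities above (a minimal sketch under stated assumptions: `benchmark`, `eval_fn`, and the dataset are placeholders, not the exact benchmark code used in the PR):

```python
import time

def benchmark(eval_fn, dataset, iterations=300):
    """Run eval_fn over every (reference, estimate) pair in dataset,
    repeated `iterations` times, and report timing statistics."""
    start = time.perf_counter()
    for _ in range(iterations):
        for reference, estimate in dataset:
            eval_fn(reference, estimate)
    total = time.perf_counter() - start
    latency = total / iterations        # time per pass over the dataset
    per_point = latency / len(dataset)  # average time per data point
    return total, latency, per_point
```

With 20 data points and 300 iterations, this yields the same total-time, latency, and per-data-point figures as the results quoted above.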

@Mintas (Author) commented Dec 10, 2024

@bmcfee @craffel
This PR is getting stale, would you consider reviewing and merging it?
TY

@bmcfee (Collaborator) commented Dec 10, 2024

Sorry, I won't have time to review this for a few weeks.

@bmcfee (Collaborator) commented Jan 22, 2025

Sorry for the delay, I'm looking at this now. It's not the easiest PR to review because there's a mixture of cosmetic changes (i.e., variable renames) and refactoring to sift through. It also appears that tests are not passing.

                return default_return_value
            return func(reference_indices, estimated_indices, *args, **kwargs)
        return wrapper_default_on_empty
    return decorator_default_on_empty
Collaborator

This would probably be cleaner using the decorator module, as we do in util, e.g.:

mir_eval/mir_eval/util.py

Lines 946 to 962 in 935451d

def deprecated(*, version, version_removed):
    """Mark a function as deprecated.

    Using the decorated (old) function will result in a warning.
    """

    def __wrapper(func, *args, **kwargs):
        """Warn the user, and then proceed."""
        warnings.warn(
            f"{func.__module__}.{func.__name__}\n\tDeprecated as of mir_eval version {version}."
            f"\n\tIt will be removed in mir_eval version {version_removed}.",
            category=FutureWarning,
            stacklevel=3,  # Would be 2, but the decorator adds a level
        )
        return func(*args, **kwargs)

    return decorator(__wrapper)
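For illustration, the guard from the quoted hunk can also be written as a stdlib-only decorator factory like the sketch below (names follow the fragment above; this is not the PR's exact code, and with the `decorator` package the inner function would instead be passed to `decorator(...)` to preserve the wrapped signature, as in the `util.py` excerpt):

```python
import functools

def default_on_empty(default_return_value):
    """Return a fixed default instead of computing the metric when either
    input index array is empty (illustrative sketch)."""
    def decorator_default_on_empty(func):
        @functools.wraps(func)
        def wrapper_default_on_empty(reference_indices, estimated_indices,
                                     *args, **kwargs):
            if len(reference_indices) == 0 or len(estimated_indices) == 0:
                return default_return_value
            return func(reference_indices, estimated_indices, *args, **kwargs)
        return wrapper_default_on_empty
    return decorator_default_on_empty

@default_on_empty(0.0)
def dummy_metric(reference_indices, estimated_indices):
    # Stand-in for a real segmentation metric
    return 1.0
```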

Author

Good note, gotta update my solution!

@@ -458,58 +492,40 @@ def rand_index(
        length (in seconds) of frames for clustering
        (Default value = 0.1)
    beta : float > 0
        beta value for F-measure
        (Default value = 1.0)
        deprecated parameter - to be removed in 0.9 !!!
Collaborator

Why is this deprecated?

Author

It looks like this parameter is unused in the method implementation; I can't find a good reason to continue supporting it.

Collaborator

Hrm, indeed. I wonder how this ended up there in the first place?

I agree that it makes sense to deprecate this. However, I think we should take that to a separate PR, and add in a bit of machinery to support deprecations better. (See, e.g., https://github.com/librosa/librosa/blob/ebd878fabb87c588f98f372a94fca5d167c59cf2/librosa/util/deprecation.py#L10-L15 )
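The librosa pattern linked above boils down to a sentinel default value, roughly like this sketch (a hypothetical adaptation, assuming `beta` is the unused parameter being retired; not actual mir_eval code):

```python
import warnings

class Deprecated:
    """Sentinel marking a deprecated parameter default (after librosa's pattern)."""
    def __repr__(self):
        return "<DEPRECATED parameter>"

def rand_index(reference, estimate, beta=Deprecated()):
    # Warn only if the caller explicitly supplied the deprecated parameter;
    # callers relying on the default see no warning.
    if not isinstance(beta, Deprecated):
        warnings.warn(
            "The 'beta' parameter is unused and deprecated; "
            "it will be removed in a future release.",
            category=FutureWarning,
            stacklevel=2,
        )
    return 1.0  # stand-in for the actual metric computation
```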

    else:
        for i, m in enumerate(mapping):
            accumulator[m] = values[i]
    return accumulator
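For what it's worth, the loop in this fragment scatters `values` into `accumulator` at the positions given by `mapping`; assuming NumPy arrays (which the segment metrics already use), that is equivalent to a single fancy-indexing assignment:

```python
import numpy as np

values = np.array([0.5, 0.25, 0.25])
mapping = np.array([2, 0, 1])

# Loop form, as in the fragment above
accumulator = np.zeros(3)
for i, m in enumerate(mapping):
    accumulator[m] = values[i]

# Equivalent vectorized form
accumulator2 = np.zeros(3)
accumulator2[mapping] = values
```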
Collaborator

I don't understand what these functions do; it seems like an abstraction that may not be necessary, and it definitely does not help code readability.

@@ -1227,71 +1215,39 @@ def evaluate(ref_intervals, ref_labels, est_intervals, est_labels, **kwargs):
    # Now compute all the metrics
    scores = collections.OrderedDict()

    # Boundary detection
    # Metrics for Intervals without structure labels section:
    trim = kwargs.get('trim')
Collaborator

This is a little unclear, since by default it will give `trim is None` rather than `trim is False`. The behavior will be the same (at least until we add type annotations), but it would be better to explicitly populate a boolean.
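The suggested fix is a one-line change, sketched here on a stand-in function (hypothetical; the real `evaluate` forwards `trim` on to the boundary metrics rather than returning it):

```python
def evaluate(**kwargs):
    # Coerce explicitly so the default is the boolean False, not None
    trim = bool(kwargs.get("trim", False))
    return trim
```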
