1. score_pairs refactor #333

mvargas33 · 2021-10-01T15:37:52Z

To make metric-learn compatible with both similarity learners such as OASIS and pseudo-distance Mahalanobis learners (that is, all current learners), this PR takes a first step: it deprecates score_pairs with a FutureWarning and replaces it with pair_score and pair_distance, as discussed in #329.
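For illustration only, the deprecation described above could look roughly like the sketch below. `ToyDistanceLearner` is a hypothetical stand-in that uses a plain Euclidean distance, not the library's actual class, and the warning text is an assumption; only the method names `score_pairs`, `pair_score` and `pair_distance` come from this PR.

```python
import math
import warnings


class ToyDistanceLearner:
    """Hypothetical stand-in for a distance learner (identity metric)."""

    def pair_distance(self, pairs):
        # One distance per pair: the result is a flat list, not a matrix.
        return [math.dist(a, b) for a, b in pairs]

    def pair_score(self, pairs):
        # For a distance learner the score is the opposite of the distance,
        # so a larger score always means a more similar pair.
        return [-d for d in self.pair_distance(pairs)]

    def score_pairs(self, pairs):
        # Deprecated entry point: warn, then keep the old behaviour
        # (assumed here to be the distance).
        warnings.warn(
            "score_pairs is deprecated; use pair_distance or pair_score.",
            FutureWarning,
        )
        return self.pair_distance(pairs)
```

Callers of the old method keep working but see a FutureWarning, which gives them time to switch to one of the two new methods.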

After hesitating between the names pairwise_similarity and pair_similarity, I think "pairwise" suggests to the user a similarity between all pairs, so a matrix would be expected as output rather than a list, just like sklearn's pairwise_distances.
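A small sketch of the naming distinction, using only the standard library. The two helper functions are hypothetical and exist only to contrast the output shapes: a "pairwise" function returns an n-by-n matrix over all points, while a "pair" function returns one value per given pair.

```python
import math

points = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]


def pairwise_distances_matrix(points):
    # "Pairwise": every point against every other point, n x n output.
    return [[math.dist(a, b) for b in points] for a in points]


def pair_distances_list(pairs):
    # "Pair": one distance per explicitly given pair, flat output.
    return [math.dist(a, b) for a, b in pairs]


matrix = pairwise_distances_matrix(points)
pairs = [(points[0], points[1]), (points[0], points[2])]
per_pair = pair_distances_list(pairs)

print(len(matrix), len(matrix[0]))  # 3 3
print(per_pair)                     # [5.0, 10.0]
```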

Right now it does not make much sense to have pair_distances and pair_similarity only for Mahalanobis learners, because pair_similarity is just the opposite of the distance. However, this PR is needed to proceed with #329 and then with #330.

As discussed before, this change makes the code easy to extend to new types of learners. For instance, a StringEdit learner could fit into the library seamlessly by extending BaseLearner, regardless of whether the algorithm learns a similarity or a pseudo-distance.

It also does not affect Classifiers that much, as they can work with pair_distance from now on.

In contrast to this proposal, using isinstance() for each kind of learner inside the current score_pairs does not look like a good solution, in my humble opinion.
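To make the contrast with isinstance() concrete, here is a minimal sketch under stated assumptions. All class names are hypothetical and are not the library's classes; the dot-product learner merely stands in for any similarity learner. The point is that calling code relies on `pair_score` alone and never needs to know which kind of learner it holds.

```python
import math
from abc import ABC, abstractmethod


class BaseLearnerSketch(ABC):
    """Hypothetical base class: every learner exposes pair_score."""

    @abstractmethod
    def pair_score(self, pairs):
        """Return one score per pair; larger means more similar."""


class DistanceLearnerSketch(BaseLearnerSketch):
    def pair_distance(self, pairs):
        return [math.dist(a, b) for a, b in pairs]

    def pair_score(self, pairs):
        # Opposite of the distance, so larger still means more similar.
        return [-d for d in self.pair_distance(pairs)]


class DotProductLearnerSketch(BaseLearnerSketch):
    def pair_score(self, pairs):
        # A similarity that is not derived from any distance.
        return [sum(x * y for x, y in zip(a, b)) for a, b in pairs]


def most_similar_pair(learner, pairs):
    # No isinstance() checks: the shared interface is enough.
    scores = learner.pair_score(pairs)
    return pairs[scores.index(max(scores))]
```

Adding a new kind of learner then only requires implementing `pair_score`, without touching code such as `most_similar_pair`.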

PS: If the discussion goes in favor, I'll check all current tests one by one to make sure nothing breaks.


mvargas33 · 2021-10-06T12:39:13Z

The general functionality is ready. What is left is to write better documentation. I'm on it.

bellet · 2021-10-12T14:23:04Z

@mvargas33 is this ready to review?

mvargas33 · 2021-10-12T14:48:59Z

@mvargas33 is this ready to review?

Yes, it is! The code and the docs are ready.

perimosocordiae

I haven't done a full review yet, but overall I'm +1 for the idea.

metric_learn/base_metric.py

bellet

Here's a full review!

In general, please run spellcheck to avoid too many typos :-)

doc/introduction.rst

doc/supervised.rst

test/test_pairs_classifiers.py

bellet · 2021-10-18T09:33:16Z

test/test_sklearn_compat.py

@@ -148,7 +148,7 @@ def test_array_like_inputs(estimator, build_dataset, with_preprocessor):
  pairs = np.array([[X[0], X[1]], [X[0], X[2]]])
  pairs_variants, _ = generate_array_like(pairs)
  for pairs_variant in pairs_variants:
-    estimator.score_pairs(pairs_variant)
+    estimator.pair_distance(pairs_variant)


Not all metric learners will have pair_distance implemented, so use pair_score instead. Perhaps you can test pair_distance only when it is implemented.

It is now tested with pair_score, but it also checks that pair_distance can be used.

If a learner does not have pair_distance, the exception must state that it cannot be used with that specific learner.

But as this PR does not cover any non-Mahalanobis learner, the exception message is left empty as a to-do for #329.

bellet · 2021-10-18T09:34:04Z

test/test_utils.py

@@ -834,8 +834,8 @@ def test_error_message_tuple_size(estimator, _):

 @pytest.mark.parametrize('estimator, _', metric_learners,
                         ids=ids_metric_learners)
-def test_error_message_t_score_pairs(estimator, _):
-  """tests that if you want to score_pairs on triplets for instance, it returns
+def test_error_message_t_pair_distance(estimator, _):


same as before: pair_distance will not always be implemented

Generalized with a try/except. If there is an exception, it can only be because pair_distance is not implemented. I did this in test_utils.py and test_sklearn_compat.py.
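The pattern discussed in these two threads can be sketched as follows. This is not the code from the PR: the learner classes are hypothetical, and both the exception type (`NotImplementedError`) and its message are assumptions made for the example, since the PR leaves the actual message as a to-do for #329.

```python
import math


class DistanceLearnerSketch:
    def pair_distance(self, pairs):
        return [math.dist(a, b) for a, b in pairs]

    def pair_score(self, pairs):
        return [-d for d in self.pair_distance(pairs)]


class SimilarityOnlyLearnerSketch:
    def pair_score(self, pairs):
        return [sum(x * y for x, y in zip(a, b)) for a, b in pairs]

    def pair_distance(self, pairs):
        # Assumed behaviour for a learner that does not learn a distance.
        raise NotImplementedError(
            "This learner does not learn a distance; use pair_score."
        )


def check_pair_methods(learner, pairs):
    """Return True if pair_distance is usable, False if not implemented."""
    # pair_score must work for every learner.
    assert len(learner.pair_score(pairs)) == len(pairs)
    try:
        distances = learner.pair_distance(pairs)
    except NotImplementedError as exc:
        # The only accepted failure: the learner says it has no distance.
        assert "pair_score" in str(exc)
        return False
    assert len(distances) == len(pairs)
    return True
```

Any other exception propagates and fails the check, so an unrelated bug in pair_distance is not silently accepted.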

test/test_utils.py

bellet

Thanks @mvargas33, still some small things to fix but overall looks good now!
@perimosocordiae do you want to take a pass before we merge?

doc/supervised.rst

doc/weakly_supervised.rst

bellet · 2021-10-19T15:16:43Z

metric_learn/base_metric.py

+    Returns the similarity score between pairs of points. Depending on the
+    algorithm, this method can return the learned similarity score between
+    pairs, or the opposite of the distance learned between pairs. The larger
+    the score, the more similar the pair. All learners have access to this


nitpick: this is a bit heavy. I would recommend simply:
"Returns the similarity score between pairs of points (the larger the score, the more similar the pair). For metric learners that learn a distance, the score is simply the opposite of the distance between pairs."

Changed, and kept the last sentence: "All learners have access to this method".

bellet · 2021-10-19T15:17:53Z

metric_learn/base_metric.py

-      1D arrays and cannot use a preprocessor. Besides, the returned function
-      is independent of the metric learner and hence is  not modified if the
-      metric learner is.
+      two points. The difference between `pair_score` and `pair_distance` is


my bad, this is in the "see also" of pair_score, so it should only be "The difference with pair_score"

This is actually the entry for the old score_pairs, so I changed it to "The difference with score_pairs".

The "See Also" sections of pair_score and pair_distance are OK.

metric_learn/base_metric.py

test/test_mahalanobis_mixin.py

metric_learn/base_metric.py

bellet · 2021-10-19T15:48:08Z

test/test_pairs_classifiers.py

 def test_raise_not_fitted_error_if_not_fitted(estimator, build_dataset,
                                              with_preprocessor):
  """Test that a NotFittedError is raised if someone tries to use
-  score_pairs, decision_function, get_metric, transform or
+  pair_score, score_pairs, decision_function, get_metric, transform or
  get_mahalanobis_matrix on input data and the metric learner
  has not been fitted."""
  input_data, labels, preprocessor, _ = build_dataset(with_preprocessor)
  estimator = clone(estimator)
  estimator.set_params(preprocessor=preprocessor)
  set_random_state(estimator)
-  with pytest.raises(NotFittedError):
+  with pytest.raises(NotFittedError):  # Remove in 0.8.0
    estimator.score_pairs(input_data)
+  with pytest.raises(NotFittedError):
+    estimator.pair_score(input_data)
  with pytest.raises(NotFittedError):
    estimator.decision_function(input_data)
  with pytest.raises(NotFittedError):


Something to keep in mind for #329: some of these "raise_not_fitted" tests should actually be moved to the base metric / Mahalanobis mixin / bilinear mixin tests. Indeed, some of these methods will be present for all metric learners, while others will exist only for Mahalanobis learners and not for bilinear ones (e.g., transform). The tests for pairs_classifiers (and other classifiers) should only cover methods specific to the associated class.

This was raised some time ago in #160 and #136. The time to refactor the test structure is finally here. I plan to open another PR dedicated to this.

bellet

LGTM, thanks @mvargas33!
Let's wait a bit to see if others have some time to review before we merge.

test/test_base_metric.py

test/test_sklearn_compat.py

bellet · 2021-10-21T12:05:35Z

Merged, thanks @mvargas33!

mvargas33 and others added 2 commits October 4, 2021 17:42

Remove 3.9 from compatibility

c47797c

First draft of refactoring BaseMetricLearner and Mahalanobis Learner

e07b11a

mvargas33 force-pushed the score-deprecation branch from 2cf4075 to e07b11a Compare October 4, 2021 15:43

mvargas33 added 4 commits October 6, 2021 12:07

Avoid warning related to score_pairs deprecation in tests of pair_calibraiton

8210acd

Minor fix

11b5df6

Replaced score_pairs with pair_distance in tests

06b7131

Replace score_pairs with pair_distance inb docs.

d5cb8b4

mvargas33 mentioned this pull request Oct 6, 2021

2. [MRG] Bilinear similarity #329

Open

mvargas33 added 6 commits October 8, 2021 11:27

Fix weird commit

2f61e7b

Update classifiers to use pair_similarity

5f68ed2

Updated rst docs

3d6450b

Fix identation

7bce493

Update docs of score_pairs, get_metric

7e6584a

Add deprecation Test. Fix identation

0b58f45

mvargas33 changed the title ~~[WIP] score_pairs refactor~~ score_pairs refactor Oct 11, 2021

Merge branch 'master' into score-deprecation

d4d3a9c

mvargas33 changed the title ~~score_pairs refactor~~ 1. score_pairs refactor Oct 13, 2021

perimosocordiae requested changes Oct 14, 2021


metric_learn/base_metric.py (outdated, resolved)

metric_learn/base_metric.py (outdated, resolved)

mvargas33 mentioned this pull request Oct 14, 2021

[MRG][DOC] Fixes almost all warnings in the docs #338

Merged

bellet requested changes Oct 18, 2021


mvargas33 added 5 commits October 19, 2021 11:40

Merge remote-tracking branch 'upstream/master' into score-deprecation

60c88a6

Fixed changes requested 1

8c55970

Fixed changes requested 2

787a8d1

Add equivalence test, p_dist == p_score

e14f956

Fix tests and identation.

0941a32

mvargas33 requested a review from bellet October 19, 2021 14:44

bellet requested changes Oct 19, 2021


mvargas33 requested a review from bellet October 20, 2021 09:26

Fixed changes requested 3

b019d85

mvargas33 requested a review from perimosocordiae October 20, 2021 09:27

Fix identation

74df897

bellet approved these changes Oct 20, 2021


perimosocordiae approved these changes Oct 20, 2021


test/test_base_metric.py (outdated, resolved)

test/test_sklearn_compat.py (outdated, resolved)

mvargas33 added 2 commits October 21, 2021 10:41

Last requested changes

c62a4e7

Last small detail

249e0fe

bellet merged commit aaf8d44 into scikit-learn-contrib:master Oct 21, 2021
