Adding L2 norm technique #236

martin-gaievski
Member

Description

Adds the L2 normalization technique to the normalization processor. An L2-normalized score is the original score divided by the square root of the sum of the squared scores (more details here). Example of a pipeline with the processor configured:

{
    "description": "Post processor for hybrid search",
    "phase_results_processors": [
        {
            "normalization-processor": {
                "normalization": {
                    "technique": "l2"
                }
            }
        }
    ]
}
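
As a rough illustration of the formula in the description, the following is a minimal sketch of L2 normalization over a flat array of scores. The class and method names (`L2NormalizationSketch`, `l2Normalize`) are hypothetical and are not taken from this PR; the actual `L2ScoreNormalizationTechnique` may group scores differently (for example per sub-query), so treat this only as an instance of the arithmetic.

```java
import java.util.Arrays;

// Hypothetical sketch; not the plugin's L2ScoreNormalizationTechnique.
final class L2NormalizationSketch {

    // Divides each score by the square root of the sum of squared scores.
    static float[] l2Normalize(final float[] scores) {
        double sumOfSquares = 0.0;
        for (float score : scores) {
            sumOfSquares += (double) score * score;
        }
        float[] normalized = new float[scores.length];
        if (sumOfSquares == 0.0) {
            // Assumption: when every score is zero, return zeros instead of dividing by zero.
            return normalized;
        }
        double norm = Math.sqrt(sumOfSquares);
        for (int i = 0; i < scores.length; i++) {
            normalized[i] = (float) (scores[i] / norm);
        }
        return normalized;
    }

    public static void main(String[] args) {
        // Scores 3 and 4 have an L2 norm of 5, so they map to roughly 0.6 and 0.8.
        System.out.println(Arrays.toString(l2Normalize(new float[] {3.0f, 4.0f})));
    }
}
```

After normalization the squared scores sum to 1, which keeps scores from different sub-queries on a comparable scale before they are combined.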

Issues Resolved

#228, part of solution for #126

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Commits are signed as per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Martin Gaievski <[email protected]>
Signed-off-by: Martin Gaievski <[email protected]>

codecov bot commented Jul 28, 2023

Codecov Report

Merging #236 (c80ba2f) into feature/normalization (fe72dbc) will decrease coverage by 3.53%.
The diff coverage is 40.00%.

@@                     Coverage Diff                     @@
##             feature/normalization     #236      +/-   ##
===========================================================
- Coverage                    85.95%   82.43%   -3.53%     
- Complexity                     310      323      +13     
===========================================================
  Files                           24       26       +2     
  Lines                          904      979      +75     
  Branches                       137      153      +16     
===========================================================
+ Hits                           777      807      +30     
- Misses                          67      108      +41     
- Partials                        60       64       +4     
Files Changed                                            Coverage Δ
...ination/HarmonicMeanScoreCombinationTechnique.java    0.00% <0.00%> (ø)
...essor/normalization/ScoreNormalizationFactory.java    100.00% <ø> (ø)
...r/normalization/L2ScoreNormalizationTechnique.java    83.33% <83.33%> (ø)

Signed-off-by: Martin Gaievski <[email protected]>
@martin-gaievski martin-gaievski merged commit 6ad641a into opensearch-project:feature/normalization Jul 31, 2023
13 of 15 checks passed
Comment on lines +49 to +65
    public float combine(final float[] scores) {
        float combinedScore = 0.0f;
        float weights = 0;
        for (int indexOfSubQuery = 0; indexOfSubQuery < scores.length; indexOfSubQuery++) {
            float score = scores[indexOfSubQuery];
            if (score >= 0.0) {
                float weight = getWeightForSubQuery(indexOfSubQuery);
                score = score * weight;
                combinedScore += score;
                weights += weight;
            }
        }
        if (weights == 0.0f) {
            return ZERO_SCORE;
        }
        return combinedScore / weights;
    }
Contributor

weighted harmonic mean = sum(wi) / sum(wi/si)
https://en.wikipedia.org/wiki/Harmonic_mean

Member Author

Thanks @HenryL27, that's what I'm working on now; harmonic mean and weighted geometric combination will come in a new PR. I probably checked in this class accidentally in a half-ready form.

Contributor

ah, no worries. Thanks!
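
For reference, the snippet under review computes a weighted arithmetic mean, sum(wi*si) / sum(wi). A weighted harmonic mean, following the standard definition on the linked Wikipedia page (sum of weights divided by the sum of weight/score), could be sketched as below. This is only an illustration and not the code from the follow-up PR; the class name, the explicit `weights` parameter, and the choice to skip non-positive scores are assumptions.

```java
// Hypothetical sketch; not the plugin's HarmonicMeanScoreCombinationTechnique.
final class WeightedHarmonicMeanSketch {

    // Weighted harmonic mean: sum(w_i) / sum(w_i / s_i).
    static float combine(final float[] scores, final float[] weights) {
        float sumOfWeights = 0.0f;
        float sumOfWeightOverScore = 0.0f;
        for (int i = 0; i < scores.length; i++) {
            float score = scores[i];
            // Assumption: skip zero or negative scores, for which w / s is undefined or not meaningful.
            if (score <= 0.0f) {
                continue;
            }
            sumOfWeights += weights[i];
            sumOfWeightOverScore += weights[i] / score;
        }
        if (sumOfWeightOverScore == 0.0f) {
            // No usable scores: fall back to zero, mirroring ZERO_SCORE in the reviewed code.
            return 0.0f;
        }
        return sumOfWeights / sumOfWeightOverScore;
    }
}
```

With scores 1 and 4 and equal weights, this gives 2 / (1 + 0.25) = 1.6, compared with 2.5 for the arithmetic mean; the harmonic mean is pulled toward the lower score.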
