feat: Multivariate threshold using Mahalanobis distance #234

Merged: 10 commits, Jul 26, 2023

Conversation

@ab93 (Member) commented Jul 25, 2023

Introduce a multivariate thresholding technique based on Mahalanobis distance and Chebyshev's inequality.

Comparison added here: numaproj/numalogic-benchmarks#2
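
For readers unfamiliar with the idea, here is a minimal sketch of one way to combine the two ingredients. It is not the code merged in this PR: the function names `fit_mahalanobis_threshold` and `mahalanobis`, the `max_outlier_prob` parameter, and the use of the multivariate Chebyshev bound P(d_M >= k) <= n_features / k^2 to pick the cutoff are all assumptions made for illustration.

```python
import numpy as np


def fit_mahalanobis_threshold(x_train, max_outlier_prob=0.1):
    """Hypothetical helper: estimate the mean, inverse covariance and a cutoff k."""
    mean = x_train.mean(axis=0)
    cov = np.atleast_2d(np.cov(x_train, rowvar=False))
    # pinv tolerates a singular covariance matrix (an assumption of this sketch)
    cov_inv = np.linalg.pinv(cov)
    n_features = x_train.shape[1]
    # Multivariate Chebyshev: P(d_M >= k) <= n_features / k**2,
    # so this k bounds the outlier probability by max_outlier_prob.
    k = np.sqrt(n_features / max_outlier_prob)
    return mean, cov_inv, k


def mahalanobis(x, mean, cov_inv):
    """Per-observation Mahalanobis distance for x of shape (n_obs, n_features)."""
    diff = x - mean
    sq_dist = np.einsum("ij,jk,ik->i", diff, cov_inv, diff)
    return np.sqrt(np.clip(sq_dist, 0.0, None))


rng = np.random.default_rng(0)
x_train = rng.normal(size=(500, 3))
mean, cov_inv, k = fit_mahalanobis_threshold(x_train, max_outlier_prob=0.1)

x_test = np.array([[0.0, 0.0, 0.0], [10.0, -10.0, 10.0]])
dist = mahalanobis(x_test, mean, cov_inv)
is_anomaly = dist > k  # one boolean per observation
```

The bound makes no distributional assumption, so the cutoff tends to be conservative for well-behaved data; a tighter threshold would need extra assumptions such as normality.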

@ab93 added the area/ml (Machine Learning based changes) and enhancement (New feature or request) labels Jul 25, 2023
@ab93 self-assigned this Jul 25, 2023
ab93 added 3 commits July 25, 2023 15:18
Signed-off-by: Avik Basu <[email protected]>
Signed-off-by: Avik Basu <[email protected]>
codecov bot commented Jul 25, 2023

Codecov Report

Merging #234 (0ebe64e) into main (466681b) will increase coverage by 0.10%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##             main     #234      +/-   ##
==========================================
+ Coverage   96.33%   96.43%   +0.10%     
==========================================
  Files          46       47       +1     
  Lines        1962     2020      +58     
  Branches      157      161       +4     
==========================================
+ Hits         1890     1948      +58     
  Misses         52       52              
  Partials       20       20              
| Files Changed | Coverage Δ |
|---|---|
| numalogic/registry/dynamodb_registry.py | 88.30% <ø> (ø) |
| numalogic/config/factory.py | 97.91% <100.00%> (ø) |
| numalogic/models/threshold/__init__.py | 100.00% <100.00%> (ø) |
| numalogic/models/threshold/_mahalanobis.py | 100.00% <100.00%> (ø) |

... and 1 file with indirect coverage changes


@ab93 ab93 marked this pull request as ready for review July 25, 2023 20:53
@ab93 ab93 requested a review from vigith July 25, 2023 20:56
Mahalanobis distance vector
"""
x_distance = x - self._distr_mean
mahal_grid = x_distance @ self._cov_inv @ x_distance.T
Member Author

x_distance is a matrix of shape (n_obs, n_features), _cov_inv is (n_features, n_features), and mahal_grid is (n_obs, n_obs). The expression therefore operates on whole matrices, not on a single scalar.
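
The shapes described above can be checked with a small standalone sketch. The variable names mirror the snippet under review, but the random data and the `pinv`-based inverse covariance are assumptions for illustration only. Mathematically, the diagonal of the grid holds each observation's squared Mahalanobis distance and the off-diagonal entries are cross terms between observations; whether the merged code reads the diagonal is not visible in this excerpt. An `einsum` contraction yields the same per-observation values without materializing the (n_obs, n_obs) grid.

```python
import numpy as np

rng = np.random.default_rng(42)
n_obs, n_features = 5, 3

# Illustrative data; in the PR these would come from the fitted model
x = rng.normal(size=(n_obs, n_features))
distr_mean = x.mean(axis=0)
cov_inv = np.linalg.pinv(np.cov(x, rowvar=False))

x_distance = x - distr_mean                        # (n_obs, n_features)
mahal_grid = x_distance @ cov_inv @ x_distance.T   # (n_obs, n_obs)

# Diagonal entry i is the squared distance of observation i from the mean
sq_dist_diag = np.diagonal(mahal_grid)

# Same values, computed row by row without building the full grid
sq_dist_einsum = np.einsum("ij,jk,ik->i", x_distance, cov_inv, x_distance)
```

For large n_obs the grid grows quadratically in memory, which is the main reason one might prefer the `einsum` form.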


@qhuai qhuai left a comment

While some minor discussions are still ongoing, it looks good overall.

Contributor

@s0nicboOm s0nicboOm left a comment

minor comments

numalogic/models/threshold/_mahalanobis.py (2 resolved review threads)
@ab93 ab93 merged commit dc18442 into main Jul 26, 2023
10 checks passed
@ab93 ab93 deleted the mahalanobis branch July 26, 2023 19:27
3 participants