feat: Multivariate threshold using Mahalanobis distance #234
Signed-off-by: Avik Basu <[email protected]>
Codecov Report
@@            Coverage Diff             @@
##             main     #234      +/-   ##
==========================================
+ Coverage   96.33%   96.43%   +0.10%
==========================================
  Files          46       47       +1
  Lines        1962     2020      +58
  Branches      157      161       +4
==========================================
+ Hits         1890     1948      +58
  Misses         52       52
  Partials       20       20
... and 1 file with indirect coverage changes
Mahalanobis distance vector
"""
x_distance = x - self._distr_mean
mahal_grid = x_distance @ self._cov_inv @ x_distance.T
x_distance is a matrix of shape (n_obs, n_features), _cov_inv is (n_features, n_features), and mahal_grid is (n_obs, n_obs). Therefore, it works on a matrix, not a number.
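The shapes described in this comment can be checked with a small standalone sketch. It does not use the PR's class; the variable names `mean` and `cov_inv` stand in for `self._distr_mean` and `self._cov_inv`, and the random data is purely illustrative. It also shows that the diagonal of the (n_obs, n_obs) grid holds the squared Mahalanobis distance of each observation, and that an `einsum` call yields the same values without building the full grid. Whether the PR extracts the diagonal this way is an assumption.

```python
import numpy as np

rng = np.random.default_rng(0)
n_obs, n_features = 50, 3

# Illustrative data; stand-ins for the fitted mean and inverse covariance.
x = rng.normal(size=(n_obs, n_features))
mean = x.mean(axis=0)
cov_inv = np.linalg.pinv(np.cov(x, rowvar=False))

x_distance = x - mean                               # (n_obs, n_features)
mahal_grid = x_distance @ cov_inv @ x_distance.T    # (n_obs, n_obs)

# Diagonal entries are the squared distances of each row from the mean.
distances = np.sqrt(np.maximum(np.diag(mahal_grid), 0.0))

# Same values without materialising the (n_obs, n_obs) grid.
sq_dist = np.einsum("ij,jk,ik->i", x_distance, cov_inv, x_distance)
distances_einsum = np.sqrt(np.maximum(sq_dist, 0.0))
```

For large n_obs the `einsum` form avoids the quadratic memory cost of the full grid, which may matter if only the per-observation distances are needed.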
While some minor discussions are still ongoing, it looks good overall.
minor comments
Introduce a multivariate thresholding technique using Mahalanobis distance and Chebyshev's inequality.
Comparison added here: numaproj/numalogic-benchmarks#2
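As a rough illustration of how Mahalanobis distance and Chebyshev's inequality can combine into a threshold, here is a minimal sketch. It relies on the multivariate Chebyshev bound, P(d(X) >= k) <= n_features / k^2 for Mahalanobis distance d, so choosing k = sqrt(n_features / p) caps the outlier probability at p. The function names `chebyshev_threshold` and `mahalanobis` and the parameter `max_outlier_prob` are hypothetical, not the API introduced by this PR.

```python
import numpy as np


def chebyshev_threshold(n_features: int, max_outlier_prob: float) -> float:
    """Distance k such that P(d >= k) <= max_outlier_prob by Chebyshev's bound."""
    if not 0.0 < max_outlier_prob < 1.0:
        raise ValueError("max_outlier_prob must be in (0, 1)")
    return float(np.sqrt(n_features / max_outlier_prob))


def mahalanobis(x: np.ndarray, mean: np.ndarray, cov_inv: np.ndarray) -> np.ndarray:
    """Per-row Mahalanobis distance of x (n_obs, n_features) from mean."""
    diff = x - mean
    sq_dist = np.einsum("ij,jk,ik->i", diff, cov_inv, diff)
    return np.sqrt(np.maximum(sq_dist, 0.0))


# Illustrative training data (assumption: roughly standard normal features).
rng = np.random.default_rng(42)
train = rng.normal(size=(1000, 3))
mean = train.mean(axis=0)
cov_inv = np.linalg.pinv(np.cov(train, rowvar=False))

k = chebyshev_threshold(n_features=3, max_outlier_prob=0.1)

# One point at the origin and one far from the training distribution.
test_points = np.vstack([np.zeros(3), np.full(3, 10.0)])
is_anomaly = mahalanobis(test_points, mean, cov_inv) > k
```

Chebyshev's bound makes no distributional assumption, so the resulting threshold tends to be conservative compared with one derived from, say, a chi-squared quantile under normality.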