Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Added HTM anomaly code with respect to Operate First cpu_usage data #9

Open
wants to merge 9 commits into
base: master
Choose a base branch
from

Conversation

suppathak
Copy link
Collaborator

@suppathak suppathak commented Nov 17, 2021

As a data scientist, working on HTM anomaly detection techniques, I want to create a jupyter notebook with the application of HTM-anomaly detection technique on a data from Operate first smaug cluster. I have included the notebook, dataset and a README file describing the process.

Feel free to provide feedback! Thank you.

Closes #5

@sesheta sesheta added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Nov 17, 2021
@sesheta sesheta added the size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. label Nov 17, 2021
@suppathak
Copy link
Collaborator Author

/test pre-commit

@review-notebook-app
Copy link

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

@suppathak suppathak changed the title [WIP] Added HTM anomaly code Added HTM anomaly code with respect to Operate First cpu_usage data Jan 27, 2022
@sesheta sesheta removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jan 27, 2022
@suppathak suppathak self-assigned this Jan 27, 2022
@@ -0,0 +1,809 @@
{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are you printing this out? Its defined clearly in the above cell.

What might be useful here is some explanation for these Parameters and how their values were selected.


Reply via ReviewNB

@@ -0,0 +1,809 @@
{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Line #19.    AnomalyLikelihood()

Is this line doing anything?


Reply via ReviewNB

@@ -0,0 +1,809 @@
{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Line #3.    predictor = Predictor(steps=[1, 5], alpha=parameters["predictor"]["sdrc_alpha"])

The parameters defined in this notebook appear to be those selected for the gymdata.csv dataset used in the example hotgym.py is there any work that needs to be done on our part to ensure these parameters are correct for the cpu dataset?

Why are steps 1 and 5 used in the predictor? Why not other values?


Reply via ReviewNB

@@ -0,0 +1,809 @@
{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do we interpret these outputs?


Reply via ReviewNB

@@ -0,0 +1,809 @@
{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain a bit better what is being plotted here? What is 1 Step Prediction vs 5 Step? Why is there no 10 Step prediction? What are the Instantaneous and Likelihood anomalies? Are they the same algorithm with different amounts of training time? Or are they different from each other? Why is "Anomaly Likelihood considered to be the best predictor of Anomaly." ? Are there any performance metrics to back this up?


Reply via ReviewNB

@@ -0,0 +1,809 @@
{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you space out the sublplots? The titles and x-axis labels are running into each other. You might also want to use seaborn, as it creates slightly nicer graphs.


Reply via ReviewNB

@@ -0,0 +1,809 @@
{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do we know this method is good? Are there any baselines to compare it with? For example, does HTM significantly outperform just flagging all changes in value greater than 3x standard deviation over a rolling window as an anomaly? Is there anything else we can compare it to to justify the claim "the model does a good job"?


Reply via ReviewNB

@MichaelClifford MichaelClifford requested review from chauhankaranraj and oindrillac and removed request for durandom, tumido and pacospace January 27, 2022 20:37
@@ -0,0 +1,809 @@
{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

small typo, anomalpus -> anomalous 


Reply via ReviewNB

@@ -0,0 +1,809 @@
{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this is a naive question, but is it not possible to read the data with df = pd.read_csv("df_cpu.csv")?


Reply via ReviewNB

@@ -0,0 +1,809 @@
{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Line #13.    len(records)

Hmm inspecting the csv file on jupyterlab shows 576 rows, but here I see 574 records, any idea what might cause this inconsistency?


Reply via ReviewNB

@chauhankaranraj
Copy link
Member

I have included the notebook, dataset and a README file describing the process.

Hey @suppathak, so I see that in addition to notebook, README, and the csv, there's also a bunch of markdown docs and images being added in this PR. Is that intentional or were this supposed to be in another PR?

@suppathak
Copy link
Collaborator Author

suppathak commented Jan 28, 2022

Hey @suppathak, so I see that in addition to notebook, README, and the csv, there's also a bunch of markdown docs and images being added in this PR. Is that intentional or were this supposed to be in another PR?

Hey @chauhankaranraj , Thanks for the comments. I will work on them. The rest of the markdown docs are from another PR and are already accepted to the master branch. This may be due to some merging error in this branch. However, after doing the rebase, the error is now sorted. Thanks :)

@suppathak suppathak changed the title Added HTM anomaly code with respect to Operate First cpu_usage data [WIP] Added HTM anomaly code with respect to Operate First cpu_usage data Feb 21, 2022
@sesheta sesheta added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Feb 21, 2022
@sesheta
Copy link

sesheta commented Apr 1, 2022

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
To complete the pull request process, please ask for approval from suppathak after the PR has been reviewed.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@sesheta
Copy link

sesheta commented Apr 1, 2022

@suppathak: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
aicoe-ci/prow/pre-commit 0e9ea6e link true /test pre-commit

Full PR test history. Your PR dashboard. Please help us and open an issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Projects
None yet
4 participants