[WIP] Added HTM anomaly code with respect to Operate First cpu_usage data #9

Open

suppathak wants to merge 9 commits into aicoe-aiops:master from suppathak:anomaly-code

Collaborator

suppathak commented Nov 17, 2021

As a data scientist working on HTM anomaly detection techniques, I want to create a Jupyter notebook that applies HTM anomaly detection to data from the Operate First smaug cluster. I have included the notebook, dataset and a README file describing the process.

Feel free to provide feedback! Thank you.

Closes #5

suppathak requested review from durandom and tumido as code owners

November 17, 2021 20:31

sesheta added the do-not-merge/work-in-progress label

sesheta requested a review from pacospace

November 17, 2021 20:31

sesheta added the size/XXL label

suppathak requested a review from MichaelClifford

November 17, 2021 20:32

suppathak force-pushed the anomaly-code branch from 483d020 to 8ff8a27 Compare

November 17, 2021 20:36

Collaborator Author

suppathak commented Nov 19, 2021

/test pre-commit

review-notebook-app bot commented Nov 22, 2021

Check out this pull request on ReviewNB.

See visual diffs & provide feedback on Jupyter Notebooks.

suppathak changed the title ~~[WIP] Added HTM anomaly code~~ Added HTM anomaly code with respect to Operate First cpu_usage data

sesheta removed the do-not-merge/work-in-progress label

suppathak self-assigned this

MichaelClifford reviewed

notebooks/example/anomaly_code.ipynb

    
            @@ -0,0 +1,809 @@
          
              {

Member

MichaelClifford Jan 27, 2022

Why are you printing this out? It's defined clearly in the above cell.

What might be useful here is some explanation of these parameters and how their values were selected.

notebooks/example/anomaly_code.ipynb

    
            @@ -0,0 +1,809 @@
          
              {

Member

MichaelClifford Jan 27, 2022

Line #19.    AnomalyLikelihood()

Is this line doing anything?

notebooks/example/anomaly_code.ipynb

    
            @@ -0,0 +1,809 @@
          
              {

Member

MichaelClifford Jan 27, 2022

Line #3.    predictor = Predictor(steps=[1, 5], alpha=parameters["predictor"]["sdrc_alpha"])

The parameters defined in this notebook appear to be those selected for the gymdata.csv dataset used in the example hotgym.py. Is there any work that needs to be done on our part to ensure these parameters are correct for the CPU dataset?

Why are steps 1 and 5 used in the predictor? Why not other values?

notebooks/example/anomaly_code.ipynb

    
            @@ -0,0 +1,809 @@
          
              {

Member

MichaelClifford Jan 27, 2022

How do we interpret these outputs?

notebooks/example/anomaly_code.ipynb

    
            @@ -0,0 +1,809 @@
          
              {

Member

MichaelClifford Jan 27, 2022

Can you explain a bit better what is being plotted here? What is 1 Step Prediction vs 5 Step? Why is there no 10 Step prediction? What are the Instantaneous and Likelihood anomalies? Are they the same algorithm with different amounts of training time? Or are they different from each other? Why is "Anomaly Likelihood considered to be the best predictor of Anomaly." ? Are there any performance metrics to back this up?

notebooks/example/anomaly_code.ipynb

    
            @@ -0,0 +1,809 @@
          
              {

Member

MichaelClifford Jan 27, 2022

Can you space out the subplots? The titles and x-axis labels are running into each other. You might also want to use seaborn, as it creates slightly nicer graphs.
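For the spacing, something along these lines might work. This is only a sketch: the panel names and values below are placeholders rather than the notebook's actual series, and it assumes the plots are built with matplotlib's `plt.subplots`.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch also runs without a display
import matplotlib.pyplot as plt

# Placeholder data standing in for the notebook's series (assumed, not from the PR).
x = list(range(50))
panels = {
    "CPU usage": [0.5 + 0.01 * (i % 5) for i in x],
    "Anomaly score": [1.0 if i == 30 else 0.0 for i in x],
    "Anomaly likelihood": [0.5 if i < 30 else 0.9 for i in x],
}

# constrained_layout reserves room for titles and axis labels so they do not overlap.
fig, axes = plt.subplots(
    len(panels), 1, figsize=(10, 8), sharex=True, constrained_layout=True
)
for ax, (title, y) in zip(axes, panels.items()):
    ax.plot(x, y)
    ax.set_title(title)
    ax.set_ylabel("value")
axes[-1].set_xlabel("time step")

fig.canvas.draw()  # force a layout pass; in a notebook the figure renders on display
```

If the figure is created elsewhere and cannot be rebuilt, calling `fig.tight_layout()` or `fig.subplots_adjust(hspace=0.5)` after plotting is another option.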

notebooks/example/anomaly_code.ipynb

    
            @@ -0,0 +1,809 @@
          
              {

Member

MichaelClifford Jan 27, 2022

How do we know this method is good? Are there any baselines to compare it with? For example, does HTM significantly outperform just flagging all changes in value greater than 3x the standard deviation over a rolling window as an anomaly? Is there anything else we can compare it to in order to justify the claim "the model does a good job"?
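To make the baseline idea concrete, here is a rough sketch of one way to read it: a point is flagged when it deviates from the mean of the trailing window by more than 3 standard deviations of that window. The function name `rolling_sigma_flags`, the window size, and the sample series are all assumptions for illustration, not values taken from the notebook.

```python
from collections import deque
from statistics import mean, pstdev


def rolling_sigma_flags(values, window=20, n_sigma=3.0):
    """Flag points that deviate from the trailing-window mean by more than n_sigma std devs."""
    history = deque(maxlen=window)  # the most recent `window` values seen so far
    flags = []
    for value in values:
        if len(history) < window:
            # Not enough history yet, so nothing is flagged during warm-up.
            flags.append(False)
        else:
            mu = mean(history)
            sigma = pstdev(history)
            flags.append(sigma > 0 and abs(value - mu) > n_sigma * sigma)
        history.append(value)
    return flags


# Made-up CPU-usage series: small alternating noise plus one spike at index 30.
series = [0.50 + 0.01 * (i % 2) for i in range(40)]
series[30] = 0.95

flags = rolling_sigma_flags(series, window=20)
flagged = [i for i, is_anomaly in enumerate(flags) if is_anomaly]
print(flagged)
```

Running the same series through both this baseline and the HTM model, and comparing which time steps each one flags, would give a first point of comparison.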

MichaelClifford requested review from chauhankaranraj and oindrillac and removed request for durandom, tumido and pacospace

January 27, 2022 20:37

chauhankaranraj reviewed

notebooks/example/anomaly_code.ipynb

    
            @@ -0,0 +1,809 @@
          
              {

Member

chauhankaranraj Jan 28, 2022

small typo, anomalpus -> anomalous

notebooks/example/anomaly_code.ipynb

    
            @@ -0,0 +1,809 @@
          
              {

Member

chauhankaranraj Jan 28, 2022

Maybe this is a naive question, but is it not possible to read the data with df = pd.read_csv("df_cpu.csv")?

notebooks/example/anomaly_code.ipynb

    
            @@ -0,0 +1,809 @@
          
              {

Member

chauhankaranraj Jan 28, 2022

Line #13.    len(records)

Hmm, inspecting the CSV file in JupyterLab shows 576 rows, but here I see 574 records. Any idea what might cause this inconsistency?
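One way to narrow this down might be to compare the raw line count of the file with the number of non-empty rows the CSV parser returns. The sketch below is only an assumption about how to check: the helper `count_lines_and_rows` is hypothetical, and the demo uses a small made-up file rather than the real df_cpu.csv. Extra header rows or blank lines are possible causes, but that would need to be confirmed against the actual file.

```python
import csv
import os
import tempfile


def count_lines_and_rows(path, header_rows=1):
    """Return (raw line count, non-empty parsed rows excluding header_rows)."""
    with open(path, newline="") as f:
        raw_lines = sum(1 for _ in f)
    with open(path, newline="") as f:
        # csv.reader yields an empty list for a blank line, so those are filtered out.
        rows = [row for row in csv.reader(f) if row]
    return raw_lines, max(len(rows) - header_rows, 0)


# Made-up stand-in for the real file: one header line, three records, one blank line.
sample = "timestamp,cpu_usage\n1,0.50\n2,0.52\n3,0.49\n\n"
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False) as tmp:
    tmp.write(sample)

raw_lines, data_rows = count_lines_and_rows(tmp.name)
os.remove(tmp.name)
print(raw_lines, data_rows)
```

Pointing the helper at the real file with the number of header rows the notebook skips should show whether the two counts can be reconciled.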

Member

chauhankaranraj commented Jan 28, 2022

I have included the notebook, dataset and a README file describing the process.

Hey @suppathak, so I see that in addition to the notebook, README, and the csv, there's also a bunch of markdown docs and images being added in this PR. Is that intentional or were these supposed to be in another PR?

Collaborator Author

suppathak commented Jan 28, 2022

Hey @suppathak, so I see that in addition to the notebook, README, and the csv, there's also a bunch of markdown docs and images being added in this PR. Is that intentional or were these supposed to be in another PR?

Hey @chauhankaranraj, thanks for the comments. I will work on them. The rest of the markdown docs are from another PR and have already been accepted into the master branch. They probably showed up here because of a merge error in this branch. After rebasing, the error is now sorted. Thanks :)

suppathak force-pushed the anomaly-code branch from 4253fa5 to fc73b9a Compare

January 28, 2022 22:47

suppathak changed the title ~~Added HTM anomaly code with respect to Operate First cpu_usage data~~ [WIP] Added HTM anomaly code with respect to Operate First cpu_usage data

sesheta added the do-not-merge/work-in-progress label

suppathak added 2 commits

April 1, 2022 15:07


          Added anomaly code , read me file and sample dataset.

84b991d


          added some changes

f294aed

sesheta commented Apr 1, 2022

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
To complete the pull request process, please ask for approval from suppathak after the PR has been reviewed.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

suppathak added 4 commits

April 1, 2022 15:41


          Added anomaly code , read me file and sample dataset.

5aac5b1


          added some changes

308597f


          Added anomaly code , read me file and sample dataset.

c6d98cd


          added some changes

97da8e3

suppathak force-pushed the anomaly-code branch from 2bb43c0 to 97da8e3 Compare

April 1, 2022 16:20

suppathak added 3 commits

April 1, 2022 17:56


          Updated code

08688fb


          Added anomaly code , read me file and sample dataset.

3048cd6


          Merge branch 'anomaly-code' of https://github.com/suppathak/htm-appli…

0e9ea6e

…cations into anomaly-code

sesheta commented Apr 1, 2022

@suppathak: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name	Commit	Details	Required	Rerun command
aicoe-ci/prow/pre-commit	`0e9ea6e`	link	true	`/test pre-commit`

Full PR test history. Your PR dashboard. Please help us and open an issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

Labels

do-not-merge/work-in-progress size/XXL