NAB Contributions Criteria

Alexander Lavin edited this page Oct 15, 2015 · 10 revisions

NAB is intended for the research community and we encourage your contributions and feedback!

Data

We're continuously adding data to NAB (in batches, with the release of each new version) and welcome any data you're willing to contribute. Specifically, we're looking for data that meets the following criteria:

  • real-world time-series data
  • at least 1000 records
  • labeled anomalies
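
As a rough pre-submission check, the size and labeling criteria above could be sketched as follows. This is only an illustration, not part of NAB: the function name `check_candidate_dataset`, the `(timestamp, value)` row layout, the chronological-order check, and reading the record count as a minimum are all assumptions. Whether the data is "real-world" cannot be checked in code.

```python
def check_candidate_dataset(rows, anomaly_timestamps, min_records=1000):
    """Return a list of problems found; an empty list means the checks passed.

    rows: list of (timestamp, value) pairs (hypothetical layout).
    anomaly_timestamps: timestamps labeled as anomalous.
    min_records: assumed minimum size, taken from the criteria list.
    """
    problems = []

    # Size criterion, read here as a lower bound (an assumption).
    if len(rows) < min_records:
        problems.append(
            f"only {len(rows)} records, expected at least {min_records}"
        )

    # Time-series data is assumed to be ordered by timestamp.
    timestamps = [ts for ts, _ in rows]
    if timestamps != sorted(timestamps):
        problems.append("timestamps are not in chronological order")

    # Labeling criterion: at least one label, each pointing at a real record.
    if not anomaly_timestamps:
        problems.append("no labeled anomalies")
    known = set(timestamps)
    missing = [ts for ts in anomaly_timestamps if ts not in known]
    if missing:
        problems.append(f"{len(missing)} anomaly labels do not match any record")

    return problems
```

Calling it with the rows of a candidate file and its labels returns human-readable problems that can be fixed before the data is offered.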

Anomaly detection algorithms

For us to consider adding your algorithm to the NAB repo it must meet the following criteria:

  • open-source
  • work with streaming data (i.e. process data in real-time)
  • we must be able to fully replicate your results

For an algorithm to be used in practice, it must run online as data streams in, not in batch. Algorithms must therefore be computationally efficient enough to process streaming data, i.e., O(N) in the number of records. The following algorithms have been tested on NAB and do not meet these criteria:

  • Lytics Anomalyzer
    • Runs in O(N^2) because for each subsequent record the model retrains over all previous records.
    • The author recommended using the detector within a moving window (250 records) to speed up the algorithm, yielding the following results: 4.42 on the standard profile, 2.39 for rewarding low FP, and 8.58 for rewarding low FN. However, this still ran quite slowly; e.g., running Anomalyzer on "realKnownCause/machine_temperature_system_failure.csv" took 52m0s, compared with only 4m39s for the HTM detector.
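
To illustrate the complexity requirement, here is a minimal sketch of a detector that does constant work per record, and so O(N) over a stream of N records. It is a toy rolling z-score detector, not the Anomalyzer or HTM algorithm; the class name, the `handle_record` method, and the threshold are hypothetical, and the 250-record window is used only as an example size. A detector that retrains over all previous records does O(N) work per record and O(N^2) in total.

```python
import math
from collections import deque


class WindowedZScoreDetector:
    """Toy streaming detector: scores each record against a bounded window."""

    def __init__(self, window_size=250, threshold=3.0):
        self.window = deque(maxlen=window_size)
        self.threshold = threshold
        # Running sums let us update mean and variance in O(1) per record.
        self.total = 0.0
        self.total_sq = 0.0

    def handle_record(self, value):
        """Return True if `value` deviates strongly from the recent window."""
        is_anomaly = False
        n = len(self.window)
        if n >= 2:
            mean = self.total / n
            # Clamp at zero to guard against small negative rounding errors.
            variance = max(self.total_sq / n - mean * mean, 0.0)
            std = math.sqrt(variance)
            if std > 0 and abs(value - mean) / std > self.threshold:
                is_anomaly = True

        # When the window is full, remove the oldest value from the sums
        # before the deque evicts it on append.
        if n == self.window.maxlen:
            oldest = self.window[0]
            self.total -= oldest
            self.total_sq -= oldest * oldest

        self.window.append(value)
        self.total += value
        self.total_sq += value * value
        return is_anomaly
```

Each call touches only the running sums and one window slot, so the cost per record does not grow with the amount of history seen so far.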

Code

Want to suggest some changes to the NAB codebase? Submit an issue and/or pull request and we'll take a look.

Comments/suggestions

Email us: [email protected]
