NAB Contributions Criteria
NAB is intended for the research community and we encourage your contributions and feedback!
We're continuously adding data to NAB (in batches with the release of a new version) and welcome data you're willing to contribute. Specifically we're looking for data meeting the following criteria:
- real-world time-series data
- 1000 records
- labeled anomalies
For us to consider adding your algorithm to the NAB repo, it must meet the following criteria:
- open-source
- work with streaming data (i.e. process data in real time)
- we must be able to fully replicate your results
For an algorithm to be used in practice, it must run online as data streams in, not in batch. The algorithm must therefore be computationally efficient enough to process streaming data, i.e. O(N) in the number of records. The following algorithms have been tested on NAB and do not meet these criteria:
- Lytics Anomalyzer
  - Runs in O(N^2) because for each subsequent record the model retrains over all previous records.
  - The author recommended using the detector within a moving window (250 records) to speed up the algorithm, yielding the following results: 4.42 on the standard profile, 2.39 for rewarding low FP, and 8.58 for rewarding low FN. However, this still ran quite slowly; e.g. running Anomalyzer on "realKnownCause/machine_temperature_system_failure.csv" took 52m0s, compared with only 4m39s for the HTM detector.
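To illustrate the O(N) requirement, here is a minimal sketch of a detector that does a constant amount of work per record by keeping running statistics (Welford's method) instead of retraining over all previous records. It is not part of NAB and is not any detector evaluated on it; the class name `StreamingZScoreDetector`, the method `handle_record`, the helper `run_stream`, and the threshold of 3 standard deviations are all hypothetical choices made for this example.

```python
import math


class StreamingZScoreDetector:
    """Hypothetical detector: O(1) work per record, so O(N) over the stream."""

    def __init__(self, threshold=3.0):
        self.threshold = threshold  # assumed cutoff, in standard deviations
        self.count = 0              # number of records seen so far
        self.mean = 0.0             # running mean of those records
        self.m2 = 0.0               # running sum of squared deviations from the mean

    def handle_record(self, value):
        """Score one record against the history seen so far, then update the state."""
        is_anomaly = False
        if self.count >= 2:
            std = math.sqrt(self.m2 / (self.count - 1))
            if std > 0 and abs(value - self.mean) / std > self.threshold:
                is_anomaly = True

        # Welford's update: fold the new value into the running statistics
        # without revisiting any earlier record.
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (value - self.mean)
        return is_anomaly


def run_stream(values, threshold=3.0):
    """Feed values one at a time, as a streaming source would deliver them."""
    detector = StreamingZScoreDetector(threshold)
    return [detector.handle_record(v) for v in values]


if __name__ == "__main__":
    readings = [10.0, 10.1, 9.9, 10.0, 10.2, 9.8, 50.0, 10.0]
    print(run_stream(readings))
```

A detector that refits its model on the full history at every step does O(N) work per record and O(N^2) overall, which is the pattern described for Anomalyzer above. Restricting the refit to a fixed-size window bounds the per-record cost, but each step can still be expensive if the refit itself is costly.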
Want to suggest some changes to the NAB codebase? Submit an issue and/or pull request and we'll take a look.
Email us: [email protected]