Add support for two new datasets: Thunderbird and OpenStack. #12

dhonza · 2020-12-10T13:22:08Z

Add support for the Thunderbird and OpenStack datasets from https://github.com/logpai/loghub:

Thunderbird
OpenStack

dhonza · 2021-05-04T13:20:37Z

The problem with every dataset except HDFS is that they are labeled per log line, and the annotation is typically based on simplistic log-line features such as the presence of the phrases "ERROR" or "WARNING". We found this problem in the BGL data, which makes that dataset practically unusable for our purposes. Any method that works with the actual semantics of log lines, such as our methods based on global or contextual embeddings, can detect such phrases easily.

Algorithms based on template extraction (e.g., https://github.com/logpai/loglizer) have a much harder time on these datasets. Nevertheless, we are more interested in detecting "harder" anomalies, which are represented by a more complex interplay of log messages.
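
To make the labeling concern concrete, here is a minimal sketch of a keyword baseline. It only illustrates the argument and is not code from this repository: the function names, the regular expression, and the toy log lines and labels are all made up for illustration. If a check this simple agrees closely with a dataset's per-line labels, those labels are probably keyword-driven rather than reflecting harder anomalies.

```python
import re

# Assumed keyword list: the thread only names "ERROR" and "WARNING".
KEYWORDS = re.compile(r"\b(ERROR|WARNING)\b")


def keyword_flags(lines):
    """Return one boolean per log line: True if a keyword phrase occurs."""
    return [bool(KEYWORDS.search(line)) for line in lines]


def agreement(flags, labels):
    """Fraction of lines where the keyword flag equals the per-line label."""
    if len(flags) != len(labels):
        raise ValueError("flags and labels must have the same length")
    if not flags:
        return 0.0
    matches = sum(f == l for f, l in zip(flags, labels))
    return matches / len(flags)


# Made-up log lines and labels, not taken from any real dataset.
toy_lines = [
    "2020-01-01 INFO service started",
    "2020-01-01 ERROR disk failure on node 3",
    "2020-01-01 WARNING temperature high",
    "2020-01-01 INFO heartbeat ok",
]
toy_labels = [False, True, True, False]

flags = keyword_flags(toy_lines)
print(f"keyword/label agreement: {agreement(flags, toy_labels):.2f}")
```

Running the same check on a real dataset would require parsing its own label format, which is not shown here.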

dhonza self-assigned this Dec 10, 2020

dhonza assigned savchart Dec 10, 2020
