
Experiment with contextual embeddings based on Transformer architectures. #5

dhonza opened this issue Oct 15, 2020 · 2 comments
dhonza commented Oct 15, 2020

Perform initial experiments with contextual log line embeddings.

Our current embedding is based on aggregating (averaging) per-token fastText embeddings. Contextual embeddings are expected to improve downstream-task performance, as they do in NLP.
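For reference, a minimal sketch of the current averaging baseline, assuming a gensim fastText model (the model path and the whitespace tokenization are placeholders, not the project's actual pipeline):

```python
import numpy as np
from gensim.models import FastText

# Hypothetical model path; the project's actual fastText model/loader may differ.
model = FastText.load("fasttext_log_tokens.model")

def embed_log_line(line: str) -> np.ndarray:
    """Baseline: average the per-token fastText vectors of a log line."""
    tokens = line.lower().split()  # placeholder tokenization
    vectors = [model.wv[tok] for tok in tokens]  # subword units handle OOV tokens
    if not vectors:
        return np.zeros(model.vector_size)
    return np.mean(vectors, axis=0)
```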

  • start with pre-trained BERT-like Transformer models (https://huggingface.co/, https://www.sbert.net/, https://simpletransformers.ai/; see the sketch after this list), then:
    • continue with unsupervised pretraining on log data using objectives such as masked language modeling (MLM) or next sentence prediction (NSP)
    • fine-tune on labeled log data
  • analyze the embeddings (clustering, t-SNE visualizations, ...)
  • add to the LAD benchmark suite and compare with other methods
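
A minimal sketch of the first and the analysis steps (pre-trained sentence encoder plus t-SNE), assuming sentence-transformers and scikit-learn; the model name and the example log lines are assumptions, not project decisions:

```python
from sentence_transformers import SentenceTransformer
from sklearn.manifold import TSNE
import matplotlib.pyplot as plt

# Hypothetical input: replace with real (preprocessed) log lines from the dataset.
log_lines = [
    "Receiving block blk_123 src: /10.0.0.1 dest: /10.0.0.2",
    "PacketResponder 1 for block blk_123 terminating",
    "Exception in receiveBlock for block blk_456",
    "Verification succeeded for blk_789",
]

# Any pre-trained BERT-like sentence encoder from sbert.net works as a starting point.
encoder = SentenceTransformer("distilbert-base-nli-stsb-mean-tokens")
embeddings = encoder.encode(log_lines)  # shape: (n_lines, dim)

# 2-D t-SNE projection for a quick visual check of clustering structure.
coords = TSNE(n_components=2,
              perplexity=min(30, len(log_lines) - 1)).fit_transform(embeddings)
plt.scatter(coords[:, 0], coords[:, 1])
plt.title("t-SNE of contextual log line embeddings")
plt.show()
```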
savchart commented Oct 22, 2020

raw logs -> DistilBERT -> F1 score ~0% (sliding windows)
Drain logs -> DistilBERT -> F1 score ~96% (sliding windows and preprocessing)
Drain logs -> DistilBERT -> F1 score ~96% (without sliding windows and preprocessing)
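
For context, one way to set up this kind of pipeline (the exact setup is not documented in this comment): fine-tune DistilBERT as a sequence classifier on Drain-parsed templates with Hugging Face transformers. The data, labels, model name, and hyperparameters below are placeholders, and the sliding-window grouping is not shown.

```python
import torch
from transformers import (DistilBertForSequenceClassification, DistilBertTokenizerFast,
                          Trainer, TrainingArguments)

# Hypothetical data: Drain-parsed templates with binary anomaly labels (0 = normal, 1 = anomaly).
templates = ["Receiving block <*> src: <*> dest: <*>",
             "Exception in receiveBlock for block <*>"]
labels = [0, 1]

tokenizer = DistilBertTokenizerFast.from_pretrained("distilbert-base-uncased")
encodings = tokenizer(templates, truncation=True, padding=True)

class LogDataset(torch.utils.data.Dataset):
    def __init__(self, encodings, labels):
        self.encodings, self.labels = encodings, labels
    def __len__(self):
        return len(self.labels)
    def __getitem__(self, idx):
        item = {k: torch.tensor(v[idx]) for k, v in self.encodings.items()}
        item["labels"] = torch.tensor(self.labels[idx])
        return item

model = DistilBertForSequenceClassification.from_pretrained("distilbert-base-uncased",
                                                            num_labels=2)
trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="distilbert-logs", num_train_epochs=1,
                           per_device_train_batch_size=8),
    train_dataset=LogDataset(encodings, labels),
)
trainer.train()  # evaluate F1 on a held-out split afterwards
```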

savchart commented Oct 29, 2020

MLM (content only) - eval_loss: 0.7880746566936903 (masked language modeling)
CLM (content only) - eval_loss: 3.7285281655768676e-07 (next-word prediction, i.e. causal language modeling)
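
Assuming these eval_loss values are mean cross-entropies per token (the Hugging Face default), exp(eval_loss) gives a comparable perplexity for the two objectives; a CLM perplexity near 1.0 suggests the content is almost fully predictable token by token, which is plausible for templated logs but also worth checking for train/eval leakage. A minimal sketch:

```python
import math

eval_loss_mlm = 0.7880746566936903      # MLM, content only
eval_loss_clm = 3.7285281655768676e-07  # CLM (next-word prediction), content only

# eval_loss is a cross-entropy, so exp(eval_loss) gives the per-token perplexity.
print(f"MLM perplexity: {math.exp(eval_loss_mlm):.3f}")   # ~2.2
print(f"CLM perplexity: {math.exp(eval_loss_clm):.6f}")   # ~1.000000
```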
