
Experiment with contextual embeddings based on Transformer architectures. #5

dhonza opened this issue Oct 15, 2020 · 2 comments
dhonza commented Oct 15, 2020

Perform initial experiments with contextual log line embeddings.

Our current embedding is based on aggregating (averaging) per-token fastText embeddings. Contextual embeddings are expected to improve downstream-task performance, as they do in NLP.
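For reference, a minimal sketch of the current averaging baseline, assuming a gensim fastText model (the model path and the whitespace tokenization are placeholders, not the project's actual pipeline):

```python
import numpy as np
from gensim.models import FastText

# Hypothetical model path; the project's actual fastText model/loader may differ.
model = FastText.load("fasttext_log_tokens.model")

def embed_log_line(line: str) -> np.ndarray:
    """Baseline: average the per-token fastText vectors of a log line."""
    tokens = line.lower().split()  # placeholder tokenization
    vectors = [model.wv[tok] for tok in tokens]  # subword units handle OOV tokens
    if not vectors:
        return np.zeros(model.vector_size)
    return np.mean(vectors, axis=0)
```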

  • start with pre-trained BERT-like Transformer models (https://huggingface.co/, https://www.sbert.net/, https://simpletransformers.ai/; see the sketch after this list), then:
    • continue with unsupervised pretraining on log data using objectives such as masked language modeling (MLM) or next sentence prediction (NSP)
    • fine-tune on labeled log data
  • analyze the embeddings (clustering, t-SNE visualizations, ...)
  • add to the LAD benchmark suite and compare with other methods
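
A minimal sketch of the first and the analysis steps (pre-trained sentence encoder plus t-SNE), assuming sentence-transformers and scikit-learn; the model name and the example log lines are assumptions, not project decisions:

```python
from sentence_transformers import SentenceTransformer
from sklearn.manifold import TSNE
import matplotlib.pyplot as plt

# Hypothetical input: replace with real (preprocessed) log lines from the dataset.
log_lines = [
    "Receiving block blk_123 src: /10.0.0.1 dest: /10.0.0.2",
    "PacketResponder 1 for block blk_123 terminating",
    "Exception in receiveBlock for block blk_456",
    "Verification succeeded for blk_789",
]

# Any pre-trained BERT-like sentence encoder from sbert.net works as a starting point.
encoder = SentenceTransformer("distilbert-base-nli-stsb-mean-tokens")
embeddings = encoder.encode(log_lines)  # shape: (n_lines, dim)

# 2-D t-SNE projection for a quick visual check of clustering structure.
coords = TSNE(n_components=2,
              perplexity=min(30, len(log_lines) - 1)).fit_transform(embeddings)
plt.scatter(coords[:, 0], coords[:, 1])
plt.title("t-SNE of contextual log line embeddings")
plt.show()
```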
savchart commented Oct 22, 2020

raw logs -> DistilBERT -> F1 score ~0% (sliding windows)
Drain logs -> DistilBERT -> F1 score ~96% (sliding windows and preprocessing)
Drain logs -> DistilBERT -> F1 score ~96% (without sliding windows and preprocessing)
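
For context, one way to set up this kind of pipeline (the exact setup is not documented in this comment): fine-tune DistilBERT as a sequence classifier on Drain-parsed templates with Hugging Face transformers. The data, labels, model name, and hyperparameters below are placeholders, and the sliding-window grouping is not shown.

```python
import torch
from transformers import (DistilBertForSequenceClassification, DistilBertTokenizerFast,
                          Trainer, TrainingArguments)

# Hypothetical data: Drain-parsed templates with binary anomaly labels (0 = normal, 1 = anomaly).
templates = ["Receiving block <*> src: <*> dest: <*>",
             "Exception in receiveBlock for block <*>"]
labels = [0, 1]

tokenizer = DistilBertTokenizerFast.from_pretrained("distilbert-base-uncased")
encodings = tokenizer(templates, truncation=True, padding=True)

class LogDataset(torch.utils.data.Dataset):
    def __init__(self, encodings, labels):
        self.encodings, self.labels = encodings, labels
    def __len__(self):
        return len(self.labels)
    def __getitem__(self, idx):
        item = {k: torch.tensor(v[idx]) for k, v in self.encodings.items()}
        item["labels"] = torch.tensor(self.labels[idx])
        return item

model = DistilBertForSequenceClassification.from_pretrained("distilbert-base-uncased",
                                                            num_labels=2)
trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="distilbert-logs", num_train_epochs=1,
                           per_device_train_batch_size=8),
    train_dataset=LogDataset(encodings, labels),
)
trainer.train()  # evaluate F1 on a held-out split afterwards
```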

savchart commented Oct 29, 2020

MLM (content only) - eval_loss: 0.7880746566936903 (masked language modeling)
CLM (content only) - eval_loss: 3.7285281655768676e-07 (next-word prediction, i.e. causal language modeling)
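
Assuming these eval_loss values are mean cross-entropies per token (the Hugging Face default), exp(eval_loss) gives a comparable perplexity for the two objectives; a CLM perplexity near 1.0 suggests the content is almost fully predictable token by token, which is plausible for templated logs but also worth checking for train/eval leakage. A minimal sketch:

```python
import math

eval_loss_mlm = 0.7880746566936903      # MLM, content only
eval_loss_clm = 3.7285281655768676e-07  # CLM (next-word prediction), content only

# eval_loss is a cross-entropy, so exp(eval_loss) gives the per-token perplexity.
print(f"MLM perplexity: {math.exp(eval_loss_mlm):.3f}")   # ~2.2
print(f"CLM perplexity: {math.exp(eval_loss_clm):.6f}")   # ~1.000000
```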
