LearningNLP

Tutorials and in-depth analyses of NLP techniques and algorithms

Tutorial 1

Dataset: ArXiv from Kaggle
Preprocessing: pandas, nltk, gensim
Binary classification: Scikit-learn's CountVectorizer + TfidfTransformer
Explainability Methods: LIME, SHAP
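
The classification step above can be sketched as follows. This is a minimal illustration and not the tutorial's actual code. The toy titles, the two categories, and the choice of `LogisticRegression` as the classifier are all assumptions made for the example, since only the vectorizer and transformer are named above.

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

# Toy titles invented for illustration (not taken from the ArXiv dataset).
titles = [
    "deep neural networks for image classification",
    "convolutional networks for object detection",
    "training neural networks with gradient descent",
    "recurrent neural networks for language modelling",
    "galaxy formation in the early universe",
    "dark matter halos and galaxy clusters",
    "star formation rates in spiral galaxies",
    "black hole mergers and gravitational waves",
]
# Hypothetical binary labels: 1 = machine learning, 0 = astrophysics.
labels = [1, 1, 1, 1, 0, 0, 0, 0]

pipeline = Pipeline([
    # Turn each title into a vector of raw token counts.
    ("counts", CountVectorizer(stop_words="english")),
    # Reweight the counts by TF-IDF.
    ("tfidf", TfidfTransformer()),
    # Assumed classifier; any scikit-learn classifier could be used here.
    ("clf", LogisticRegression()),
])
pipeline.fit(titles, labels)

# Classify two unseen titles.
predictions = pipeline.predict([
    "neural networks for speech recognition",
    "galaxy clusters and dark matter",
])
print(predictions)
```

Keeping the three steps in one `Pipeline` means the same fitted object can be passed raw text at prediction time, which is convenient when an explainer such as LIME needs a function that maps strings to class probabilities (`pipeline.predict_proba`).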

Useful references for explainability methods:
- LIME --> "Why Should I Trust You?": Explaining the Predictions of Any Classifier
- SHAP --> A Unified Approach to Interpreting Model Predictions
- Adversarial attacks (have you heard of them?), i.e. how to fool algorithms --> Fooling LIME and SHAP: Adversarial Attacks on Post hoc Explanation Methods

Open questions for you:
- How to deal with multiclass problems?
- Try to build the binary classifier using abstracts instead of titles
- Try to develop the same pipeline with spaCy