This is an annotated bibliography and literature review examining the robustness of BERT for text classification and how to improve it. I find this area fascinating and am still learning it myself, so hopefully these resources help you with your own NLP text classification tasks!
YouTube overview here: https://www.youtube.com/watch?v=BhAR-IW2B4E
-
BERT vs. ML: "Comparing BERT against traditional machine learning text classification" (González-Carvajal & Garrido-Merchán).
-
BERT vs. ML for small datasets: "Low-Shot Classification: A Comparison of Classical and Deep Transfer Machine Learning Approaches" (Usherwood et al.).
-
BERT for drug reviews: "Comparing deep learning architectures for sentiment analysis on drug reviews" (Colón-Ruiz et al.).
-
BERT for Alzheimer's disease detection: "To BERT or Not To BERT: Comparing Speech and Language-based Approaches for Alzheimer's Disease Detection" (Balagopalan et al.).
-
BERT in other cultures: "Antisocial Online Behavior Detection Using Deep Learning" (Zinovyeva et al.).
-
BERT for radiological classification: "The Utility of General Domain Transfer Learning for Medical Language Tasks" (Ranti et al.).
-
TextFooler, a rule-based adversarial attack: "Is BERT Really Robust?" (Jin et al., 2019).
-
BAE, which turns BERT's masked language model against itself: "BAE: BERT-based Adversarial Examples for Text Classification" (Garg & Ramakrishnan, 2020).
-
BERT-Attack, which pairs BERT's masked language model with a subword replacement strategy: "BERT-ATTACK: Adversarial Attack Against BERT Using BERT" (Li et al., 2020).
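The three attack papers above share the same core recipe: greedily substitute words with semantically similar replacements until the classifier's label flips. Here is a minimal toy sketch of that rule-based synonym-substitution idea. The keyword classifier and synonym table are illustrative stand-ins of my own (the real TextFooler ranks words by importance and picks replacements via counter-fitted word embeddings against an actual target model):

```python
# Toy sketch of a rule-based synonym-substitution attack, in the spirit of
# TextFooler (Jin et al., 2019). The classifier and synonym table below are
# illustrative assumptions, not the paper's actual implementation.

# A stand-in "model": counts positive vs. negative cue words.
POSITIVE = {"great", "good", "excellent", "superb"}
NEGATIVE = {"bad", "awful", "terrible", "poor"}

# Tiny synonym table standing in for counter-fitted word embeddings.
SYNONYMS = {
    "great": ["superb", "grand"],
    "bad": ["poor", "subpar"],
}

def classify(text):
    """Return 'pos' or 'neg' by counting cue words (toy model)."""
    words = text.lower().split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    return "pos" if score >= 0 else "neg"

def attack(text):
    """Greedily swap words for synonyms until the predicted label flips."""
    original = classify(text)
    words = text.lower().split()
    for i, word in enumerate(words):
        for synonym in SYNONYMS.get(word, []):
            candidate = " ".join(words[:i] + [synonym] + words[i + 1:])
            if classify(candidate) != original:
                return candidate  # successful adversarial example
    return None  # no label-flipping substitution found

print(attack("the plot was bad"))  # swaps "bad" for a word the model misses
```

Even this toy version shows why these attacks work: the replacement keeps the meaning a human reads ("subpar" is still negative) while stepping outside the vocabulary the model actually relies on.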
-
A look under BERT's hood: "How to Fine-Tune BERT for Text Classification?" (Sun et al.).
-
BERT for clinical data: "Publicly Available Clinical BERT Embeddings" (Alsentzer et al.).
-
BERT vs. ALBERT: "ALBERT: A Lite BERT for Self-supervised Learning of Language Representations" (Lan et al.).