A Python 3 compatible fork of Unsupervised Learning for Lexicon-Based Classification (with a notebook)

This is a fork 🍴 of the implementation accompanying the AAAI 2017 paper of the same name.

Please cite:

Eisenstein, Jacob. "Unsupervised learning for lexicon-based classification." Proceedings of the AAAI Conference on Artificial Intelligence. Vol. 31. No. 1. 2017.

BibTeX:

@inproceedings{eisenstein2017unsupervised,
  title={Unsupervised learning for lexicon-based classification},
  author={Eisenstein, Jacob},
  booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
  volume={31},
  number={1},
  year={2017}
}

My notebook, based on bayeslex.py (the main file), is at Unsup_Lexicon.ipynb.

(The following text is from the original repo's README.)

It includes:

My code (Prof. Eisenstein's code)
The Bing Liu lexicon (https://www.cs.uic.edu/~liub/FBS/sentiment-analysis.html)
The Cornell sentiment polarity dataset, version 2.0 (https://www.cs.cornell.edu/people/pabo/movie-review-data/)
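
For readers new to the setting, here is a minimal sketch of plain lexicon-based classification by word counting, the kind of approach that a sentiment lexicon such as Bing Liu's supports. It is an illustration only and is not the method in bayeslex.py. The function names, the tiny in-memory word sets, and the whitespace tokenization are all assumptions made for this example; the actual lexicon files and the cornell.bow format are not parsed here.

```python
def lexicon_score(tokens, pos_words, neg_words):
    """Return (number of positive-lexicon tokens) - (number of negative-lexicon tokens)."""
    pos = sum(1 for t in tokens if t in pos_words)
    neg = sum(1 for t in tokens if t in neg_words)
    return pos - neg


def classify(text, pos_words, neg_words):
    """Label a text 'pos', 'neg', or 'tie' by comparing lexicon word counts."""
    # Hypothetical tokenization: lowercase and split on whitespace.
    tokens = text.lower().split()
    score = lexicon_score(tokens, pos_words, neg_words)
    if score > 0:
        return "pos"
    if score < 0:
        return "neg"
    return "tie"


# Toy word sets standing in for a real lexicon (assumption for illustration).
pos_words = {"good", "great", "enjoyable"}
neg_words = {"bad", "boring", "awful"}

print(classify("A great cast and an enjoyable plot", pos_words, neg_words))
print(classify("Boring and awful , but the music was good", pos_words, neg_words))
```

Counting treats every lexicon word as equally informative, which is the simplification that a learned model can relax.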

I (Prof. Eisenstein) cannot promise to provide much support for the code, but I will try.

Reproducing results

To reproduce the Cornell dataset results, run the following command:

python bayeslex.py --epochs 250 cornell liu-pos.utf8 liu-neg.utf8 --optimizer admm --prefilter

The second column of the output contains the AUC numbers reported in the paper.
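
As a reminder of what an AUC number measures, the sketch below computes the area under the ROC curve as the fraction of (positive, negative) document pairs in which the positive document receives the higher score, with ties counted as half. This is a generic illustration and not the code in bayeslex_eval.py; the function name and the toy labels and scores are assumptions.

```python
def auc(labels, scores):
    """Pairwise AUC: probability that a random positive outscores a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 1 = positive review, 0 = negative review.
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.6, 0.1]
print(auc(labels, scores))  # 3 of the 4 positive/negative pairs are ordered correctly: 0.75
```

The pairwise loop is quadratic in the number of documents, which is fine for a sanity check; a sort-based computation is the usual choice for large datasets.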

You can get the other datasets in the paper at the following locations:

Stanford/IMDB: http://ai.stanford.edu/~amaas/data/sentiment/
Amazon: https://www.cs.jhu.edu/~mdredze/datasets/sentiment/
CorpusCine: http://www.lsi.us.es/~fermin/index.php/Datasets

The ISOL lexicon was obtained here: http://metashare.upf.edu/repository/browse/isol/f93c81ba65c911e48763000c291ecfc893e66d98695145c0b0066cfa00e6ccda/
