Classifying transactions

Christian Egli edited this page Sep 6, 2019 · 2 revisions

We want to classify new transactions and assign each one to an account, using a Naive Bayes classifier. Since we mostly deal with very short transaction descriptions, Bernoulli naive Bayes seems to be the best fit, as explained in Parameter estimation and event models.
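To make the Bernoulli event model concrete, here is a minimal sketch in Python. It is an illustration only, not code from this project: the class name `BernoulliNB`, the whitespace tokenizer, the smoothing parameter `alpha` and the account names are all assumptions. Each description is reduced to the set of tokens it contains, and every vocabulary token contributes to the score whether it is present or absent.

```python
import math
from collections import Counter, defaultdict


def tokenize(description):
    """Hypothetical tokenizer: lower-case, split on whitespace, keep unique tokens."""
    return set(description.lower().split())


class BernoulliNB:
    """Sketch of Bernoulli naive Bayes with Laplace smoothing (alpha is an assumption)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha
        self.account_counts = Counter()           # number of transactions per account
        self.token_counts = defaultdict(Counter)  # transactions per account containing a token
        self.vocabulary = set()

    def train(self, description, account):
        tokens = tokenize(description)
        self.account_counts[account] += 1
        self.token_counts[account].update(tokens)
        self.vocabulary |= tokens

    def log_scores(self, description):
        tokens = tokenize(description)
        total = sum(self.account_counts.values())
        scores = {}
        for account, n in self.account_counts.items():
            score = math.log(n / total)  # log prior of the account
            for token in self.vocabulary:
                # Smoothed probability that a transaction of this account contains the token.
                p = (self.token_counts[account][token] + self.alpha) / (n + 2 * self.alpha)
                # Present tokens contribute p, absent tokens contribute (1 - p).
                score += math.log(p if token in tokens else 1.0 - p)
            scores[account] = score
        return scores

    def classify(self, description):
        scores = self.log_scores(description)
        return max(scores, key=scores.get)


# Made-up training data, for illustration only.
clf = BernoulliNB()
clf.train("REWE supermarket", "Expenses:Groceries")
clf.train("ALDI supermarket", "Expenses:Groceries")
clf.train("Shell fuel station", "Expenses:Car")
clf.train("BP fuel", "Expenses:Car")
print(clf.classify("supermarket Lidl"))
```

Tokens that were never seen during training (such as `lidl` above) are simply ignored, because the score only iterates over the known vocabulary.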

Spam Filtering with Naive Bayes - Which Naive Bayes? indicates that Multinomial Naive Bayes with Boolean attributes performs best. It is a variation of Multinomial Naive Bayes in which each attribute only records whether a token occurs, rather than how often it occurs.
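The Boolean-attribute variant can be sketched in the same style. This is a simplified reading, not the exact estimator from the paper: the function names are hypothetical, token counts are taken per description (each token counted at most once), and add-`alpha` smoothing over the vocabulary is an assumption. Unlike the Bernoulli model, only tokens that are present in the description contribute to the score.

```python
import math
from collections import Counter, defaultdict


def train_boolean_multinomial(examples):
    """Build counts from (description, account) pairs; each token counts once per description."""
    account_counts = Counter()
    token_counts = defaultdict(Counter)
    vocabulary = set()
    for description, account in examples:
        tokens = set(description.lower().split())  # Boolean attributes: presence only
        account_counts[account] += 1
        token_counts[account].update(tokens)
        vocabulary |= tokens
    return account_counts, token_counts, vocabulary


def classify_boolean_multinomial(model, description, alpha=1.0):
    """Return the account with the highest log score; absent tokens are not scored."""
    account_counts, token_counts, vocabulary = model
    tokens = set(description.lower().split()) & vocabulary  # drop unseen tokens
    total = sum(account_counts.values())
    best_account, best_score = None, -math.inf
    for account, n in account_counts.items():
        # Smoothed multinomial estimate: token count over all token counts of the account.
        denominator = sum(token_counts[account].values()) + alpha * len(vocabulary)
        score = math.log(n / total)
        for token in tokens:
            score += math.log((token_counts[account][token] + alpha) / denominator)
        if score > best_score:
            best_account, best_score = account, score
    return best_account


# Made-up training data, for illustration only.
model = train_boolean_multinomial([
    ("REWE supermarket", "Expenses:Groceries"),
    ("ALDI supermarket", "Expenses:Groceries"),
    ("Shell fuel station", "Expenses:Car"),
    ("BP fuel", "Expenses:Car"),
])
print(classify_boolean_multinomial(model, "ALDI supermarket supermarket"))
```

Repeating a token in a description (as with `supermarket` above) does not change the result, because the attributes are Boolean.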

Data Science from Scratch, 2nd Edition has a very nice chapter on Naive Bayes with a simple implementation (in Python). It seems to use Multinomial naive Bayes.

Tom Szilagyi has a very interesting implementation of Naive Bayes in his banks2ledger. There’s a detailed explanation in Payment matching done right. It is not clear which variant of Naive Bayes he implements.

lambda-ml has a simple and very concise implementation of Gaussian naive Bayes.

Robert M. Johnson has a series of very thorough articles about Naive Bayes Classifier and in particular the Bernoulli Naive Bayes Classifier (with code) and the Multinomial Naive Bayes Classifier.

Maybe the best introduction to Naive Bayes can be found in the chapter Text classification & Naive Bayes of the book Introduction to Information Retrieval, published by Cambridge University Press.