Classifiers

Caveat emptor! There are some embarrassing errors that I still need to fix; however, I think a look at the punch card will provide a complete explanation for their existence.

Three classifiers:

Bayesian classifier based on maximum-likelihood estimation
Bayesian classifier based on Parzen window estimation
Basic k-nearest neighbor rule
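
As a rough illustration of what these three rules compute, here is a minimal pure-Python sketch. It is not the classf module's code: the function names, the (label, features) training layout, and the defaults for k and h are all assumptions made for this example. The Parzen sketch assumes a Gaussian kernel, and the maximum-likelihood sketch treats the features as independent (diagonal covariance) to stay short, which is a simplification of the general Gaussian case.

```python
import math
from collections import Counter, defaultdict


def group_by_class(train):
    """Group feature vectors by label; train is a list of (label, features)."""
    groups = defaultdict(list)
    for label, x in train:
        groups[label].append(x)
    return groups


def mle_classify(train, x):
    """Bayes rule with Gaussian class densities fitted by maximum likelihood.

    Simplification (assumption): features are independent within a class.
    """
    scores = {}
    for label, points in group_by_class(train).items():
        n = len(points)
        log_post = math.log(n / len(train))  # log prior from class frequency
        for j, xj in enumerate(x):
            column = [p[j] for p in points]
            mean = sum(column) / n
            var = sum((v - mean) ** 2 for v in column) / n  # MLE divides by n
            var = max(var, 1e-9)  # guard against zero variance
            log_post += -0.5 * math.log(2 * math.pi * var)
            log_post -= (xj - mean) ** 2 / (2 * var)
        scores[label] = log_post
    return max(scores, key=scores.get)


def parzen_classify(train, x, h=1.0):
    """Bayes rule with Gaussian-kernel Parzen window class densities."""
    d = len(x)
    norm = (2 * math.pi) ** (d / 2) * h ** d  # kernel normalisation
    scores = {}
    for label, points in group_by_class(train).items():
        prior = len(points) / len(train)
        kernel_sum = sum(
            math.exp(-math.dist(p, x) ** 2 / (2 * h * h)) for p in points
        )
        scores[label] = prior * kernel_sum / (len(points) * norm)
    return max(scores, key=scores.get)


def knn_classify(train, x, k=3):
    """Basic k-NN rule: majority vote among the k closest training points."""
    nearest = sorted(train, key=lambda item: math.dist(item[1], x))[:k]
    votes = Counter(label for label, _ in nearest)
    return votes.most_common(1)[0][0]
```

Each function takes the whole training set and a single feature vector and returns the predicted label, so all three can be compared on the same data.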

This repo includes the three data sets that were used for testing (credits and information are in their respective ./bin/*_readme.txt files):

Iris
UCI Wine
Handwritten Digits

Usage

The run.py script simplifies usage: python3 run.py classifier_name path_to_training_data [-h] [-t PATH_TO_TESTING_DATA] [-c PATH_TO_CLASSIFICATION_DATA] [-v]

The possible values for classifier_name are:

mle for the MLE Bayesian classifier.
parzen (or p) for the Parzen window Bayesian classifier.
knn for the k-nearest neighbors classifier.

So, if you wanted to use the maximum-likelihood classifier on the iris data set, the command would be python3 run.py mle ./bin/iris_training.txt -t ./bin/iris_test.txt.
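
To make the usage line concrete, here is a hypothetical argparse setup that accepts the same arguments. It is only a sketch reconstructed from the usage string above, not the actual contents of run.py; the dest names and the build_parser helper are assumptions, and the real script may define long option names as well.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the documented run.py usage line."""
    parser = argparse.ArgumentParser(prog="run.py")
    # Positional arguments, in the order shown in the usage line.
    parser.add_argument("classifier_name", choices=["mle", "parzen", "p", "knn"])
    parser.add_argument("path_to_training_data")
    # Optional arguments; dest names are assumptions for this sketch.
    parser.add_argument("-t", dest="path_to_testing_data",
                        metavar="PATH_TO_TESTING_DATA")
    parser.add_argument("-c", dest="path_to_classification_data",
                        metavar="PATH_TO_CLASSIFICATION_DATA")
    parser.add_argument("-v", dest="verbose", action="store_true")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)
```

With this sketch, the iris command parses to classifier_name "mle", the training path, and the testing path, with the -c value left unset and verbose off.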

The training and testing files should have each instance on a separate line, with components separated by spaces, as in the following example:

class_number x0 x1 x2 x3
class_number x0 x1 x2 x3

For the classification data, the file should not include the class_number; each instance still goes on its own line.
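
A minimal sketch of reading both layouts is shown below. The parse_instances name and its labeled flag are hypothetical, and the sketch assumes that class_number is an integer and that every feature is numeric; it returns plain lists rather than a numpy array.

```python
def parse_instances(lines, labeled=True):
    """Parse space-separated instances, one per line.

    Returns (labels, features); labels is None when labeled is False.
    """
    labels = [] if labeled else None
    features = []
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if labeled:
            # Assumption: the first field is an integer class number.
            labels.append(int(float(fields[0])))
            fields = fields[1:]
        features.append([float(value) for value in fields])
    return labels, features
```

Because the function accepts any iterable of lines, an open file object can be passed directly, for example with open(path) as f: labels, features = parse_instances(f). Use labeled=False for classification data.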

Customization

To customize use of the classf module (e.g. to write a custom run.py script), use the module's run function. The following example demonstrates its usage:

import numpy as np
import classf

def myCustomFileParser(filepath):
    # Parse the file into a numpy array, with each instance a row in this array,
    # e.g. data[0] corresponds to the first instance's feature vector.
    # np.loadtxt is only one possible body, assuming space-separated numeric data.
    return np.loadtxt(filepath, ndmin=2)

classifier_name = 'parzen'
training_data = './training.txt'
testing_data = './testing.txt'
verbose = True

classifier = classf.run(classifier_name, training_data, testing_data, verbose, myCustomFileParser)

References, etc.

Sebastian Raschka's resources were a huge help in understanding these classifiers.
