For this project you will implement three classifiers: a decision tree, a naive Bayes classifier, and a nearest-neighbor classifier.
Somewhat later I'll post a few datasets. Run each of these algorithms on the datasets and write a report on how each one does (include accuracy, precision, recall, etc.). The report should be written in LaTeX (http://www.latex-project.org/)
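As a starting point for the report, here is a minimal sketch of how the metrics above could be computed for a binary classifier. The function name and signature are illustrative, not part of the assignment's required interface:

```python
def metrics(actual, predicted, positive=1):
    # Count true positives, false positives, and false negatives
    # for the chosen positive class.
    tp = sum(a == positive and p == positive for a, p in zip(actual, predicted))
    fp = sum(a != positive and p == positive for a, p in zip(actual, predicted))
    fn = sum(a == positive and p != positive for a, p in zip(actual, predicted))
    # Accuracy: fraction of all predictions that were correct.
    accuracy = sum(a == p for a, p in zip(actual, predicted)) / len(actual)
    # Precision: of everything predicted positive, how much was right.
    precision = tp / (tp + fp) if tp + fp else 0.0
    # Recall: of everything actually positive, how much was found.
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall
```

For example, with true labels [1, 1, 0, 0] and predictions [1, 0, 0, 1] this gives accuracy, precision, and recall of 0.5 each.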
For each of these, you will create a Python class: one called DecisionTree, one called NaiveBayes, and one called NearestNbr. Each class should have a train() method that takes as an argument a list of training vectors (the class label will be the first element of each training vector). The only other method these objects are required to have is classify(), which takes a feature vector as an argument and returns the class label.
In all of these cases, your features will be binary only (ones and zeros), which makes the branching of your decision tree easy. An example call to train():
myObj.train([[1, 0, 0, 1, 0, 1], [0, 1, 1, 1, 1, 1]])
The first element here is the class label; the rest are features. A call to classify() would look like this:
myObj.classify([0, 0, 1, 0, 1])
which should return 1.
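To make the required interface concrete, here is a minimal sketch of one of the three classes, NearestNbr, using Hamming distance on the binary features (the distance choice is an illustrative assumption, not something the assignment mandates):

```python
class NearestNbr:
    def train(self, vectors):
        # Each training vector is [class_label, f1, f2, ...];
        # store (label, features) pairs for later lookup.
        self.examples = [(v[0], v[1:]) for v in vectors]

    def classify(self, features):
        # Hamming distance: number of binary features that differ.
        def hamming(a, b):
            return sum(x != y for x, y in zip(a, b))
        # Return the label of the closest training example.
        return min(self.examples, key=lambda e: hamming(e[1], features))[0]
```

With this sketch, the example call above behaves as described: the query [0, 0, 1, 0, 1] exactly matches the features of the class-1 training vector, so classify() returns 1.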
For reference, naive Bayes is a straightforward application of Bayes' rule to classification: pick the class that maximizes P(class) times the product of P(feature | class) over all features, treating the features as conditionally independent given the class.
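That rule can be sketched for binary features as below. The add-one (Laplace) smoothing in classify() is an assumption on my part, added so that a feature value never seen in training doesn't zero out the whole product:

```python
from collections import defaultdict

class NaiveBayes:
    def train(self, vectors):
        # Count how often each class appears, and how often each
        # (feature index, feature value) pair appears per class.
        self.class_counts = defaultdict(int)
        self.feature_counts = defaultdict(lambda: defaultdict(int))
        for v in vectors:
            label, feats = v[0], v[1:]
            self.class_counts[label] += 1
            for i, f in enumerate(feats):
                self.feature_counts[label][(i, f)] += 1

    def classify(self, features):
        # argmax over classes of P(class) * prod_i P(f_i | class),
        # with add-one smoothing (binary features, hence the +2).
        total = sum(self.class_counts.values())
        best, best_p = None, -1.0
        for label, n in self.class_counts.items():
            p = n / total
            for i, f in enumerate(features):
                p *= (self.feature_counts[label][(i, f)] + 1) / (n + 2)
            if p > best_p:
                best, best_p = label, p
        return best
```

On the two-vector example above, this also classifies [0, 0, 1, 0, 1] as class 1.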
Some example files (the first element in each vector is the class label):