Features
  tf-idf: text feature extraction def: scikit / tf-idf
  Vectorization and Features in sci-kit (ipynb)

Regression
  Linear Regression Models & Methods
  cosine similarity for vector space models
  Bag of Words (used in Bayesian Spam Filtering, document term frequency models)

Classification
  An introduction to Classification / slides

Statistical Definitions
  Accuracy & Precision (relevant to classification, measuring efficacy of learned models)
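
As a rough illustration of how the bag-of-words, tf-idf, and cosine similarity entries above relate to one another, here is a minimal pure-Python sketch. It is not taken from any of the linked resources. The function names (`bag_of_words`, `tf_idf`, `cosine_similarity`) and the sample documents are made up for this example, and the weighting used (term count divided by document length, times log(N / df)) is only one common textbook variant; libraries such as scikit-learn apply a smoothed formula, so their numbers will differ.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Lowercase the text, split on whitespace, and count each term."""
    return Counter(text.lower().split())


def tf_idf(docs):
    """Return one {term: weight} dict per document.

    Assumed weighting: tf = count / document length, idf = log(N / df).
    """
    bags = [bag_of_words(d) for d in docs]
    n_docs = len(bags)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for bag in bags for term in bag)
    vectors = []
    for bag in bags:
        total = sum(bag.values())
        vectors.append(
            {t: (c / total) * math.log(n_docs / df[t]) for t, c in bag.items()}
        )
    return vectors


def cosine_similarity(u, v):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical sample documents, chosen only for illustration.
docs = [
    "cheap pills buy cheap pills now",
    "buy cheap pills online",
    "meeting notes for the project",
]
vectors = tf_idf(docs)
sim_spam = cosine_similarity(vectors[0], vectors[1])   # share several terms
sim_other = cosine_similarity(vectors[0], vectors[2])  # share no terms
print(sim_spam, sim_other)
```

The first two documents share terms and so get a positive similarity, while the first and third share none and score zero. The same sparse vectors could serve as input features for a classifier such as the Bayesian spam filter mentioned above.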