- Week 1: Data acquisition: Web scraping, Calling Internet APIs
- Week 2: Linear Regression: Multivariate linear regression, Polynomial regression, Regularization (Lasso, Ridge), Cross-validation, Train-test split, MAE, MSE
- Week 3: Classification 1: Logistic regression, Accuracy, Confusion Matrix, Precision, Recall, F1-score
- Week 4: Classification 2: KNN Classifier, Decision Trees
- Week 5: Clustering: K-Means, Hierarchical clustering, Dendrogram
- Week 6: Association Rules: Association rule mining, Apriori algorithm
- Week 7: Recommender systems: User-User Collaborative Filtering (from scratch and using Surprise library), Mean-centered cosine similarity, Precision and Recall at rank k, Precision-recall curve
- Week 8: Text analytics: Text preparation (Tokenization, Lemmatization, Stopwords), Text representation (Bag of Words, TF-IDF), Text structure (Dependency Parsing, Entity recognition), Text similarity (cosine similarity)
- Week 9: Text analytics 2: Text embeddings, Bag of Words, TF-IDF, Word2vec, application to text classification
- Week 10: Neural Networks: Building NN models with PyTorch, Using existing models from Hugging Face

For the project, you will work with Git and GitHub. The following documentation may be useful: