Data Mining on Anti-Churn Dataset

The goal of this project is to extract information that creates value and supports decision-making. It takes a data mining approach: a set of techniques and methodologies for extracting useful information from large amounts of data (e.g., databases, data warehouses) through automated or semi-automated methods (e.g., machine learning), and for putting that information to scientific, business, industrial, or operational use.

The problem was framed as a binary classification task and addressed with feature selection, formal concept analysis, cross-validation, undersampling, and oversampling. This made it possible to address the problems of high data dimensionality and class imbalance. Random Forest, Logistic Regression, Multi-Layer Perceptron, and Gaussian Naive Bayes models were implemented, and their performance was evaluated in terms of Accuracy, Precision, Recall, F-measure, and AUC. Finally, the quantile method was applied as a further evaluation of the best-performing model.
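
As the dataset and project code are not shown here, the following is only a minimal illustrative sketch in plain Python of two of the ideas mentioned above: random undersampling of the majority class, and the evaluation metrics for a binary classifier. The function names (`random_undersample`, `binary_metrics`, `roc_auc`) and the choice of label `1` for the positive (churn) class are assumptions made for this example, not the project's actual implementation.

```python
import random


def random_undersample(X, y, seed=0):
    """Randomly drop rows until every class has as many rows as the smallest one."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(X, y):
        by_class.setdefault(label, []).append(row)
    n_min = min(len(rows) for rows in by_class.values())
    X_bal, y_bal = [], []
    for label, rows in by_class.items():
        for row in rng.sample(rows, n_min):
            X_bal.append(row)
            y_bal.append(label)
    return X_bal, y_bal


def binary_metrics(y_true, y_pred):
    """Accuracy, Precision, Recall and F-measure, with 1 as the positive class (assumption)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_measure = (
        2 * precision * recall / (precision + recall) if precision + recall else 0.0
    )
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


def roc_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy data for illustration only: 2 positives, 8 negatives.
    X = [[i] for i in range(10)]
    y = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
    X_bal, y_bal = random_undersample(X, y)
    print(sorted(y_bal))
    print(binary_metrics([1, 1, 1, 0, 0, 0, 0, 0], [1, 1, 0, 1, 0, 0, 0, 0]))
    print(roc_auc([1, 0, 1, 0], [0.9, 0.1, 0.4, 0.6]))
```

In practice, resampling of this kind is usually applied only to the training folds inside cross-validation, so that the evaluation folds keep the original class distribution.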

For privacy reasons, the dataset cannot be published.