Data Mining on Anti-Churn Dataset

The goal of the project is to extract information that creates value for decision support. It uses a data mining approach: a set of techniques and methodologies for extracting useful information from large amounts of data (e.g., databases and data warehouses) through automated or semi-automated methods (e.g., machine learning), and for putting that information to scientific, business, industrial, or operational use.

The problem was framed as binary classification and addressed with feature selection, formal concept analysis, cross-validation, undersampling, and oversampling, which made it possible to deal with the high dimensionality of the data and the class imbalance. The models implemented were Random Forest, Logistic Regression, Multi-Layer Perceptron, and Gaussian Naive Bayes; performance was evaluated in terms of Accuracy, Precision, Recall, F-measure, and AUC. Finally, the quantile method was applied as a further evaluation of the best-performing model.
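
As an illustration of the resampling and evaluation steps named above, here is a minimal sketch in plain Python. It is not the project's code: the function names (`random_undersample`, `random_oversample`, `binary_metrics`) and the toy labels are hypothetical, and simple random resampling is assumed because the specific undersampling and oversampling techniques used in the project are not stated here.

```python
import random
from collections import Counter


def _group_by_class(X, y):
    """Group feature rows by their class label."""
    by_class = {}
    for row, label in zip(X, y):
        by_class.setdefault(label, []).append(row)
    return by_class


def random_undersample(X, y, seed=0):
    """Randomly drop rows until every class has the minority-class count."""
    rng = random.Random(seed)
    by_class = _group_by_class(X, y)
    n_min = min(len(rows) for rows in by_class.values())
    X_out, y_out = [], []
    for label, rows in by_class.items():
        for row in rng.sample(rows, n_min):
            X_out.append(row)
            y_out.append(label)
    return X_out, y_out


def random_oversample(X, y, seed=0):
    """Randomly duplicate rows until every class has the majority-class count."""
    rng = random.Random(seed)
    by_class = _group_by_class(X, y)
    n_max = max(len(rows) for rows in by_class.values())
    X_out, y_out = [], []
    for label, rows in by_class.items():
        extra = [rng.choice(rows) for _ in range(n_max - len(rows))]
        for row in rows + extra:
            X_out.append(row)
            y_out.append(label)
    return X_out, y_out


def binary_metrics(y_true, y_pred, positive=1):
    """Compute Accuracy, Precision, Recall and F-measure for a binary task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_measure = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


# Hypothetical toy data: 10 churners (label 1) among 100 customers.
X_toy = [[i] for i in range(100)]
y_toy = [1] * 10 + [0] * 90

_, y_under = random_undersample(X_toy, y_toy)
_, y_over = random_oversample(X_toy, y_toy)
print(Counter(y_under), Counter(y_over))
```

When resampling is combined with cross-validation, it should be applied only to the training portion of each fold; resampling before the split leaks duplicated or selected rows into the validation data and inflates the reported metrics.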

For privacy reasons, the dataset cannot be published.
