The domain we chose for our SDL mini-project is data analysis. Data analysis is the process of cleaning, transforming, and modeling data to discover information that is useful for business decision-making. Its purpose is to extract that information from the data and to base decisions on it.
Sales forecasting is important for companies engaged in retailing, logistics, manufacturing, marketing and wholesaling. It allows a company to allocate resources efficiently, to estimate achievable sales revenue and to plan a better strategy for future growth. In this paper, the sales of a product from a particular outlet are predicted with a two-level approach that gives better predictive performance than any of the popular single-model learning algorithms. The approach is applied to the Big Mart Sales data for the year 2013. Data exploration, data transformation and feature engineering play a vital role in producing accurate predictions. The results show that the two-level statistical approach performed better than a single-model approach, because the former provides more information, which leads to better prediction.
In this project, we analyze a real-world dataset and explore how machine learning algorithms can be used to find patterns in data. We were expected to gain experience with common machine learning libraries and to predict sales for the chosen dataset. This final project report presents our work after performing the required tasks on a dataset of our choice.
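The report does not spell out how the two levels are wired together, so the following is only a minimal sketch of one common reading, stacking: two simple first-level models are fit on one part of the data, and a second-level model learns how to weight their predictions. Everything in it is an assumption made for illustration: the synthetic rows stand in for the Big Mart data, and the column names (`item_mrp`, `outlet_type`), the two first-level models and all function names are hypothetical.

```python
import random
from statistics import mean


def make_synthetic_rows(n, seed=0):
    """Hypothetical stand-in for sales records: (item_mrp, outlet_type, sales)."""
    rng = random.Random(seed)
    outlet_effect = {"grocery": 0.4, "supermarket": 1.0}
    rows = []
    for _ in range(n):
        mrp = rng.uniform(30.0, 250.0)
        outlet = rng.choice(sorted(outlet_effect))
        sales = 12.0 * mrp * outlet_effect[outlet] + rng.gauss(0.0, 150.0)
        rows.append((mrp, outlet, sales))
    return rows


def fit_linear(rows):
    """Level 1, model A: least-squares line of sales on item_mrp."""
    xs = [r[0] for r in rows]
    ys = [r[2] for r in rows]
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return lambda row: intercept + slope * row[0]


def fit_outlet_mean(rows):
    """Level 1, model B: mean sales per outlet type."""
    by_outlet = {}
    for _, outlet, sales in rows:
        by_outlet.setdefault(outlet, []).append(sales)
    means = {k: mean(v) for k, v in by_outlet.items()}
    overall = mean(r[2] for r in rows)
    return lambda row: means.get(row[1], overall)


def fit_blend(preds_a, preds_b, targets):
    """Level 2: least-squares weights for w_a * a + w_b * b (no intercept)."""
    saa = sum(a * a for a in preds_a)
    sbb = sum(b * b for b in preds_b)
    sab = sum(a * b for a, b in zip(preds_a, preds_b))
    say = sum(a * y for a, y in zip(preds_a, targets))
    sby = sum(b * y for b, y in zip(preds_b, targets))
    det = saa * sbb - sab * sab
    w_a = (say * sbb - sby * sab) / det
    w_b = (sby * saa - say * sab) / det
    return w_a, w_b


def rmse(preds, targets):
    """Root mean squared error between predictions and targets."""
    return (sum((p - y) ** 2 for p, y in zip(preds, targets)) / len(targets)) ** 0.5


rows = make_synthetic_rows(600)
level1_rows, level2_rows = rows[:400], rows[400:]

# Level 1: fit both base models on the first split only.
model_a = fit_linear(level1_rows)
model_b = fit_outlet_mean(level1_rows)

# Level 2: learn blend weights from base-model predictions on unseen rows.
targets = [r[2] for r in level2_rows]
preds_a = [model_a(r) for r in level2_rows]
preds_b = [model_b(r) for r in level2_rows]
w_a, w_b = fit_blend(preds_a, preds_b, targets)
stacked = [w_a * a + w_b * b for a, b in zip(preds_a, preds_b)]

rmse_a = rmse(preds_a, targets)
rmse_b = rmse(preds_b, targets)
rmse_stacked = rmse(stacked, targets)
print(f"linear: {rmse_a:.1f}  outlet mean: {rmse_b:.1f}  stacked: {rmse_stacked:.1f}")
```

In this sketch the stacked error is measured on the same rows that were used to fit the blend weights, so it can never be worse than either base model there and is an optimistic estimate; a fair comparison would score all three on a further held-out split. With scikit-learn, `sklearn.ensemble.StackingRegressor` offers a library version of the same idea with cross-validated first-level predictions.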
- Mayank Kumar
- Apoorva Verma
- Deepanshu Singh
- Abhishek Pandey
Dataset from Kaggle’s official site
Study of machine learning basics
Sales forecasting using machine learning
Sales prediction using simple regression model
Sales forecasting using random forest machine learning algorithm