recommendation_system

Xccelerate Data Science Bootcamp Collaborative Project: 4 flavours of recommendation systems built on the Book-Crossing Dataset, which is also included in this repo.

See the project's details here

made-with-python python versions MIT license

How to Use this repo

  1. Clone this repo:
$ git clone https://github.com/ohjho/recommendation_system.git
$ cd recommendation_system
  2. Install the requirements. We highly recommend doing this inside a virtualenv to avoid dependency hell.
#---------------- optional ------------------
$ mkvirtualenv --python=`which python3` NameOfYourEnv
$ workon NameOfYourEnv
#--------------------------------------------

(NameOfYourEnv) $ pip install -r requirements.txt

Then run pip check and resolve any package dependency issues it reports. It should say No broken requirements found.

  3. Start Jupyter notebook:
$ jupyter notebook

Data Cleaning

How to use data_cleaning.py

The script data_cleaning.py will import the datasets and clean the data.

To get 3 separate dataframes, do this:

from data_cleaning import get_clean_data
df_books, df_users, df_ratings = get_clean_data()

If the CSV files are not under data/, use the path argument.

To get one merged dataframe, do this:

from data_cleaning import get_merged_data_frame
df_merged = get_merged_data_frame(user_argv=user_threshold, isbn_argv=book_threshold)

where user_threshold is the minimum number of books a user must have rated; users with fewer ratings are filtered out. book_threshold is the counterpart for books. If the CSV files are not under data/, use the path argument.
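To make the two thresholds concrete, here is a minimal standalone sketch of this kind of filtering. It is not the code in data_cleaning.py: the helper name filter_by_thresholds, the tuple layout, and the choice to compute both counts on the unfiltered data are assumptions made for illustration, and it uses only the standard library rather than dataframes.

```python
from collections import Counter


def filter_by_thresholds(ratings, user_threshold, book_threshold):
    """Keep ratings whose user and book both meet the minimum rating counts.

    ratings is a list of (user_id, isbn, rating) tuples (hypothetical layout).
    Counts are taken once, on the unfiltered input (an assumption).
    """
    user_counts = Counter(user_id for user_id, _, _ in ratings)
    book_counts = Counter(isbn for _, isbn, _ in ratings)
    return [
        (user_id, isbn, rating)
        for user_id, isbn, rating in ratings
        if user_counts[user_id] >= user_threshold
        and book_counts[isbn] >= book_threshold
    ]


# Toy data: u1 rated two books, b1 was rated by two users.
ratings = [
    ("u1", "b1", 8),
    ("u1", "b2", 6),
    ("u2", "b1", 9),
    ("u3", "b3", 5),
]

kept = filter_by_thresholds(ratings, user_threshold=2, book_threshold=2)
print(kept)
```

With both thresholds set to 2, only the rating by u1 on b1 survives, because every other row involves either a user or a book with a single rating. Raising either threshold trades coverage for denser, more reliable rating data.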

Modeling

Presentation

Google Slides