nnrecommend

Installation

To install, run the following from the root project directory (ideally activate a virtualenv first).

# replace cu111 with the specific cuda version your machine supports
pip install torch==1.8.1+cu111 -f https://download.pytorch.org/whl/torch_stable.html
pip install -q torch-scatter -f https://pytorch-geometric.com/whl/torch-1.8.0+cu111.html
pip install -q torch-sparse -f https://pytorch-geometric.com/whl/torch-1.8.0+cu111.html
pip install -e ./
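
To verify that the install picked up a CUDA-enabled torch build, you can run this optional check (not part of the project itself):

python -c "import torch; print(torch.__version__, torch.cuda.is_available())"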

Once installed, make sure you have the python environment scripts directory in your path.
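
If that directory is missing from your path, you can prepend it manually; for example on Linux, assuming a virtualenv created at .venv in the project root (a hypothetical location):

export PATH="$PWD/.venv/bin:$PATH"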

Datasets

  • movielens MovieLens dataset from Kaggle
  • podcasts iTunes podcasts dataset from Kaggle
  • spotify Spotify skip prediction challenge dataset (preprocessed data from the original dataset)
  • spotify-mini Spotify skip prediction challenge mini dataset (can load the downloaded files directly)

Models

  • fm-linear factorization machine with linear embedding
  • fm-gcn factorization machine with graph embedding
  • fm-gcn-att factorization machine with graph embedding and attention

Hyperparameters

  • max_interactions how many interactions to load from the dataset (-1 for all)
  • negatives_train how many negative samples to add to the train dataset (-1 for all)
  • negatives_test how many negative samples to add to the test dataset (-1 for all)
  • batch_size batch size of the training data loader
  • epochs number of epochs to run
  • embed_dim dimension of the hidden state of the embedding
  • embed_dropout dropout value for the embedding
  • learning_rate learning rate of the optimizer
  • lr_scheduler_factor lr factor for the plateau lr scheduler (default 1, which disables the scheduler)
  • lr_scheduler_patience number of fixed epochs for the plateau lr scheduler
  • lr_scheduler_threshold threshold for the plateau lr scheduler
  • graph_attention_heads number of heads in the GCN attention model
  • pairwise_loss whether to build the training set with pairs of positive-negative interactions (default True)
  • train_loader_workers number of workers for the train loader
  • test_loader_workers number of workers for the test loader
  • interaction_context comma-separated context rows to add (the default all adds every available context)
  • recommend enable recommend mode (see Recommend Items below)

Supported context values are previous and skip; their availability depends on the dataset. Additionally you can set interaction_context:random to test with a random context; this is used to confirm that the factorization machine is correctly implemented and does not improve when random context is added.
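
For example, to train with both contexts you could combine the train command with the --hparam flag documented below (a sketch; whether both contexts are available depends on the dataset):

nnrecommend train --dataset spotify data/spotify.csv --hparam interaction_context:previous,skip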

Subcommands

Details for each subcommand are provided below.

  • train train a model on a dataset
  • fit fit a dataset using a surprise algorithm
  • tune tune model hyperparameters using ray tune
  • explore-dataset show information about a dataset
  • recommend load a trained model to get recommendations

Command Line Interface

Once the package is installed and the python bin path is in your system path, run the following to see the available actions and parameters:

nnrecommend --help

Hyperparameters

Hyperparameters can be passed with --hparam name:value (add the argument multiple times to set multiple hyperparameters), or with --hparams-path hparams.json to load the parameters from a JSON dictionary.
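
For example, a sketch setting two of the hyperparameters listed above:

nnrecommend train --dataset movielens data/ml-100k/ --hparam embed_dim:32 --hparam batch_size:1024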

The format of the hparams.json file can be a simple dictionary:

{
    "embed_dim": 32,
    "batch_size": 1024
}

or a dictionary with a trials list if you want to run multiple trainings one after the other:

{
    "common": {
        "embed_dim": 32
    },
    "trials": [
        {
            "batch_size": 1024
        },
        {
            "batch_size": 512
        }
    ]
}
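
Either file can then be passed to a training run; for example (the file name is arbitrary):

nnrecommend train --dataset movielens data/ml-100k/ --hparams-path hparams.json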

Training

This command allows you to train a model.

nnrecommend train --dataset movielens-lab data/ml-dataset-splitted/movielens
nnrecommend train --dataset movielens data/ml-100k/
nnrecommend train --dataset podcasts data/database.sqlite
nnrecommend train --dataset spotify data/spotify.csv

To select the model:

nnrecommend train --dataset spotify data/spotify.csv --model fm-gcn

To create a tensorboard directory:

nnrecommend train --dataset spotify data/spotify.csv --tensorboard tbdir

Then you can run the tensorboard server on that directory:

tensorboard --logdir tbdir

Fitting

This command allows you to fit an algorithm to a dataset and get test values.

nnrecommend fit --dataset spotify data/spotify.csv --algorithm knn --algorithm baseline

This command also supports the --tensorboard parameter and will draw horizontal lines with the test values for every algorithm.
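
For example, a sketch combining both flags:

nnrecommend fit --dataset spotify data/spotify.csv --algorithm knn --algorithm baseline --tensorboard tbdir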

Tuning

This command runs hyperparameter tuning with a given dataset and model. We use ray.tune for this task.

nnrecommend tune --dataset spotify data/spotify.csv --model fm-linear --config tune_config.json

The command accepts the tune config as a JSON file containing a dictionary where the keys are the hyperparameter names and the values describe the ray.tune methods used to sample the possible values. Check the ray.tune documentation for all the available methods.

{
    "learning_rate": ["qloguniform", 1e-4, 1e-1, 5e-4],
    "embed_dropout": ["choice", [0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]],
    "batch_size": ["lograndint", 128, 2048],
    "graph_attention_heads": ["randint", 1, 12]
}
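
Each entry names a ray.tune sampling method followed by its arguments; for reference, the config above corresponds roughly to the following Python (a sketch, not code from this project):

from ray import tune

# equivalent ray.tune search space for the JSON config above
config = {
    "learning_rate": tune.qloguniform(1e-4, 1e-1, 5e-4),
    "embed_dropout": tune.choice([0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]),
    "batch_size": tune.lograndint(128, 2048),
    "graph_attention_heads": tune.randint(1, 12),
}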

While tuning runs you can watch the progress by starting a tensorboard server on the ~/ray_results folder.

tensorboard --logdir ~/ray_results

Explore Dataset

This command shows some graphs about the dataset. For every user, item or context pair it shows:

  • histogram of the counts
  • spy graph of the adjacency matrix

nnrecommend explore-dataset data/ml-100k --type movielens

Recommend

This command shows recommendations for a given label.

If you store the trained model, it can show recommendations for existing users.

nnrecommend --hparam interaction_context: train data/movielens-100k --dataset movielens --output movielens.pth
nnrecommend recommend movielens.pth --label 300 --user-items 3

This will print information about user 300 and then find items to recommend to them.

Recommend Items

If you train with the recommend hyperparameter enabled, the dataset will be modified so that the model trains to recommend items to new users by:

  • removing the user column
  • creating the previous item context
  • switching the items and previous item context columns
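
As a rough illustration of the transform (hypothetical rows, not actual dataset output):

# before: (user, item)           e.g. (1, A), (1, B)  user 1 interacted with A, then B
# after:  (previous item, item)  e.g. (A, B)          the model learns item-to-item recommendations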

If you store the trained model by passing the --output parameter, you can use the recommend subcommand to get recommendations for new items.

nnrecommend --hparam recommend:1 --hparam interaction_context: train data/movielens-100k --dataset movielens --output movielens_recommend.pth
nnrecommend recommend movielens_recommend.pth --label "star wars"