Given a directed social graph, the task is to predict missing links so that users can be recommended to one another (link prediction in a graph).
@author : Akash Kumar
https://www.linkedin.com/in/akash-kumar-9b87b5148/
The code for the benchmarks may be downloaded from GitHub.
There are 5 files:
train.csv contains the directed social graph, represented in a 2-column csv (source_node, destination_node)
test.csv contains a list of nodes to recommend other nodes to in a 1-column csv (source_node)
bfs_benchmark.csv contains a sample submission generated from bfs_benchmark.py
top_k_benchmark.csv contains a sample submission generated from top_k_benchmark.py
random_benchmark.csv contains a sample submission generated from random_benchmark.py
The data is taken from Facebook's recruiting challenge on Kaggle. It contains two columns, source and destination, with one row for each edge in the graph. Data columns (total 2 columns):
source_node int64
destination_node int64
Type: DiGraph
Number of nodes: 1862220
Number of edges: 9437519
Average in degree: 5.0679
Average out degree: 5.0679
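The statistics above can be reproduced from the edge list alone. The following is a minimal sketch, assuming the file uses the 2-column layout described earlier (with or without a header row); the function name graph_stats and the sample data are made up for illustration and are not part of the original notebook. It also shows why the average in degree and average out degree are identical: every edge adds exactly one to each, so both equal edges / nodes (9437519 / 1862220 ≈ 5.0679).

```python
import csv
import io


def graph_stats(edge_file):
    """Count nodes, edges, and average degrees from a (source, destination) CSV."""
    nodes = set()
    edges = set()  # a set drops duplicate edges, as a simple directed graph would
    for row in csv.reader(edge_file):
        # Skip blank lines and a header row, if present.
        if not row or not row[0].strip().isdigit():
            continue
        src, dst = int(row[0]), int(row[1])
        nodes.update((src, dst))
        edges.add((src, dst))
    n, m = len(nodes), len(edges)
    # Each edge contributes one out-degree and one in-degree,
    # so both averages are m / n.
    avg = m / n if n else 0.0
    return {"nodes": n, "edges": m, "avg_in_degree": avg, "avg_out_degree": avg}


# Tiny made-up sample in the same 2-column layout as train.csv.
sample = io.StringIO("source_node,destination_node\n1,2\n1,3\n2,3\n3,1\n")
stats = graph_stats(sample)
print(stats)
```

For the real data, the same function can be called on an open file handle, for example open("train.csv", newline="").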
Performance metric for supervised learning:
- Both precision and recall are important, so the F1 score is a good choice
- Confusion matrix
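As a rough illustration of these two metrics, the sketch below computes the confusion matrix counts and the F1 score by hand. It assumes link prediction is framed as binary classification, with label 1 meaning "edge exists" and 0 meaning "no edge"; that encoding, the function names, and the sample labels are assumptions made for this example, not taken from the notebook.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels where 1 = edge, 0 = no edge."""
    tp = fp = fn = tn = 0
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == 1:
            tp += 1
        elif t == 0 and p == 1:
            fp += 1
        elif t == 1 and p == 0:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def precision_recall_f1(tp, fp, fn):
    """F1 is the harmonic mean of precision and recall."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Made-up labels for eight candidate node pairs.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
precision, recall, f1 = precision_recall_f1(tp, fp, fn)
print("confusion matrix [[tn, fp], [fn, tp]]:", [[tn, fp], [fn, tp]])
print("precision:", precision, "recall:", recall, "f1:", f1)
```

Because F1 drops when either precision or recall is low, it penalises a model that recommends many wrong links as well as one that misses many real ones.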
Download the data from the link: https://www.kaggle.com/c/FacebookRecruiting/data
Download the .ipynb notebook file and run it.