Merge pull request #22 from anyscale/recsys-paco
recsys updates
deanwampler authored Sep 3, 2020
2 parents 35bcdce + 4c76ca3 commit fb6eb6a
Showing 4 changed files with 48,516 additions and 19 deletions.
93 changes: 76 additions & 17 deletions ray-rllib/recsys/01-Recsys.ipynb
@@ -17,7 +17,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"The full source code for this example recommender system is also in `recsys.py`. You can run it with the default settings, e.g., to exercise the code, use the following command:\n",
"For reference, the GitHub public repo for this code is available at <https://github.com/anyscale/academy/blob/master/ray-rllib/recsys> and full source code for this example recommender system is also in the `recsys.py` script. You can run that with default settings to exercise the code:\n",
"\n",
"```shell\n",
"python recsys.py\n",
@@ -108,9 +108,10 @@
"outputs": [],
"source": [
"from pathlib import Path\n",
"import os\n",
"import pandas as pd\n",
"\n",
"DATA_PATH = Path(\"jester-data-1.csv\")\n",
"DATA_PATH = Path(os.getcwd()) / Path(\"jester-data-1.csv\")\n",
"sample = load_data(DATA_PATH)\n",
"\n",
"df = pd.DataFrame(sample)\n",
@@ -212,12 +213,12 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"That plot shows a \"knee\" in the curve at `k=12` where the decrease in error begins to level out. That's a reasonable number of clusters, such that each cluster will tend to have ~8% of the items. That choice has an inherent trade-off:\n",
"This kind of cluster analysis has stochastic aspects, so results may differ on different runs. Generally, the plot shows a \"knee\" in the curve near `k=7` as the decrease in error begins to level out. That's a reasonable number of clusters, such that each cluster will tend to have ~14% of the items. That choice has an inherent trade-off:\n",
"\n",
" * too few clusters → poor predictions (less accuracy)\n",
" * too many clusters → poor predictive power (less recall)\n",
"\n",
"Now we can run K-means in `scikit-learn` with that hyperparameter `k=12` to get the clusters that we'll use in our RL environment:"
"Now we can run K-means in `scikit-learn` with that hyperparameter `k=7` to get the clusters that we'll use in our RL environment:"
]
},
{
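The elbow analysis referenced above is straightforward to reproduce: sweep candidate `k` values and plot the K-means inertia (within-cluster sum of squared errors), looking for where the curve flattens. A minimal sketch, assuming the feature matrix `X` prepared in the earlier cells:

```python
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans

# sweep k and record inertia; the "knee" where the decrease levels out
# suggests a reasonable cluster count (near k=7 on most runs here)
ks = range(2, 20)
inertias = [KMeans(n_clusters=k).fit(X).inertia_ for k in ks]

plt.plot(list(ks), inertias, marker="o")
plt.xlabel("k (number of clusters)")
plt.ylabel("inertia (within-cluster SSE)")
plt.show()
```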
@@ -226,7 +227,7 @@
"metadata": {},
"outputs": [],
"source": [
"K_CLUSTERS = 12\n",
"K_CLUSTERS = 7\n",
"\n",
"km = KMeans(n_clusters=K_CLUSTERS)\n",
"km.fit(X)\n",
@@ -279,7 +280,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"BTW, let's take a look at the top three clusters from this analysis…"
"BTW, let's take a look at the K clusters from this analysis…"
]
},
{
@@ -290,31 +291,59 @@
"source": [
"plt.scatter(\n",
" X[y_km == 0, 0], X[y_km == 0, 1],\n",
" s=100, c=\"lightgreen\",\n",
" s=50, c=\"lightgreen\",\n",
" marker=\"s\", edgecolor=\"black\",\n",
" label=\"cluster 1\"\n",
" label=\"cluster 0\"\n",
")\n",
"\n",
"plt.scatter(\n",
" X[y_km == 1, 0], X[y_km == 1, 1],\n",
" s=100, c=\"orange\",\n",
" s=50, c=\"orange\",\n",
" marker=\"o\", edgecolor=\"black\",\n",
" label=\"cluster 2\"\n",
" label=\"cluster 1\"\n",
")\n",
"\n",
"plt.scatter(\n",
" X[y_km == 2, 0], X[y_km == 2, 1],\n",
" s=100, c=\"lightblue\",\n",
" s=50, c=\"lightblue\",\n",
" marker=\"v\", edgecolor=\"black\",\n",
" label=\"cluster 2\"\n",
")\n",
"\n",
"plt.scatter(\n",
" X[y_km == 3, 0], X[y_km == 3, 1],\n",
" s=50, c=\"blue\",\n",
" marker=\"^\", edgecolor=\"black\",\n",
" label=\"cluster 3\"\n",
")\n",
"\n",
"plt.scatter(\n",
" X[y_km == 4, 0], X[y_km == 4, 1],\n",
" s=50, c=\"yellow\",\n",
" marker=\"<\", edgecolor=\"black\",\n",
" label=\"cluster 4\"\n",
")\n",
"\n",
"plt.scatter(\n",
" X[y_km == 5, 0], X[y_km == 5, 1],\n",
" s=50, c=\"purple\",\n",
" marker=\">\", edgecolor=\"black\",\n",
" label=\"cluster 5\"\n",
")\n",
"\n",
"plt.scatter(\n",
" X[y_km == 6, 0], X[y_km == 6, 1],\n",
" s=50, c=\"brown\",\n",
" marker=\"X\", edgecolor=\"black\",\n",
" label=\"cluster 6\"\n",
")\n",
"\n",
"# plot the centroids\n",
"plt.scatter(\n",
" km.cluster_centers_[:, 0], km.cluster_centers_[:, 1],\n",
" s=250, marker=\"*\",\n",
" c=\"red\", edgecolor=\"black\",\n",
" label=\"centroids\"\n",
" label=\"centers\"\n",
")\n",
"\n",
"plt.legend(scatterpoints=1)\n",
@@ -326,7 +355,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Not bad. Those clusters show some separation, at least along those three most dominant dimensions."
"Not bad, based on the centers these clusters show some separationat least when we plot in 2 dimensions."
]
},
{
@@ -830,7 +859,7 @@
"metadata": {},
"outputs": [],
"source": [
"TRAIN_ITER = 30\n",
"TRAIN_ITER = 20\n",
"\n",
"df = pd.DataFrame(columns=[ \"min_reward\", \"avg_reward\", \"max_reward\", \"steps\", \"checkpoint\"])\n",
"status = \"reward {:6.2f} {:6.2f} {:6.2f} len {:4.2f} saved {}\"\n",
@@ -940,7 +969,7 @@
"AGENT.restore(BEST_CHECKPOINT)\n",
"history = []\n",
"\n",
"for episode_reward in run_rollout(AGENT, env, n_iter=100, verbose=False):\n",
"for episode_reward in run_rollout(AGENT, env, n_iter=500, verbose=False):\n",
" history.append(episode_reward)\n",
" \n",
"print(\"average reward:\", round(sum(history) / len(history), 3))"
@@ -984,7 +1013,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## Evaluate Learning with TensorBoard\n",
"## Evaluate learning with TensorBoard\n",
"\n",
"You also can run [TensorBoard](https://www.tensorflow.org/tensorboard) to visualize the RL training metrics from the log files. The results during training were written to a directory under `$HOME/ray_results`\n",
"\n",
@@ -1003,6 +1032,36 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## Exercises\n",
"\n",
"For the exercises, there are several ways to modify the Gym environment or the RLlib training parameters, then compare how the outcomes differ:\n",
"\n",
" 1. Re-run using smaller and larger K values\n",
" 2. Adjust the rewards for depleted and unrated actions\n",
" 3. Increase the number of training iterations\n",
" 4. Compare use of the other dataset partitions during rollout: `\"jester-data-2.csv\"` or `\"jester-data-3.csv\"`\n",
"\n",
"For each of these variations compare:\n",
"\n",
" * baseline with random actions \n",
" * baseline with the naïve strategy\n",
" * predicted average reward from training\n",
" * stats from the rollout\n",
"\n",
"Let's discuss the results as a group.\n",
"\n",
"Other questions to discuss:\n",
"\n",
" 1. In what ways could the \"warm start\" be improved?\n",
" 2. How could this code be modified to scale to millions of users? Or to thousands of items?"
]
},
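For exercise 2, the reward handling lives in the environment's `step()` method. A hypothetical fragment showing the kind of constants to adjust; the names here are illustrative, and the real logic in `recsys.py` may be structured differently:

```python
# hypothetical reward constants for exercise 2; the real environment in
# recsys.py defines its own names and structure
REWARD_DEPLETED = -0.1  # recommended a cluster whose items are used up
REWARD_UNRATED = 0.0    # recommended an item the simulated user never rated

def reward_for(action, user_ratings, depleted_clusters):
    """Illustrative reward rule: penalize depleted picks, give a neutral
    reward for unrated items, otherwise return the scaled rating."""
    if action in depleted_clusters:
        return REWARD_DEPLETED
    rating = user_ratings.get(action)
    if rating is None:
        return REWARD_UNRATED
    return rating  # rating scaled into [-1, 1]
```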
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Clean up\n",
"\n",
"Finally, let's shutdown Ray gracefully:"
]
},
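The shutdown cell itself falls outside this hunk; the standard call is simply:

```python
import ray

ray.shutdown()  # release the workers and resources started by ray.init()
```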
@@ -1032,7 +1091,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.7"
"version": "3.7.4"
}
},
"nbformat": 4,
