Codebase: Basic implementation of RL algorithms from the Medipixel repo, for the MineRL competition. Uses W&B for logging network parameters (instructions here)
Framework, language: PyTorch 1.3.1, Python 3.7
General Idea: pick {env, algo} pair -> train to solve MineRL competition envs with RL or demonstration-based (IL, fD) algorithms
Standard installation (non-LeNS lab systems):
# create virtual environment (optional)
conda create -n myenv python==3.7
conda activate myenv # Windows
source activate myenv # Linux
git clone https://github.com/prabhasak/MineRL-NeurIPS-2020.git
cd MineRL-NeurIPS-2020
# install required libraries and modules (recommended)
make dep --ignore-errors # ignore certifi error
pip install minerl
run.py is the entrypoint script for Round 1 (WIP):
python run.py --algo DQfD --cfg-path ./configs/MineRLTreechopVectorObf_v0/dqfd.py --demo-path "./data/minerltreechopvectorobf_disc_64_flat_20.pkl" --seed 42 -conv
Step 1: Enable X11 forwarding
- Requirements for local machine. Verify with the xeyes command on the remote machine
- Follow Sapana's lab usage doc and ssh into your lab system
Step 2: Enable GPU usage
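- Check nvidia-smi and nvcc -V (same as nvcc --version)
- Check that PyTorch can see CUDA 10.0 (python -> import torch -> torch.cuda.is_available(); torch.version.cuda shows the CUDA version PyTorch was built with)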
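The interactive check in Step 2 can also be scripted. The following is a minimal sketch, assuming PyTorch may or may not be installed in the active environment; the helper name cuda_report is hypothetical and not part of the repo.

```python
from typing import Any, Dict


def cuda_report() -> Dict[str, Any]:
    """Collect basic CUDA visibility info; degrade gracefully without PyTorch."""
    report: Dict[str, Any] = {"torch_installed": False, "cuda_available": False}
    try:
        import torch  # imported lazily so the script still runs without PyTorch
    except ImportError:
        return report
    report["torch_installed"] = True
    report["torch_version"] = torch.__version__
    # CUDA version PyTorch was built against (None for CPU-only builds)
    report["built_with_cuda"] = torch.version.cuda
    report["cuda_available"] = bool(torch.cuda.is_available())
    if report["cuda_available"]:
        report["device_name"] = torch.cuda.get_device_name(0)
    return report


if __name__ == "__main__":
    for key, value in cuda_report().items():
        print(f"{key}: {value}")
```

If cuda_available is False while nvidia-smi lists a GPU, a likely cause is a CPU-only PyTorch build or a CUDA version mismatch.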
Step 3: (Optional) Enable rendering on headless display
- Install xvfbwrapper with conda install -c conda-forge xvfbwrapper
- Get the display variable with echo $DISPLAY. Add this and the XAUTHORITY path to .bashrc (instructions here and here)
- Run your .py with the xvfb-run prefix (note: xvfb emulates a display using virtual memory)
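As a sketch of the headless workflow above, the helper below prepends xvfb-run to a command only when no display is available. The name headless_command is hypothetical; the -a flag asks xvfb-run to pick a free server number automatically.

```python
import os
from typing import List, Mapping, Optional


def headless_command(cmd: List[str], env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return cmd unchanged if DISPLAY is set, otherwise prefix it with xvfb-run."""
    env = os.environ if env is None else env
    if env.get("DISPLAY"):
        return list(cmd)
    # No display found: run under a virtual framebuffer instead
    return ["xvfb-run", "-a"] + list(cmd)


# Example: show what would be executed on a machine without a display
print(" ".join(headless_command(["python", "run.py", "--seed", "42"], env={})))
```

The resulting list can be passed to subprocess.run(cmd, check=True) once xvfb is installed on the system.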
Step 4: Set up local repo for running experiments
# create virtual environment (optional)
conda create -n lensminerl --clone root
source activate lensminerl
pip install --upgrade pip
git clone https://github.com/prabhasak/MineRL-NeurIPS-2020.git
cd MineRL-NeurIPS-2020
# install medipixel and mineRL dependencies
make dep --ignore-errors # ignore certifi error
conda install -c anaconda openjdk
pip install tensorboard matplotlib==3.0.3 cloudpickle==1.3.0 tabulate scikit-learn plotly common
pip install --upgrade minerl
# update jupyter notebook dependencies
pip install --upgrade --force jupyter-console --ignore-installed ipython-genutils
pip install wandb tornado==4.5.3 # without this, messes up a lot of things
pip install --upgrade nbconvert
Step 5: (Optional) Follow steps here to create a W&B account for logging network parameters. Remember to run wandb login and wandb on to turn on syncing
Step 6: Download the MineRL Dataset. Add MINERL_DATA_ROOT to your environment variables (Windows) or your .bashrc file (Linux). You can also set the variable before running the code as follows:
Windows: SET MINERL_DATA_ROOT=your/local/path (example: C:\MineRL\medipixel\data)
Linux: export MINERL_DATA_ROOT="your/local/path"
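Before launching a long run, it can help to confirm that MINERL_DATA_ROOT is set and points to an existing directory. This is a minimal sketch using only the standard library; resolve_data_root is a hypothetical helper, not a function from the repo or from minerl.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_data_root(env: Optional[Mapping[str, str]] = None) -> Path:
    """Return MINERL_DATA_ROOT as a Path, or raise with a hint if unset or missing."""
    env = os.environ if env is None else env
    raw = env.get("MINERL_DATA_ROOT")
    if not raw:
        raise RuntimeError("MINERL_DATA_ROOT is not set; export it before running")
    root = Path(raw).expanduser()
    if not root.is_dir():
        raise RuntimeError(f"MINERL_DATA_ROOT points to a missing directory: {root}")
    return root
```

Calling resolve_data_root() at the top of a training script turns a misconfigured path into an immediate, readable error.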
Convert to discrete action space: python run_k_means.py --env 'MineRLTreechopVectorObf-v0' --num-actions 64
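To illustrate what discretizing the action space means, here is a generic k-means (Lloyd's algorithm) sketch in pure Python. It is not the code in run_k_means.py; the function names are hypothetical, and a real script would more likely use a vectorized library. Each continuous action vector is mapped to the index of its nearest centroid, and that index serves as the discrete action.

```python
import random
from typing import List, Sequence

Vector = Sequence[float]


def sq_dist(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two vectors of equal length."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(v: Vector, centroids: Sequence[Vector]) -> int:
    """Index of the centroid closest to v (the discrete action id)."""
    return min(range(len(centroids)), key=lambda i: sq_dist(v, centroids[i]))


def kmeans(actions: List[Vector], k: int, iters: int = 20, seed: int = 42) -> List[List[float]]:
    """Cluster action vectors into k centroids with plain Lloyd iterations."""
    rng = random.Random(seed)
    centroids = [list(a) for a in rng.sample(actions, k)]
    for _ in range(iters):
        buckets: List[List[Vector]] = [[] for _ in range(k)]
        for a in actions:
            buckets[nearest(a, centroids)].append(a)
        for i, bucket in enumerate(buckets):
            if bucket:  # keep the old centroid if a cluster ends up empty
                centroids[i] = [sum(col) / len(bucket) for col in zip(*bucket)]
    return centroids
```

At play time the agent picks an index in range(k) and sends the corresponding centroid to the environment as the continuous action.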
Generate expert data: Download the MineRL dataset and change the paths (lines 19-21). Some argument options:
1. conv-vec: Use continuous actions and only the vector component of observations (not recommended)
2. conv-full: Use discrete actions (num-actions) and both the pov and vector components of observations
3a. conv-full and flatten: Flatten pov and append it to vector to make the state space
3b. conv-full and aggregate: Append vector as a fourth channel of pov to make the state space (passed through a CNN)
python run_expert_data_format.py -conv-full -flatten --num-actions 64 --traj-use 10 --seed 42
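The flatten and aggregate options differ only in how pov and vector are combined into one state. The sketch below shows one possible layout for each, using nested lists to stay dependency-free. It is an assumption, not the repo's implementation: in particular, the aggregate variant assumes the vector length equals the image width and repeats the vector on every row.

```python
from typing import List, Sequence

Image = Sequence[Sequence[Sequence[float]]]  # H x W x C


def flatten_state(pov: Image, vector: Sequence[float]) -> List[float]:
    """Flatten pov row by row and append the vector: length H*W*C + len(vector)."""
    flat = [float(c) for row in pov for pixel in row for c in pixel]
    return flat + [float(v) for v in vector]


def aggregate_state(pov: Image, vector: Sequence[float]) -> List[List[List[float]]]:
    """Add one extra channel per pixel holding vector[x], giving H x W x (C+1).

    ASSUMPTION: len(vector) equals the image width, so the same vector
    is repeated on every row of the new channel.
    """
    width = len(pov[0])
    if len(vector) != width:
        raise ValueError("vector length must match image width for this layout")
    return [
        [[float(c) for c in pixel] + [float(vector[x])] for x, pixel in enumerate(row)]
        for row in pov
    ]
```

The flattened state suits a fully connected network, while the aggregated state keeps the image layout so it can be passed through a CNN.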
View expert data:
- view-npz and view-pkl: Available in both .npz and .pkl formats
- Use pdb commands to step through the data: n to view the next step, and q to quit the program
python run_expert_data_format.py --view-full --view-pkl -flatten --num-actions 64 --traj-use 10 --seed 42
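For a quick look at a .pkl file without pdb, a few lines of standard-library Python are enough. This sketch assumes the file holds a list of (state, action, reward, next_state, done) tuples; that layout is an assumption, so check it against your own files. The helper names are hypothetical.

```python
import pickle
from typing import Any, Iterator, List, Tuple

Transition = Tuple[Any, Any, float, Any, bool]


def iter_transitions(path: str) -> Iterator[Tuple[int, Transition]]:
    """Yield (step, transition) pairs from a pickled list of transitions."""
    with open(path, "rb") as f:
        demo = pickle.load(f)  # only unpickle files you trust
    for step, transition in enumerate(demo):
        yield step, transition


def summarize(path: str, limit: int = 3) -> List[str]:
    """Return one summary line for each of the first `limit` transitions."""
    lines = []
    for step, (state, action, reward, next_state, done) in iter_transitions(path):
        if step >= limit:
            break
        lines.append(f"step={step} action={action} reward={reward} done={done}")
    return lines
```

Printing the lines returned by summarize gives a fast sanity check on action indices and rewards before training.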
Competition envs:
- Prefix xvfb-run if running on a headless system
- To sync wandb logging, remember to run wandb login and wandb on. Local logging is enabled by default with --log
- Choose whether a CNN is used with -conv (WIP). Discrete actions are enabled by default with --is-discrete
RL: python run_MineRL_TreechopVectorObf_v0.py --env MineRLTreechopVectorObf-v0 --algo Rainbow-DQN --cfg-path ./configs/MineRL_TreechopVectorObf_v0/dqn.py --num-actions 64 --seed 42
fD: python run_MineRL_TreechopVectorObf_v0.py --env MineRLTreechopVectorObf-v0 --algo DQfD --cfg-path ./configs/MineRL_TreechopVectorObf_v0/dqfd.py --demo-path "./data/minerltreechopvectorobf_disc_32_flat_20.pkl" --seed 42
- Basic envs: WIP
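To run the same {env, algo} pair over several seeds, the competition-env commands above can be assembled programmatically. This is a sketch: build_command is a hypothetical helper, and it uses only the flags that appear in the commands above.

```python
from typing import List, Optional


def build_command(
    env: str,
    algo: str,
    cfg_path: str,
    seed: int = 42,
    num_actions: Optional[int] = None,
    demo_path: Optional[str] = None,
    script: str = "run_MineRL_TreechopVectorObf_v0.py",
) -> List[str]:
    """Assemble the argument list for one training run."""
    cmd = ["python", script, "--env", env, "--algo", algo,
           "--cfg-path", cfg_path, "--seed", str(seed)]
    if num_actions is not None:
        cmd += ["--num-actions", str(num_actions)]
    if demo_path is not None:  # only needed for demonstration-based algorithms
        cmd += ["--demo-path", demo_path]
    return cmd


# Example: print the Rainbow-DQN command for three seeds
for seed in (0, 1, 2):
    print(" ".join(build_command(
        "MineRLTreechopVectorObf-v0", "Rainbow-DQN",
        "./configs/MineRL_TreechopVectorObf_v0/dqn.py",
        seed=seed, num_actions=64,
    )))
```

Each list can be handed to subprocess.run(cmd, check=True) to launch the runs one after another.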