Is there some limitation with the dimensions of actions and observations? #27

paapu88 · 2023-12-21T08:16:29Z

Dear Developers,
I'm getting the following error when running the code below:

pearl/neural_networks/common/value_networks.py", line 262, in get_q_values
x = torch.cat([state_batch, action_batch], dim=-1)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: Tensors must have same number of dimensions: got 4 and 2

Am I doing something wrong, or is there a limitation (for instance, must the action and observation spaces have the same number of dimensions)?
Regards, Markus

"""
Copy-pasted from
https://github.com/facebookresearch/Pearl?tab=readme-ov-file#quick-start
with small modifications for training.
"""


from pearl.pearl_agent import PearlAgent
from pearl.action_representation_modules.one_hot_action_representation_module import (
    OneHotActionTensorRepresentationModule,
)
from pearl.policy_learners.sequential_decision_making.deep_q_learning import (
    DeepQLearning,
)
from pearl.replay_buffers.sequential_decision_making.fifo_off_policy_replay_buffer import (
    FIFOOffPolicyReplayBuffer,
)
from pearl.utils.instantiations.environments.gym_environment import GymEnvironment
from pearl.action_representation_modules.identity_action_representation_module import (
    IdentityActionRepresentationModule,
)
from pearl.utils.functional_utils.train_and_eval.online_learning import online_learning

from time import sleep
import gym
from tqdm import tqdm
import torch
import matplotlib.pyplot as plt
import numpy as np

# env = GymEnvironment("highway-v0", render_mode="human")

# env = GymEnvironment("CartPole-v1", render_mode="human")
env = GymEnvironment("CarRacing-v2", render_mode="human", continuous=False)
observation, action_space = env.reset()
print("observation")
print(observation)
print("action_space")
attributes = dir(action_space)
print(attributes)
print(f"action dim: {action_space.action_dim}")
# print(f"actions: {action_space.actions}")

# sys.exit()

agent = PearlAgent(
    policy_learner=DeepQLearning(
        state_dim=9216,
        action_space=action_space,
        hidden_dims=[64, 64],
        training_rounds=20,
        action_representation_module=OneHotActionTensorRepresentationModule(
            max_number_actions=5
        ),
    ),
    replay_buffer=FIFOOffPolicyReplayBuffer(10_000),
)

# experiment code
number_of_steps = 10000
record_period = 1000

info = online_learning(
    agent=agent,
    env=env,
    number_of_steps=number_of_steps,
    print_every_x_steps=1000,
    record_period=record_period,
    learn_after_episode=True,
)
torch.save(info["return"], "CarRacing-DQN-return.pt")
plt.plot(record_period * np.arange(len(info["return"])), info["return"], label="DQN")
plt.legend()
plt.show()

rodrigodesalvobraz · 2023-12-22T01:55:04Z

I'm looking into this and will get back to you.

BillMatrix · 2024-01-10T20:20:06Z

@paapu88 is the observation space of your environment an image or a video?

paapu88 · 2024-01-12T07:47:01Z

@BillMatrix see https://www.gymlibrary.dev/environments/box2d/car_racing/#

jb3618columbia · 2024-01-12T18:14:03Z

I think the error occurs because you are using a VanillaQValueNetwork, which concatenates the state and action batches and therefore requires them to have the same number of tensor dimensions. For image inputs, you want to use CNNQValueNetwork as the network type (we still need to enable that for deep Q-learning).
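
The mismatch can be reproduced outside Pearl with plain PyTorch. The sketch below is only an illustration, not Pearl's implementation: it assumes a batch of 96x96 RGB frames (the CarRacing observation shape) and one-hot actions over 5 choices, and the variable names are made up for the example. It shows that `torch.cat` rejects a 4-D state batch next to a 2-D action batch, and that flattening each frame into a vector makes the concatenation possible.

```python
import torch

batch_size = 8
# Assumed shapes for illustration: CarRacing frames are 96x96 RGB images,
# and the discrete action is one-hot encoded over 5 choices.
state_batch = torch.zeros(batch_size, 96, 96, 3)  # 4-D tensor
action_batch = torch.zeros(batch_size, 5)         # 2-D tensor

# Concatenating tensors with different numbers of dimensions raises RuntimeError.
try:
    torch.cat([state_batch, action_batch], dim=-1)
    cat_error = None
except RuntimeError as err:
    cat_error = str(err)
print(cat_error)

# Flattening each frame into a vector gives a 2-D state batch,
# so the two tensors can be joined along the last axis.
flat_state = state_batch.flatten(start_dim=1)  # one row of 96 * 96 * 3 values per sample
x = torch.cat([flat_state, action_batch], dim=-1)
print(x.shape)
```

Note that a flattened frame has 96 * 96 * 3 = 27648 entries, not 9216, so a fully connected network fed this way would need a matching input size. Flattening also discards spatial structure, which is why a convolutional Q-value network is the more natural choice for image observations.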

rodrigodesalvobraz · 2024-01-19T18:43:38Z

We are going to implement a fix.

rodrigodesalvobraz self-assigned this Jan 12, 2024

rodrigodesalvobraz assigned jb3618columbia Jan 19, 2024

rodrigodesalvobraz removed their assignment Jan 29, 2024

rodrigodesalvobraz added the enhancement label Mar 8, 2024
