verify openvla in robosuite #646

Open
shouji000 opened this issue Mar 4, 2025 · 0 comments

System Info

robosuite 1.15.1
Ubuntu 22.04
Jetson AGX Orin (64 GB)
conda environment (Python 3.10)

I want to verify OpenVLA in robosuite.

Information

I wrote a simple script:

import robosuite as suite
from robosuite.controllers import load_part_controller_config
from transformers import AutoModelForVision2Seq, AutoProcessor
from PIL import Image
import torch

local_model_path = "/home/yljy/jetson-containers/data/models/huggingface/models--openvla--openvla-7b/snapshots/31f090d05236101ebfc381b61c674dd4746d4ce0"

# Load the OpenVLA-7B processor and model from the local snapshot.
processor = AutoProcessor.from_pretrained(local_model_path, trust_remote_code=True)
vla = AutoModelForVision2Seq.from_pretrained(
    local_model_path,
    attn_implementation="flash_attention_2",
    torch_dtype=torch.bfloat16,
    low_cpu_mem_usage=True,
    trust_remote_code=True,
).to("cuda:0")

controller_config = load_part_controller_config(default_controller="IK_POSE")

robosuite_env = suite.make(
    "Lift",
    robots="Panda",
    controller_configs=controller_config,  # pass the loaded IK config (it was loaded but never used before)
    has_renderer=True,
    has_offscreen_renderer=True,
    use_camera_obs=True,
    camera_names="frontview",
    camera_heights=640,
    camera_widths=480,
)

obs = robosuite_env.reset()

prompt = "In: What action should the robot take to pick up the cube?\nOut:"

while True:
    image = Image.fromarray(obs["frontview_image"])

    inputs = processor(prompt, image).to("cuda:0", dtype=torch.bfloat16)
    # 7-D action: [dx, dy, dz, droll, dpitch, dyaw, gripper]
    action = vla.predict_action(**inputs, unnorm_key="bridge_orig", do_sample=False)
    action[0:3] = action[0:3] * 100  # scale translation, since the controller sensitivity is 0.01
    action[6] = action[6] * 2 - 1    # remap gripper: OpenVLA outputs [0, 1], robosuite expects [-1, 1]

    print(action)

    obs, reward, done, info = robosuite_env.step(action)
    robosuite_env.render()
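
A side note, flagged as an assumption rather than something verified here: robosuite's offscreen camera observations are commonly reported vertically flipped, so the VLA may be seeing an upside-down scene. Dumping one frame to disk makes this easy to check:

# Hypothetical debug snippet: save what the model actually receives.
# If the saved frame is upside down, flip it before calling the processor,
# e.g. Image.fromarray(obs["frontview_image"][::-1]).
Image.fromarray(obs["frontview_image"]).save("debug_frontview.png")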

Reproduction

I run the code so that OpenVLA controls the manipulator, but something goes wrong. By rights, the robotic arm should move down, yet the action[2] that OpenVLA outputs (i.e., the Z-axis offset) is always positive. Is this because the end-effector coordinate frame in robosuite does not match the one OpenVLA expects?
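
One way to separate the two hypotheses (wrong model output vs. a frame mismatch) is to bypass OpenVLA and step a hand-written downward action; if the arm descends, robosuite treats negative Z as down and the problem is on the OpenVLA/unnormalization side. A minimal sketch, assuming the env built above and robosuite's default end-effector observation key robot0_eef_pos:

import numpy as np

obs = robosuite_env.reset()
probe = np.zeros(robosuite_env.action_dim)  # zero action of the right size
probe[2] = -0.5  # constant downward delta in robosuite's world frame
for _ in range(20):
    obs, reward, done, info = robosuite_env.step(probe)
    robosuite_env.render()
    print("eef z:", obs["robot0_eef_pos"][2])  # should decrease each step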


Expected behavior

No response
