[Proposal] Environment should terminate when adroit hand pen drops the pen #112

Open
jjshoots opened this issue Feb 16, 2023 · 8 comments
Labels
enhancement New feature or request good first issue Good for newcomers help wanted Extra attention is needed

Comments

@jjshoots
Member

Proposal

In AdroitHandPen, when the agent drops the pen, there is no way to recover, but the environment still does not terminate. The proposal, as in #111, is to enable environment termination on pen drop.

@Kallinteris-Andreas Kallinteris-Andreas added enhancement New feature or request good first issue Good for newcomers help wanted Extra attention is needed labels Dec 30, 2023
@leonasting
Contributor

Hi, I was just following the thread and wanted to check whether the pen-drop termination condition, which was removed in #111, needs to be restored?

            # penalty for dropping the pen
            if obj_pos[2] < 0.075:
                reward -= 5
                # removed code
                terminated = True
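
The fragment above depends on variables from the environment's step function, so it cannot be run on its own. A minimal standalone sketch of the same check follows, assuming the pen height is passed in directly. The function name `apply_drop_rule` and the `terminate_on_drop` switch are hypothetical and are not part of the environment's code; the threshold and penalty values are copied from the snippet.

```python
def apply_drop_rule(obj_z, reward, terminated=False,
                    drop_height=0.075, drop_penalty=5.0,
                    terminate_on_drop=True):
    """Return (reward, terminated) after applying the pen-drop check.

    obj_z is the pen's z-coordinate (obj_pos[2] in the snippet).
    terminate_on_drop is a hypothetical switch, added here only so both
    behaviours (with and without termination) can be compared.
    """
    dropped = obj_z < drop_height
    if dropped:
        reward -= drop_penalty      # penalty for dropping the pen
        if terminate_on_drop:
            terminated = True       # the line marked as removed code
    return reward, terminated
```

With `terminate_on_drop=False` the penalty is still applied on every step below the threshold, which matches the behaviour described after the removal.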
 

@Kallinteris-Andreas
Collaborator

Hey @leonasting,
basically yes, but it also needs to be tested.
Would you be interested in testing it and writing a PR, with a short report that shows the terminal frames?

@jjshoots what testing had you done?

Thanks!

@jjshoots
Member Author

jjshoots commented Jan 8, 2024

This was a while ago and I don't quite remember, but if I recall correctly, the AdroitHand environments are no-termination environments: negative rewards are incurred in perpetuity (or until truncation). So adding a termination signal to HandPen specifically doesn't make sense. At least, that's as much of the discussion as I can remember.
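
To make the distinction concrete, here is a rough sketch of how a value-based learner typically treats the two signals. It assumes a standard one-step bootstrapped target; the function names and the discount `gamma` are illustrative and do not come from the environment's code.

```python
def td_target(reward, next_value, terminated, gamma=0.99):
    """One-step bootstrapped target for a value estimate.

    A terminated transition has no future, so only the reward counts.
    A truncated (time-limit) transition is not terminal, so the learner
    still bootstraps from next_value.
    """
    if terminated:
        return reward
    return reward + gamma * next_value


def discounted_penalty_sum(penalty_per_step, steps_remaining, gamma=0.99):
    """Discounted sum of a per-step penalty paid until truncation."""
    return sum(-penalty_per_step * gamma**k for k in range(steps_remaining))
```

Under no-termination, a dropped pen keeps paying the penalty for every remaining step, which `discounted_penalty_sum` adds up. With termination, the episode ends at the drop and only a single penalty is received.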

@leonasting
Contributor

I'm interested in testing. Let me know what tests you want me to perform. Based on the code and environment, I infer that any agent action after the pen leaves the hand is redundant. In the meantime, I will capture a few screenshots of terminal frames with the earlier code.

@leonasting
Contributor

Initially, the pen has a z-coordinate of 0.25 on the hand and the forehand has a value of 0.2. During experiments involving random movements, the z-coordinate of the pen stays between 0.2 and 0.25 while grasped. If dropped, it falls below 0.2 until hitting the table at around 0.8.
[Screenshot: adroit_pen]
[Screenshot: adroit_pen_2]
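
One way to turn these observations into a check is to scan a recorded trace of pen heights for the first step that falls below the grasped range. This is only a sketch: `first_drop_step` and `grasp_floor` are hypothetical names, and the default of 0.2 is the lower bound reported above, not the 0.075 threshold from the earlier code.

```python
def first_drop_step(z_trace, grasp_floor=0.2):
    """Return the index of the first step whose pen height is below
    grasp_floor, or None if the pen stays in the grasped range."""
    for step, z in enumerate(z_trace):
        if z < grasp_floor:
            return step
    return None
```

Logging this index per episode would show how many steps are spent after the pen has left the hand.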

@Kallinteris-Andreas
Collaborator

Kallinteris-Andreas commented Jan 12, 2024

@leonasting
it is not very clear with this camera angle; try camera_id=3 (an argument to the make constructor)

@leonasting
Contributor

I have attached a screenshot of the terminal state.
[Screenshot: adroit_pen_3]
Another screenshot, showing the pen out of the hand.
[Screenshot: adroit_pen_4]

@Kallinteris-Andreas
Collaborator

@leonasting excellent, I think it is clear that:

  1. below 0.8 the pen has fallen, and the hand cannot interact with it in any way

Now can you show that:

  2. there is no benefit to continuing training after the pen has fallen

A simple ablation study should do it.
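
As a rough sketch of how such an ablation could be summarized, assuming the same agent is trained over several seeds with and without termination on pen drop and the final evaluation returns are collected per seed. The helper names are hypothetical, and no results are implied here.

```python
from statistics import mean, stdev


def summarize(returns):
    """Mean and sample standard deviation of per-seed returns."""
    spread = stdev(returns) if len(returns) > 1 else 0.0
    return {"mean": mean(returns), "std": spread, "n": len(returns)}


def compare_conditions(with_termination, without_termination):
    """Summarize both ablation arms and report the difference in means."""
    a = summarize(with_termination)
    b = summarize(without_termination)
    return {
        "with_termination": a,
        "without_termination": b,
        "mean_difference": a["mean"] - b["mean"],
    }
```

A mean difference that is small relative to the per-arm spread would support point 2. Because the two arms receive different total penalties per episode, evaluation should use a metric that does not depend on the termination setting, such as task success.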


3 participants