Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug]: Agent not recognizing working directory #6612

Open
1 task done
BradKML opened this issue Feb 5, 2025 · 7 comments
Open
1 task done

[Bug]: Agent not recognizing working directory #6612

BradKML opened this issue Feb 5, 2025 · 7 comments
Labels
troubleshooting/help User requires help

Comments

@BradKML
Copy link

BradKML commented Feb 5, 2025

Is there an existing issue for the same bug?

  • I have checked the existing issues.

Describe the bug and reproduction steps

Logs here (DeepSeek v3 which should be very capable in code/reasoning) https://pastebin.com/9XThHQHh

Conversation led to possible need for repomap #2185 (comment)

OpenHands Installation

Docker command in README

OpenHands Version

0.23

Operating System

WSL on Windows

Logs, Errors, Screenshots, and Additional Context

See above for the full log

@BradKML BradKML added the bug Something isn't working label Feb 5, 2025
@enyst
Copy link
Collaborator

enyst commented Feb 5, 2025

Brad, I read the entire log, and I don't think this is a bug. It's doing this: the LLM gets the working directory automatically, and corrects itself. That's what an agent should do, in the measure of possible, because an LLM will always make mistakes.

Please look at this:

  • the environment told the LLM the working directory MANY TIMES
  • the LLM still made a couple of mistakes
  • the environment still told the LLM the working directory
  • the LLM got it right.

That's not a bug, you know how they say, it's a feature. 😅

Note: the OBSERVATION is part of the context, it's sent to the LLM.

(slightly simplified below to make it more easily readable)

04:35:14 - ACTION
[Agent Controller]
FileReadAction(path='launch_scientist.py', start=0, end=-1, thought='', action='read')

04:35:14 - OBSERVATION
[Agent Controller] ErrorObservation
File not found: /workspace/launch_scientist.py. Your current working directory is /workspace.

==============
[Agent Controller] LEVEL 1 LOCAL STEP 3 GLOBAL STEP 26

04:35:33 - ACTION
[Agent Controller] MessageAction
CONTENT: The file launch_scientist.py was not found in the current working directory. This suggests that the AI-Scientist repo may not have been cloned or set up correctly. I need to verify if the repository has been cloned and if the necessary files are present. I will check the directory structure and look for the repository.

==============
[Agent Controller] LEVEL 1 LOCAL STEP 4 GLOBAL STEP 27

04:35:38 - ACTION
[Agent Controller] CmdRunAction
COMMAND:
ls

04:35:39 - OBSERVATION
[Agent Controller] CmdOutputObservation
metadata={
"exit_code": 0,
"pid": -1,
"username": "root",
"hostname": "5cbac231740b",
"working_dir": "/workspace",
"py_interpreter_path": "/openhands/poetry/openhands-ai-5O4_aCHf-py3.12/bin/python",
"prefix": "",
"suffix": "\n[The command completed with exit code 0.]"
})**

--BEGIN AGENT OBSERVATION--
AI-Scientist
[The command completed with exit code 0.]
[Current working directory: /workspace]
[Python interpreter: /openhands/poetry/openhands-ai-5O4_aCHf-py3.12/bin/python]
[Command finished with exit code 0]
--END AGENT OBSERVATION--

==============
[Agent Controller] LEVEL 1 LOCAL STEP 5 GLOBAL STEP 28

04:35:58 - ACTION
[Agent Controller 3919e77b7fe1460cbe85fc6130e79a73-delegate] MessageAction
CONTENT: The AI-Scientist repository is present in the directory, but the launch_scientist.py file is missing. This indicates that the repository may not have been set up correctly or the file has been moved. I need to verify if the repository has been cloned properly and if the necessary files are present. I will check the contents of the AI-Scientist directory.

==============
[Agent Controller] LEVEL 1 LOCAL STEP 6 GLOBAL STEP 29

04:36:06 - ACTION
[Agent Controller] CmdRunAction
COMMAND:
ls AI-Scientist

04:36:07 - OBSERVATION
[Agent Controller] CmdOutputObservation
metadata={
"exit_code": 0,
"pid": -1,
"username": "root",
"hostname": "5cbac231740b",
"working_dir": "/workspace",
"py_interpreter_path": "/openhands/poetry/openhands-ai-5O4_aCHf-py3.12/bin/python",
"prefix": "",
"suffix": "\n[The command completed with exit code 0.]"
})**

--BEGIN AGENT OBSERVATION--
LICENSE README.md ai-scientist-env ai_scientist data docs example_papers experimental launch_scientist.py requirements.txt review_ai_scientist review_iclr_bench templates
[The command completed with exit code 0.]

[Current working directory: /workspace]
[Python interpreter: /openhands/poetry/openhands-ai-5O4_aCHf-py3.12/bin/python]
[Command finished with exit code 0]
--END AGENT OBSERVATION--

==============
[Agent Controller] LEVEL 1 LOCAL STEP 7 GLOBAL STEP 30

04:36:12 - ACTION
[Agent Controller] FileReadAction(path='AI-Scientist/launch_scientist.py')
04:36:12 - OBSERVATION
[Agent Controller] [Read from /workspace/AI-Scientist/launch_scientist.py is successful.]
import argparse
import json......

@enyst
Copy link
Collaborator

enyst commented Feb 5, 2025

Please note that your logs also include logs from kevin fork. Sorry, we can't help with it, and there are many differences.

In general, it would be helpful to isolate the logs you find problematic, from openhands, it would speed up our responses, and help our ability to look into the issue and see what's wrong.

In this case, I don't think it's a bug, and it's unclear to me what differences in the prompt could help. Give it all the time all contents of the directory, not only the working directory? Maybe, but that's not obviously better, in general. In the general case, it wouldn't be a file inside the first subdirectory, it could be anywhere, at any depth. On large repos a ls -R is more than the context window. We rely on the LLM to find its way to the file, and I would say that with weird bumps, it did succeed in this case. 🤔

@enyst
Copy link
Collaborator

enyst commented Feb 5, 2025

What happens after is a bit strange, on how it gets stuck: the LLM sends a MessageAction, over and over again. That's weird because it's unwarranted. I need to look into it a bit.

A thought: you were using DelegatorAgent. That should work in simple tasks, but really the setup with micro-agents is experimental, it doesn't perform well, and I can only really recommend the same task with CodeAct. Or use the existing Delegator as example and develop some smarter agents. ;)

@BradKML
Copy link
Author

BradKML commented Feb 5, 2025

@enyst I think I started OpenHands after Kevin broke, just to see how well it can handle the stuff in the new version, some of the problem is my fault ngl

Brad, I read the entire log, and I don't think this is a bug. It's doing this: the LLM gets the working directory automatically, and corrects itself. That's what an agent should do, in the measure of possible, because an LLM will always make mistakes... That's not a bug, you know how they say, it's a feature. 😅

This kinda suffers the "Romba hits a wall" type problem that I can't quite put a finger on, where it keeps bumping on the same issue without "turning left", or in this case kept ls and then proceed to think maybe it should download file, and then loop.

Give it all the time all contents of the directory, not only the working directory? Maybe, but that's not obviously better, in general. In the general case, it wouldn't be a file inside the first subdirectory, it could be anywhere, at any depth. On large repos a ls -R is more than the context window. We rely on the LLM to find its way to the file, and I would say that with weird bumps, it did succeed in this case. 🤔

The thing is that it kept sucking into this over and over and can't really break out of it in this case, would love to replicate again just to send the screenshots cus it is the same head-scratching behavior.

A thought: you were using DelegatorAgent. That should work in simple tasks, but really the setup with micro-agents is experimental, it doesn't perform well, and I can only really recommend the same task with CodeAct. Or use the existing Delegator as example and develop some smarter agents. ;)

Yeah that is the problem there, I would need some agent that can handle large steps, and Delegator is generally recommended, and I think @SmartManoj found a way to hack CodeAct to think in stepwise multi-task juggling, but that might as well be magic. Definitely would try with CodeAct and pass the results

I think at the very least we need pwd && ls -la && ls-la .. to be invoked more often, to handle things like multi-folder, and possibly the issue of "already cd into a sub-folder and need to climb out for context". No need to do recursive file listing. Or pray for eza adoption

@enyst
Copy link
Collaborator

enyst commented Feb 5, 2025

Just to clarify, Delegator is not recommended. It's an experimental agent, pretty simple, which hasn't even been evaluated on benchmarks because we are sure it won't make it. (so it doesn't make much sense at this stage)

@mamoodi mamoodi added troubleshooting/help User requires help and removed bug Something isn't working labels Feb 5, 2025
@mamoodi
Copy link
Collaborator

mamoodi commented Feb 5, 2025

I modified the label from a bug based on the comments. Please close the issue if the issue is resolved.

@BradKML
Copy link
Author

BradKML commented Feb 6, 2025

@mamoodi I might need to defer to @enyst observing this specific quirk in behavior (not sure if it counts as a bug)

What happens after is a bit strange, on how it gets stuck: the LLM sends a MessageAction, over and over again. That's weird because it's unwarranted. I need to look into it a bit.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
troubleshooting/help User requires help
Projects
None yet
Development

No branches or pull requests

3 participants