Agent will get stuck after a few iterations if used in a loop #582

Open
Bilokin opened this issue Feb 10, 2025 · 5 comments
Labels
bug Something isn't working

Comments

@Bilokin
Contributor

Bilokin commented Feb 10, 2025

Hello,

I am trying to run an agent in a for loop, and after just a few iterations it gets stuck for no reason if I use TransformersModel with CodeAgent running on a GPU.
I observe this behavior on Linux cloud and on my private Windows PC for any version of smolagents.

Minimal code to reproduce the error

from smolagents import TransformersModel, CodeAgent, __version__
print(__version__)
model = TransformersModel("meta-llama/Llama-3.2-1B-Instruct", torch_dtype='auto', device_map='auto', max_new_tokens=32000)
agent = CodeAgent(tools=[], model=model, additional_authorized_imports=['numpy'])
prompt = "What is the value of $({i}+{i})^-{i}$? Write a valid python code block after ```"
for i in range(50, 101):
    agent.run(prompt.format(i=i))

The Llama 1B model fits nicely on my 8GB GPU and there is a lot of free VRAM left. After a few failed steps of some iteration, the agent gets stuck at the response generation stage. On my Windows PC I see that the GPU memory controller load increases dramatically over time while no output is generated, so perhaps this is a VRAM-related issue.

I can run the same LLM without smolagents, using transformers directly in a simple loop, and it successfully finishes all iterations.
I can also run a LiteLLMModel with smolagents, connected to an Ollama server on my PC, and it runs fine.
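
For reference, the direct-transformers check looked roughly like this (a minimal sketch with assumed generation settings, not the exact script):

from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.2-1B-Instruct")
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.2-1B-Instruct", torch_dtype="auto", device_map="auto"
)

for i in range(50, 101):
    messages = [{"role": "user", "content": f"What is the value of ({i}+{i})^-{i}?"}]
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    outputs = model.generate(inputs, max_new_tokens=256)
    # Decode only the newly generated tokens
    print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))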

Could you please take a look at it?

Package versions:
transformers 4.47.0
smolagents 1.8.0 or main

@Bilokin Bilokin added the bug Something isn't working label Feb 10, 2025
@sysradium
Contributor

I have a feeling it is because it does not get the final answer. So the problem is somewhere here:

while final_answer is None and self.step_number <= self.max_steps:

Check the output and see if it returns final_answer(...) as stated in the prompt:

final_answer("YOUR FINAL ANSWER HERE")

The model you have is very small, so it might not be doing that.
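
For reference (purely illustrative, not output from this model), a step that terminates the loop would contain a code block along these lines:

result = (2 + 2) ** -2
final_answer(result)  # calling final_answer is what lets the while loop above exit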

@g-eoj
Contributor

g-eoj commented Feb 10, 2025

Will it use up all tokens eventually and return due to max_new_tokens=32000, or is it a full-on hang? I have encountered a similar issue where the model never outputs the expected stop sequence and generates until it exhausts max_tokens.

A quick way to test this is to reduce max_new_tokens and check whether the stuck step returns (it may still fail, but that is not the point; you want to see if it returns anything at all).
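
For instance, rebuilding the model from the repro above with a much smaller budget (a sketch; 512 is an arbitrary value, just small enough that an exhausted step returns quickly):

model = TransformersModel(
    "meta-llama/Llama-3.2-1B-Instruct",
    torch_dtype="auto",
    device_map="auto",
    max_new_tokens=512,  # small budget: a "stuck" step now runs out of tokens and returns
)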

@Bilokin Bilokin changed the title [BUG] Agent will get stuck after a few iterations if used in a loop Agent will get stuck after a few iterations if used in a loop Feb 10, 2025
@Bilokin
Contributor Author

Bilokin commented Feb 10, 2025

Thanks, now I think the issue is connected to max_new_tokens rather than the loop.
I used this high token limit in that minimal working example because in my full code I try thinking models, like DeepSeek, that burn through the tokens really fast.
Smaller max_new_tokens values prevent the agent from getting stuck on one step because the output is truncated as advertised.

I removed the [BUG] tag, because this is perhaps just my misuse of the agents. I will investigate this further.

@g-eoj
Contributor

g-eoj commented Feb 10, 2025

Since this sounds a lot like what I encountered, here are more details of what happened in my case, which could explain what happened here as well.

  • The model correctly produces python code that matches the expected regex pattern:
    pattern = r"```(?:py|python)?\n(.*?)\n```"
  • After the closing triple backtick of the python code, the model is supposed to output <end_code>, which signals generation to stop. This doesn't happen, and the model continues generating.
  • After the model exhausts all tokens (which can take a while), the regex
    pattern = r"```(?:py|python)?\n(.*?)\n```"
    still matches the final output, so the agent executes the step without error.

In summary, the model generates valid python code but doesn't produce a stop sequence. I don't think this is a bug either, since the model isn't following the instruction it receives to produce a stop sequence, but it is wasteful of compute resources.
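
A small demo of that last point (the output text is invented, and re.DOTALL is assumed so the pattern can span multiple lines):

import re

pattern = r"```(?:py|python)?\n(.*?)\n```"

# Hypothetical model output: a correct code block followed by extra rambling
# because <end_code> was never emitted.
output = (
    "Thought: I will compute the value.\n"
    "```python\n"
    "result = (2 + 2) ** -2\n"
    "final_answer(result)\n"
    "```\n"
    "Now let me also consider ... (generation keeps going until tokens run out)"
)

match = re.search(pattern, output, re.DOTALL)
print(match.group(1))  # the code block is still extracted, so the step runs without error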

The only sure-fire way I've found to prevent the issue is to use a logits processor to force the model to produce structured output that ends with <end_code>.
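
A lighter-weight variant of the same idea (a stopping-criteria check rather than a logits processor; this is only a sketch of the concept, and the class name is made up for illustration):

from transformers import StoppingCriteria, StoppingCriteriaList

class StopOnEndCode(StoppingCriteria):
    """Stop generation once the decoded continuation contains "<end_code>"."""
    def __init__(self, tokenizer, prompt_length, stop_string="<end_code>"):
        self.tokenizer = tokenizer
        self.prompt_length = prompt_length  # number of prompt tokens to skip when decoding
        self.stop_string = stop_string

    def __call__(self, input_ids, scores, **kwargs):
        generated = self.tokenizer.decode(input_ids[0, self.prompt_length:])
        return self.stop_string in generated

# Hypothetical usage with a plain transformers generate() call:
# outputs = model.generate(
#     inputs,
#     max_new_tokens=1024,
#     stopping_criteria=StoppingCriteriaList([StopOnEndCode(tokenizer, inputs.shape[-1])]),
# )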

@sysradium
Contributor

@g-eoj yeah, that's what I have assumed is happening. :/
