Implemented Recursive Criticism and Iteration in Task Creation to Verify the output of the agents #785

Open
wants to merge 11 commits into
base: main

chandrakanth137

RCI Documentation

Documentation for Recursive Criticism and Iteration (RCI) Methods

Overview

Recursive Criticism and Iteration (RCI) is a systematic process for iteratively improving the output of a large language model (LLM). It involves three steps: Critique, Validate, and Improve. Together, these steps are designed to make the final output accurate, logically sound, and free of factual errors.

This documentation explains the implementation of RCI in the CrewAI library using LangChain and an Ollama LLM. The code defines three methods: critique, validate, and improve. Each method uses a prompt template to query the LLM for its stage of the process.

Methods

1. critique

This method generates a critique of the given output based on the task description. The critique focuses solely on logical or factual inaccuracies, avoiding grammatical rephrasing or paraphrasing.

Parameters:

  • agent: The agent performing the task, which includes the agent's backstory.
  • task: The task description provided to the model.
  • output: The output generated by the LLM for the given task.
  • llm: The language model used to process the prompt and generate the critique.
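
As an illustration of the critique step, here is a minimal sketch. It is not the code from this PR: the prompt wording, the SketchAgent stand-in, and the choice to model the LLM as a plain callable that maps a prompt string to a reply string are all assumptions made so that the example runs without LangChain or Ollama.

```python
from dataclasses import dataclass
from typing import Callable


# Hypothetical stand-in for a CrewAI agent; only the backstory is used here.
@dataclass
class SketchAgent:
    backstory: str


# Assumed prompt wording, not the template used in the PR.
CRITIQUE_TEMPLATE = (
    "{backstory}\n"
    "Task: {task}\n"
    "Output: {output}\n"
    "List only logical or factual inaccuracies in the output. "
    "Do not suggest grammatical rephrasing or paraphrasing."
)


def critique(agent: SketchAgent, task: str, output: str,
             llm: Callable[[str], str]) -> str:
    """Build the critique prompt and return the model's critique text."""
    prompt = CRITIQUE_TEMPLATE.format(
        backstory=agent.backstory, task=task, output=output
    )
    return llm(prompt).strip()
```

With a real model, llm could be any wrapper that takes a prompt string and returns the reply text.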

2. validate

This method determines if the critique suggests significant changes to the output. It analyzes the critique to see if it indicates that substantial revisions are necessary.

Parameters:

  • task: The task description provided to the model.
  • critique: The critique generated by the critique method.
  • output: The original output generated by the LLM.
  • llm: The language model used to process the prompt and validate the critique.

Returns:

  • validate_response: A single word response, either "True" or "False", indicating whether significant changes are required.
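
Because validate is expected to return a single word, the reply from the model usually needs to be normalized. The sketch below shows one way to do that. The prompt wording, the normalize_verdict helper, and the decision to treat any unclear reply as "False" are assumptions for illustration, not the behavior of the code in this PR.

```python
from typing import Callable

# Assumed prompt wording, not the template used in the PR.
VALIDATE_TEMPLATE = (
    "Task: {task}\n"
    "Output: {output}\n"
    "Critique: {critique}\n"
    "Does the critique call for significant changes to the output? "
    "Answer with a single word: True or False."
)


def normalize_verdict(raw: str) -> str:
    """Map a free-form reply onto "True" or "False".

    Assumption: anything that does not start with "true" counts as "False".
    """
    tokens = raw.strip().lower().split()
    first = tokens[0].strip(".,!:\"'") if tokens else ""
    return "True" if first == "true" else "False"


def validate(task: str, critique: str, output: str,
             llm: Callable[[str], str]) -> str:
    """Ask the model whether the critique demands significant changes."""
    prompt = VALIDATE_TEMPLATE.format(
        task=task, output=output, critique=critique
    )
    return normalize_verdict(llm(prompt))
```

Defaulting to "False" means an ambiguous reply leaves the output unchanged; a caller that prefers to err on the side of revising could invert that default.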

3. improve

This method refines the original output based on the critique. It rewrites the output to address the errors identified in the critique, ensuring the format specified in the task description is maintained.

Parameters:

  • task: The task description provided to the model.
  • output: The original output generated by the LLM.
  • critique: The critique generated by the critique method.
  • llm: The language model used to process the prompt and improve the output.

Returns:

  • improve_response: The improved output generated by the LLM.
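
The improve step can be sketched in the same style. As before, this is an illustration under assumptions rather than the PR's implementation: the prompt wording is invented, the LLM is modeled as a plain callable, and falling back to the original output when the model returns an empty reply is a design choice made for this example.

```python
from typing import Callable

# Assumed prompt wording, not the template used in the PR.
IMPROVE_TEMPLATE = (
    "Task: {task}\n"
    "Original output: {output}\n"
    "Critique: {critique}\n"
    "Rewrite the output to fix the errors named in the critique. "
    "Keep the format required by the task description."
)


def improve(task: str, output: str, critique: str,
            llm: Callable[[str], str]) -> str:
    """Return a rewritten output that addresses the critique."""
    prompt = IMPROVE_TEMPLATE.format(
        task=task, output=output, critique=critique
    )
    improved = llm(prompt).strip()
    # Assumption: keep the original output if the model returns nothing.
    return improved or output
```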

Summary

The RCI methods (critique, validate, and improve) form a framework for iteratively refining LLM outputs. Each iteration identifies logical or factual errors, checks whether they warrant significant changes, and rewrites the output accordingly, with the aim of producing accurate outputs that meet the task requirements.
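
The way the three methods could fit together is sketched below. The run_rci name and its signature are hypothetical, and the three steps are passed in as callables so the example stays self-contained. Stopping early when validate returns anything other than "True" is an assumption; the PR may handle that case differently.

```python
from typing import Callable


def run_rci(
    task: str,
    output: str,
    rci_depth: int,
    critique_fn: Callable[[str, str], str],
    validate_fn: Callable[[str, str, str], str],
    improve_fn: Callable[[str, str, str], str],
) -> str:
    """Run up to rci_depth rounds of critique, validate, and improve."""
    for _ in range(rci_depth):
        # Critique the current output for the given task.
        crit = critique_fn(task, output)
        # Assumption: stop as soon as no significant change is required.
        if validate_fn(task, crit, output) != "True":
            break
        # Replace the output with the improved version and iterate again.
        output = improve_fn(task, output, crit)
    return output
```

Here critique_fn, validate_fn, and improve_fn would wrap the methods described above with the agent and LLM already bound, for example via functools.partial.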

Testing RCI in the modified CrewAI

Note: The test code is written for a local development setup. Please adapt it as needed for your own environment.

  1. Move the src directory
  2. Run the crew_ai_base_test.py to check if your initial setup works
  3. Run this Python notebook for advanced test case: YT_Email_Reply_Llama3_CrewAI_+_Groq.ipynb
    • The notebook requires a Groq API key, so make sure you have one before starting.
  4. To try out custom test cases, RCI can be enabled while creating an instance of the Task class
    • Set rci=True to use RCI. The default value is True.
    • The number of iterations can be changed with the rci_depth parameter, which takes an integer value. The default value is 1.

Example:

task = Task(
    description="""Your Task Description""",
    agent=your_agent,
    expected_output="However you wish",
    rci=True,
    rci_depth=3,
)

result = agent.execute_task(
    task=task,
    context=context,
    tools=tools,
)

# To perform RCI if rci is set to True
llm = ChatOllama(model="llama3")


I'm really glad to see this capability coming to CrewAI! This will probably need to be more generic; I doubt a hardcoded dependency on Ollama will get approved since the library doesn't have one anywhere else.

Author


I will try to make it more generic. I only had access to local LLMs, which is how that piece of code ended up there. I will modify it to use the user's preferred LLM. Thanks for the review!

@joaomdmoura
Collaborator

Just got my eyes on this! Very curious about it, little busy today/tomorrow, but bumping this to the top of the list so either myself or someone on the team looks at it!

@chandrakanth137
Author

As @mbarnathan suggested, I have made the code more generic: it now uses the existing LLM provided by the user, which would usually be initialized through crew.kickoff().

As I only have access to local LLMs, it would be great if someone could run the internal tests. I tried running them with local LLMs, but that ultimately came down to modifying the test cases to set them up for local LLMs.

I will probably try to modify the test cases so that they also work with local LLMs.

@theCyberTech theCyberTech added the feature-request New feature or request label Aug 10, 2024

This PR is stale because it has been open for 45 days with no activity.

Labels
feature-request New feature or request no-pr-activity

4 participants