Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Changed Directory structure, small fix, prompts #3850

Merged
merged 34 commits into from
Feb 1, 2025

Conversation

joachim-danswer
Copy link
Contributor

Description

Changed Directory structure, small fix, prompts

How Has This Been Tested?

Locally

Backporting (check the box to trigger backport action)

Note: You have to check that the action passes, otherwise resolve the conflicts manually and tag the patches.

  • This PR should be backported (make sure to check that the backport attempt succeeds)
  • [Optional] Override Linear Check

Copy link

vercel bot commented Jan 30, 2025

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
internal-search ✅ Ready (Inspect) Visit Preview 💬 Add feedback Feb 1, 2025 5:27am

Copy link
Contributor

@evan-danswer evan-danswer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Directory structure looks a lot better! Asked ChatGPT about a few I took issue with, but mostly it's time to focus on filename/function/variable renames now. Check out the comments on the main PR #3749 before you start that, since Yuhong had some things to say on that front

@@ -3,10 +3,10 @@

from langgraph.types import Send

from onyx.agents.agent_search.deep_search_a.initial__individual_sub_answer__subgraph.states import (
from onyx.agents.agent_search.deep_search_a.initial.individual_sub_answer_generation.states import (
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LLM recommends individual_sub_answer

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here was the prompt:
For a langgraph graph created with the purpose of answering a subquestion, choose the best folder name or create your own that you believe is better:

  1. individual_sub_answer
  2. individual_sub_answer_generation
  3. answer_subquestion

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

^^ the above was using Cursor with knowledge of our current codebase, plain ChatGPT picked answer_subquestion. Seems like either is fine

@@ -102,7 +102,9 @@ def answer_generation(
)

answer_citation_ids = get_answer_citation_ids(answer_str)
cited_docs = [context_docs[id] for id in answer_citation_ids]
cited_docs = [
context_docs[id] for id in answer_citation_ids if id < len(context_docs)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

out of bounds citation should never happen unless the LLM is really weak, is this a difference in the docs we give the LLM vs the context docs here? Probably we prefer failing loudly

generate_initial_answer,
)
from onyx.agents.agent_search.deep_search_a.initial__retrieval_sub_answers__subgraph.nodes.initial_answer_quality_check import (
from onyx.agents.agent_search.deep_search_a.initial.initial_answer_generation.nodes.initial_answer_quality_check import (
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor thinks we should go with "individual_sub_answer" which I think is pretty weird, ChatGPT likes your choice but also thinks "generate_first_answer" is good. I'm inclined to stick with your choice since it's backed up by ChatGPT and doesn't seem clearly wrong like individual_sub_answer

prompt: For a langgraph graph created with the purpose of creating the first answer to a user question, choose the best folder name or create your own that you believe is better (keep in mind that this will be a subfolder of a new directory "initial" that contains the subgraphs used for gathering information and generating the first answer to the user question):

  1. retrieval_sub_answers
  2. initial_answer_generation
  3. answer_question

AnswerQuestionState,
)
from onyx.agents.agent_search.deep_search_a.refinement__consolidate_sub_answers__subgraph.edges import (
from onyx.agents.agent_search.deep_search_a.refininement.sub_answer_consolidation.edges import (
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

refininement -> refinement


Make sure that you keep all relevant information, specifically as it concerns to the ultimate goal.
(But keep other details as well.)
It is a matter of life and death that you do NOT use your internal knowledge, just the provided`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

provided has a ` at the end

So a good idea would be to \n use the sub-questions to resolve ambiguities and/or to separate the
question for different entities that may be involved in the original question.
For an initial user question, please generate at 5-10 individual sub-questions whose answers would help
\n to answer the initial question. The individual questions should be answerable by a good RAG system.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unnecessary \n

question for different entities that may be involved in the original question.
For an initial user question, please generate at 5-10 individual sub-questions whose answers would help
\n to answer the initial question. The individual questions should be answerable by a good RAG system.
So a good idea would be to \n use the sub-questions to resolve ambiguities and/or to separate the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unnecessary \n

@@ -673,9 +712,9 @@
{history}

Here is the initial question:
-------
\n -------\n
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When you add \n in a triple quoted string, you're adding extra newlines in addition to the ones already present, i.e. it looks like

Here is the initial question:

/-------/

question

/--------/

If that's what you're going for though, that's fine

- new bullet points
- substantially more document citations ([[D1]](), [[D2]](), [[D3]](), etc.)

Put yourself in the shoes of the user and think about whether the refined answer is really substantially
better than the initial answer.
better and delivers really new insights than the initial answer.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

better than and delivers really new insights from the initial answer

You need to summarize the key parts of the history of a conversation between a user and an agent. The
summary has two purposes:
1) providing the suitable context for a new question, and
2) To capture the key information that was discussed and that the user may have a follow-up question about.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For consistency, change 2) to "capturing" or 1) to "To provide"

@joachim-danswer joachim-danswer force-pushed the agent-search-feature-jr6 branch from f87c173 to f90bac5 Compare January 30, 2025 18:33
@evan-danswer evan-danswer merged commit 96aa174 into agent-search-feature Feb 1, 2025
3 of 5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants