-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feature(LLMJudge): adding set of integration test where OpanAI is used as LLM Judge #1546
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
…into release/v3 * 'release/v3' of https://github.com/Sinaptik-AI/pandas-ai: fix(VirtualDataframe): fixing virtual dataframe name conflict (Sinaptik-AI#1531)
Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>
* commit '4db35d2b278559eef8b6d5ad15bdc1d8a3936855': cd: fix poetry version cd: use python 3.10 Release v3.0.0-beta.0 docs: udpated gif in readme 3.0 (Sinaptik-AI#1536)
* commit '7967d07f90866edc71f12fdac61761d0946712f8': fix: dependency issue and updated lock files (Sinaptik-AI#1539)
Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>
… into fix/SIN-310 * 'fix/SIN-310' of https://github.com/scaliseraoul/pandas-ai: Update pandasai/data_loader/loader.py
dosubot
bot
added
the
size:L
This PR changes 100-499 lines, ignoring generated files.
label
Jan 23, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 Looks good to me! Reviewed everything up to 55bcfc3 in 1 minute and 59 seconds
More details
- Looked at
473
lines of code in4
files - Skipped
1
files when reviewing. - Skipped posting
5
drafted comments based on config settings.
1. tests/unit_tests/agent/test_agent_chat.py:4
- Draft comment:
from typing import List, Tuple, Union
- Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable:
The code is using Python 3.10+ type union syntax with the | operator and imports UnionType from types module. There's no usage of typing.Union in the code. The current imports are correct for the code's needs. Adding Union would be unnecessary and potentially confusing since it's not being used.
Could there be a reason to prefer typing.Union over the | operator for compatibility or readability reasons?
No - the | operator is the modern, preferred syntax in Python 3.10+ and is more readable. The code is clearly using this modern syntax consistently.
The comment should be deleted as the current imports are correct for how unions are being used in the code.
2. tests/unit_tests/agent/test_agent_llm_judge.py:147
- Draft comment:
"""Test multiple heart stroke-related questions."""
- Reason this comment was not posted:
Confidence changes required:50%
The test methodtest_heart_strokes_questions
has a docstring that incorrectly mentions 'loan-related questions' instead of 'heart stroke-related questions'. This is a copy-paste error and should be corrected for clarity.
3. pandasai/data_loader/loader.py:173
- Draft comment:
f"Failed to execute query for source type '{source_type}' with query: {formatted_query}"
- Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable:
The original error message is slightly more verbose but both versions communicate the same information effectively. The new version is more concise without losing clarity. The change doesn't impact functionality or comprehension. This seems like a purely stylistic preference.
The original message might be marginally more explicit by including the words "source type" and "query". Could this extra clarity be important for debugging?
The new message still clearly indicates what failed (query execution), what the source type was, and includes the full query. The removed words don't add meaningful clarity.
This comment should be deleted as it suggests reverting a minor stylistic change that doesn't impact functionality or clarity of the error message.
4. tests/unit_tests/agent/test_agent_chat.py:233
- Draft comment:
Test heart stroke related questions to ensure the response types match the expected ones.
- Reason this comment was not posted:
Confidence changes required:50%
The test filetest_agent_chat.py
has a typo in the docstring fortest_heart_strokes_questions_type
. It should be corrected for clarity.
5. tests/unit_tests/agent/test_agent_llm_judge.py:147
- Draft comment:
Test multiple heart stroke-related questions.
- Reason this comment was not posted:
Confidence changes required:50%
The test filetest_agent_llm_judge.py
has a typo in the docstring fortest_heart_strokes_questions
. It should be corrected for clarity.
Workflow ID: wflow_7fPe5Kw1Ui01BqyD
You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet
mode, and more.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Important
Add integration tests using OpenAI as LLM Judge and update dependencies.
test_agent_chat.py
for integration tests using OpenAI as LLM Judge.test_agent_llm_judge.py
for evaluating code snippets with OpenAI.openai
topyproject.toml
dependencies.execute_query()
inloader.py
.This description was created by for 55bcfc3. It will automatically update as commits are pushed.