feat: workspace env #717

Merged
merged 9 commits into from
Nov 1, 2023

Conversation

ArslanSaleem
Collaborator

@ArslanSaleem ArslanSaleem commented Oct 31, 2023

  • Change root path to store pandasai cache and charts in different directory

Summary by CodeRabbit

New Features:

  • Added the ability to customize the project root location via the "PANDASAI_WORKSPACE" environment variable, providing more flexibility in project setup.
  • Enhanced the Agent class to support chart plotting requests, improving data visualization capabilities.

Refactor:

  • Updated the execute_code function to handle chart saving scenarios more efficiently.

Tests:

  • Updated test cases to validate the correct chart saving path, ensuring the new feature works as expected.

Chores:

  • Removed unused import in df_config.py, improving code cleanliness.

Contributor

coderabbitai bot commented Oct 31, 2023

Walkthrough

The changes introduced in this commit primarily revolve around the customization of the project root location and the saving of charts in the pandasai library. The changes allow users to set a custom workspace directory and ensure that the saved charts are stored in the correct directory path.
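As a minimal sketch of how the workspace variable might be set from Python: the directory name below is hypothetical, and the assumption that the variable should be set before any pandasai object is created follows only from the order of steps described for the example script in this PR.

```python
import os
import tempfile

# Hypothetical workspace location; any existing, writable directory should do.
workspace = os.path.join(tempfile.gettempdir(), "pandasai_workspace")

# Create the directory up front so the path is valid when it is read.
os.makedirs(workspace, exist_ok=True)

# Assumption: set this before constructing any pandasai objects, so that
# path resolution sees it.
os.environ["PANDASAI_WORKSPACE"] = workspace
```

The same effect can be had by exporting the variable in the shell before starting the Python process.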

Changes

File | Summary
examples/using_workspace_env.py | The script now imports necessary modules, defines data, sets a custom workspace directory, initializes an OpenAI instance, creates an Agent instance, and calls the chat method.
pandasai/helpers/code_manager.py, pandasai/helpers/path.py | The code_manager.py file now imports the find_project_root function and modifies the execute_code function. The path.py file checks for the presence of the "PANDASAI_WORKSPACE" environment variable and returns its value as the project root.
pandasai/schemas/df_config.py | The import statement for DEFAULT_CHART_DIRECTORY has been removed.
pandasai/smart_datalake/__init__.py | The save_charts_path attribute of the _config object is now set to the correct directory path.
tests/test_smartdataframe.py | The configuration dictionary now includes a new key-value pair for "save_charts_path". The assertion statement checks if the saved chart path matches the updated path.
pandasai/exceptions.py | A new exception class InvalidWorkspacePathError has been added.

Poem

🐇 "In the heart of autumn's glow, 🍂

Code changes come, steady flow. 💻

Paths are set, charts are saved, 📊

In the workspace that you've paved. 🛠️

Celebrate this day, so bright, 🎉

With code that's tuned just right!" ✨


Contributor

@coderabbitai coderabbitai bot left a comment

Review Status

Actionable comments generated: 0

Configuration used: CodeRabbit UI

Commits Files that changed from the base of the PR and between d20204d and 39b0bbe.
Files selected for processing (6)
  • examples/using_workspace_env.py (1 hunks)
  • pandasai/helpers/code_manager.py (2 hunks)
  • pandasai/helpers/path.py (2 hunks)
  • pandasai/schemas/df_config.py (1 hunks)
  • pandasai/smart_datalake/__init__.py (1 hunks)
  • tests/test_smartdataframe.py (2 hunks)
Files skipped from review due to trivial changes (2)
  • pandasai/helpers/path.py
  • pandasai/schemas/df_config.py
Additional comments: 6
pandasai/helpers/code_manager.py (2)
  • 8-8: The import statement for find_project_root is added correctly.

  • 236-245: The execute_code function is updated to handle the case when self._config.save_charts is False. In this case, it calls the add_save_chart function with save_charts_path_str set to the result of the find_project_root function. This change introduces a new behavior when the else block is executed. Ensure that this new behavior is intended and does not introduce any unexpected side effects.

tests/test_smartdataframe.py (2)
  • 476-476: The new key-value pair "save_charts_path": "charts" has been added to the configuration dictionary. Ensure that the directory "charts" exists or is created before saving charts to it.

  • 486-486: The assertion statement has been updated to check if the saved chart path matches the updated path "charts/{smart_dataframe.last_prompt_id}.png". Ensure that the last_prompt_id attribute is correctly set in the smart_dataframe object.

pandasai/smart_datalake/__init__.py (1)
  • 160-166: The code is trying to set the save_charts_path attribute of the _config object to the correct directory path. This is done by joining the project root path (obtained from find_project_root()) with the current save_charts_path. If find_project_root() raises a ValueError, it falls back to using the current working directory (os.getcwd()). This change is consistent with the pull request's goal of allowing the pandasai cache and charts to be stored in a different directory.
examples/using_workspace_env.py (1)
  • 1-38: The code looks good overall. However, ensure that the directory path assigned to the PANDASAI_WORKSPACE environment variable exists and is writable. Also, replace "YOUR_API_KEY" with your actual API key. Be careful not to commit sensitive information like API keys to version control systems.
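The path resolution described above for pandasai/smart_datalake/__init__.py (join the project root with save_charts_path, fall back to the current working directory on ValueError) can be sketched roughly as follows. The function name resolve_save_charts_path and the find_root parameter are hypothetical, introduced only for illustration; this is not the code from the PR.

```python
import os


def resolve_save_charts_path(save_charts_path, find_root):
    """Return an absolute chart directory based on the project root.

    `find_root` is any callable that returns the project root or raises
    ValueError when no root can be determined (hypothetical stand-in for
    find_project_root).
    """
    try:
        root = find_root()
    except ValueError:
        # Fallback described in the review: use the current working directory.
        root = os.getcwd()
    return os.path.join(root, save_charts_path)


def _no_root():
    # Stub that simulates a failed project-root lookup.
    raise ValueError("project root not found")


charts_dir = resolve_save_charts_path("charts", _no_root)
# Create the directory before anything is written to it.
os.makedirs(charts_dir, exist_ok=True)
```

Creating the directory with exist_ok=True also covers the reviewer's point that "charts" should exist before a chart is saved there.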

@ArslanSaleem ArslanSaleem requested a review from gventuri October 31, 2023 19:15
    logger=self._logger,
    file_name="temp_chart",
    save_charts_path_str=find_project_root(),
)
Collaborator

Do we really want this fallback in place? If save_charts is false, we just shouldn't save it. What do you think?

@@ -2,6 +2,9 @@


def find_project_root(filename=None):
    if "PANDASAI_WORKSPACE" in os.environ:
        return os.environ["PANDASAI_WORKSPACE"]
Collaborator

Maybe we can check if it's a valid path, and raise an error otherwise?

Contributor

@coderabbitai coderabbitai bot left a comment

Review Status

Actionable comments generated: 0

Configuration used: CodeRabbit UI

Commits Files that changed from the base of the PR and between 39b0bbe and 38e3f77.
Files selected for processing (4)
  • pandasai/exceptions.py (1 hunks)
  • pandasai/helpers/code_manager.py (2 hunks)
  • pandasai/helpers/path.py (2 hunks)
  • pandasai/smart_datalake/__init__.py (2 hunks)
Files skipped from review due to trivial changes (1)
  • pandasai/exceptions.py
Additional comments: 5
pandasai/helpers/code_manager.py (1)
  • 236-246: The execute_code function has been updated to handle the case when self._config.save_charts is False. In this case, it temporarily saves the generated chart to a directory determined by the find_project_root function. This is a good practice as it allows for temporary storage of charts for display without affecting the main chart saving functionality.
pandasai/smart_datalake/__init__.py (2)
  • 160-166: The code is trying to find the project root directory to save charts. If it fails, it falls back to the current working directory. This is a good practice as it provides a fallback mechanism in case the project root cannot be determined. However, it's important to ensure that the application has the necessary permissions to create directories and write files in these locations.

  • 439-441: The break statement is used to exit the loop once the code is successfully executed. This is a good practice as it prevents unnecessary iterations of the loop.

pandasai/helpers/path.py (2)
  • 1-21: The code checks if the "PANDASAI_WORKSPACE" environment variable is set and if it points to a valid directory. If not, it raises an InvalidWorkspacePathError. This is a good practice as it allows users to customize the workspace path and provides clear error messages when the path is invalid.

  • 45-52: > Note: This review was outside of the patch, so it was mapped to the patch with the greatest overlap. Original lines [23-52]

The function find_project_root is used to find the root directory of the project. It starts from the current working directory and moves up the directory tree until it finds a directory containing one of the specified files ("pyproject.toml", "setup.py", "requirements.txt", "pandasai.json") or a custom filename if provided. If it doesn't find a project root, it returns the current working directory. This is a common and effective way to find the project root.
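The behaviour described in the two path.py comments above can be sketched as below. This is an illustration built from the review text, not the code from the PR: the exception class is a local stand-in for the one added in pandasai/exceptions.py, the start parameter is hypothetical (added to make the sketch testable), and treating a custom filename as a replacement for the default markers is an assumption.

```python
import os


class InvalidWorkspacePathError(Exception):
    """Local stand-in for the exception added in pandasai/exceptions.py."""


# Marker files named in the review comment.
ROOT_MARKERS = ("pyproject.toml", "setup.py", "requirements.txt", "pandasai.json")


def find_project_root_sketch(filename=None, start=None):
    # 1. An explicit workspace wins, but only if it is an existing directory.
    workspace = os.environ.get("PANDASAI_WORKSPACE")
    if workspace is not None:
        if not os.path.isdir(workspace):
            raise InvalidWorkspacePathError(
                f"PANDASAI_WORKSPACE is not a valid directory: {workspace}"
            )
        return workspace

    # 2. Otherwise walk up from the start directory looking for a marker file.
    markers = (filename,) if filename else ROOT_MARKERS
    start = os.path.abspath(start or os.getcwd())
    current = start
    while True:
        if any(os.path.exists(os.path.join(current, m)) for m in markers):
            return current
        parent = os.path.dirname(current)
        if parent == current:
            # Reached the filesystem root without a match: fall back to
            # the directory the search started from.
            return start
        current = parent
```

Validating the variable at lookup time means a mistyped workspace path fails immediately with a specific error, rather than surfacing later as a failed chart or cache write.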

@codecov-commenter

Codecov Report

Merging #717 (38e3f77) into main (d20204d) will decrease coverage by 0.17%.
Report is 3 commits behind head on main.
The diff coverage is 63.63%.

@@            Coverage Diff             @@
##             main     #717      +/-   ##
==========================================
- Coverage   85.39%   85.22%   -0.17%     
==========================================
  Files          73       73              
  Lines        3553     3567      +14     
==========================================
+ Hits         3034     3040       +6     
- Misses        519      527       +8     
Files | Coverage | Δ
pandasai/exceptions.py | 92.00% <100.00%> | (+0.33%) ⬆️
pandasai/helpers/code_manager.py | 91.66% <100.00%> | (+0.06%) ⬆️
pandasai/schemas/df_config.py | 100.00% <ø> | (ø)
pandasai/smart_datalake/__init__.py | 93.66% <100.00%> | (+0.01%) ⬆️
pandasai/helpers/path.py | 73.33% <42.85%> | (-10.00%) ⬇️

... and 10 files with indirect coverage changes
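
The headline percentages can be reproduced from the Hits and Lines rows of the diff above. A short check follows, assuming the figures are truncated to two decimals rather than rounded; that assumption is inferred only from the numbers in this report, not from Codecov documentation.

```python
def coverage_basis_points(hits: int, lines: int) -> int:
    """Coverage in hundredths of a percent, truncated.

    Integer arithmetic avoids floating-point noise in the comparison.
    """
    return hits * 10000 // lines


base = coverage_basis_points(3034, 3553)  # main (d20204d)
head = coverage_basis_points(3040, 3567)  # PR head (38e3f77)

print(f"base:  {base / 100:.2f}%")            # 85.39%
print(f"head:  {head / 100:.2f}%")            # 85.22%
print(f"delta: {(head - base) / 100:+.2f}%")  # -0.17%
```

The drop comes from the 14 added lines contributing 6 new hits and 8 new misses, which is a lower ratio than the existing 85.39%.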

@gventuri gventuri changed the title from "PandasAI workspace env" to "feat: workspace env" Nov 1, 2023
@gventuri gventuri merged commit 451a843 into main Nov 1, 2023
gventuri added a commit that referenced this pull request Nov 1, 2023
* feat(pipeline): Add pipeline to generate synthetic dataframe

* chore(pipeline): maintain documentation and other flows

* feat(Pipeline): test case for pipeline

* feat(cache): adding cache in pipeline context and fix leftovers

* chore(pipeline): rename and add dependency

* update poetry lock file

* refactor: minor changes from the code review

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

* chore: update pipeline_context.py

* chore: use PandasAI logger instead of default one

* refactor: prompt for synthetic data now accepts the amount params

* remove extra print statement

* 'Refactored by Sourcery' (#703)

Co-authored-by: Sourcery AI <>

* chore(pipeline): improve pipeline usage remove passing config to pipeline

* feat: config plot libraries (#705)

* In this commit, I introduced a new configuration parameter in our application settings that allows users to define their preferred data visualization library (matplotlib, seaborn, or plotly).
With this update, I've eliminated the need for the user to specify in every prompt which library to use, thereby simplifying their interaction with the application and increasing its versatility.

* This commit adds a configuration parameter for users to set their preferred data visualization library (matplotlib, seaborn, or plotly), simplifying interactions and enhancing the application's versatility.

* viz_library_type' in test_generate_python_code_prompt.py, resolved failing tests

---------

Co-authored-by: sabatino.severino <qrxqfspfibrth6nxywai2qifza6jmskt222howzew43risnx4kva>
Co-authored-by: Gabriele Venturi <[email protected]>

* build: use ruff for formatting

* feat: add add_message method to the agent

* Release v1.4.3

* feat: workspace env (#717)

* fix(chart): charts to save to save_chart_path

* refactor sourcery changes

* 'Refactored by Sourcery'

* refactor chart save code

* fix: minor leftovers

* feat(workspace_env): add workspace env to store cache, temp chart and config

* add error handling and comments

---------

Co-authored-by: Sourcery AI <>

---------

Co-authored-by: Gabriele Venturi <[email protected]>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Co-authored-by: sourcery-ai[bot] <58596630+sourcery-ai[bot]@users.noreply.github.com>
Co-authored-by: Sab Severino <[email protected]>
gventuri added a commit that referenced this pull request Nov 7, 2023
* feat: config plot libraries (#705)

* In this commit, I introduced a new configuration parameter in our application settings that allows users to define their preferred data visualization library (matplotlib, seaborn, or plotly).
With this update, I've eliminated the need for the user to specify in every prompt which library to use, thereby simplifying their interaction with the application and increasing its versatility.

* This commit adds a configuration parameter for users to set their preferred data visualization library (matplotlib, seaborn, or plotly), simplifying interactions and enhancing the application's versatility.

* viz_library_type' in test_generate_python_code_prompt.py, resolved failing tests

---------

Co-authored-by: sabatino.severino <qrxqfspfibrth6nxywai2qifza6jmskt222howzew43risnx4kva>
Co-authored-by: Gabriele Venturi <[email protected]>

* build: use ruff for formatting

* feat: add add_message method to the agent

* Release v1.4.3

* feat: workspace env (#717)

* fix(chart): charts to save to save_chart_path

* refactor sourcery changes

* 'Refactored by Sourcery'

* refactor chart save code

* fix: minor leftovers

* feat(workspace_env): add workspace env to store cache, temp chart and config

* add error handling and comments

---------

Co-authored-by: Sourcery AI <>

* fix: hallucinations was plotting when not asked

* Release v1.4.4

* feat(sqlConnector): add direct config run sql at runtime

* feat(DirectSqlConnector): add sql test cases

* fix: minor leftovers

* fix(orders): check examples of different tables

* 'Refactored by Sourcery'

* chore(sqlprompt): add description only when we have it

---------

Co-authored-by: Sab Severino <[email protected]>
Co-authored-by: Gabriele Venturi <[email protected]>
Co-authored-by: Sourcery AI <>