feat: workspace env #717

ArslanSaleem · 2023-10-31T18:43:46Z

Change root path to store pandasai cache and charts in different directory

Summary by CodeRabbit

New Features:

Added the ability to customize the project root location via the "PANDASAI_WORKSPACE" environment variable, providing more flexibility in project setup.
Enhanced the Agent class to support chart plotting requests, improving data visualization capabilities.

Refactor:

Updated the execute_code function to handle chart saving scenarios more efficiently.

Tests:

Updated test cases to validate the correct chart saving path, ensuring the new feature works as expected.

Chores:

Removed unused import in df_config.py, improving code cleanliness.

coderabbitai · 2023-10-31T18:43:55Z

Walkthrough

The changes introduced in this commit primarily revolve around the customization of the project root location and the saving of charts in the pandasai library. The changes allow users to set a custom workspace directory and ensure that the saved charts are stored in the correct directory path.

Changes

| File | Summary |
| --- | --- |
| `examples/using_workspace_env.py` | The script now imports necessary modules, defines data, sets a custom workspace directory, initializes an OpenAI instance, creates an Agent instance, and calls the `chat` method. |
| `pandasai/helpers/code_manager.py`, `pandasai/helpers/path.py` | The `code_manager.py` file now imports the `find_project_root` function and modifies the `execute_code` function. The `path.py` file checks for the presence of the "PANDASAI_WORKSPACE" environment variable and returns its value as the project root. |
| `pandasai/schemas/df_config.py` | The import statement for `DEFAULT_CHART_DIRECTORY` has been removed. |
| `pandasai/smart_datalake/__init__.py` | The `save_charts_path` attribute of the `_config` object is now set to the correct directory path. |
| `tests/test_smartdataframe.py` | The configuration dictionary now includes a new key-value pair for "save_charts_path". The assertion statement checks if the saved chart path matches the updated path. |
| `pandasai/exceptions.py` | A new exception class `InvalidWorkspacePathError` has been added. |

Poem

🐇 "In the heart of autumn's glow, 🍂

Code changes come, steady flow. 💻

Paths are set, charts are saved, 📊

In the workspace that you've paved. 🛠️

Celebrate this day, so bright, 🎉

With code that's tuned just right!" ✨

Tips

Chat with CodeRabbit Bot (`@coderabbitai`)

If you reply to a review comment from CodeRabbit, the bot will automatically respond.
To engage with CodeRabbit bot directly around the specific lines of code in the PR, mention @coderabbitai in your review comment
Note: Review comments are made on code diffs or files, not on the PR overview.
Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.

CodeRabbit Commands (invoked as PR comments)

@coderabbitai pause to pause the reviews on a PR.
@coderabbitai resume to resume the paused reviews.
@coderabbitai review to trigger a review. This is useful when automatic reviews are disabled for the repository.
@coderabbitai help to get help.
@coderabbitai resolve to resolve all the CodeRabbit review comments.

Note: For conversation with the bot, please use the review comments on code diffs or files.

CodeRabbit Configuration File (`.coderabbit.yaml`)

You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
The JSON schema for the configuration file is available here.
If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/coderabbit-overrides.json

coderabbitai

Review Status

Actionable comments generated: 0

Configuration used: CodeRabbit UI

Commits

Files that changed from the base of the PR and between d20204d and 39b0bbe.

Files selected for processing (6)

examples/using_workspace_env.py (1 hunks)
pandasai/helpers/code_manager.py (2 hunks)
pandasai/helpers/path.py (2 hunks)
pandasai/schemas/df_config.py (1 hunks)
pandasai/smart_datalake/__init__.py (1 hunks)
tests/test_smartdataframe.py (2 hunks)

Files skipped from review due to trivial changes (2)

pandasai/helpers/path.py
pandasai/schemas/df_config.py

Additional comments: 6

pandasai/helpers/code_manager.py (2)

8-8: The import statement for find_project_root is added correctly.

236-245: The execute_code function is updated to handle the case when self._config.save_charts is False. In this case, it calls the add_save_chart function with save_charts_path_str set to the result of the find_project_root function. This change introduces a new behavior when the else block is executed. Ensure that this new behavior is intended and does not introduce any unexpected side effects.

tests/test_smartdataframe.py (2)

476-476: The new key-value pair "save_charts_path": "charts" has been added to the configuration dictionary. Ensure that the directory "charts" exists or is created before saving charts to it.

486-486: The assertion statement has been updated to check if the saved chart path matches the updated path "charts/{smart_dataframe.last_prompt_id}.png". Ensure that the last_prompt_id attribute is correctly set in the smart_dataframe object.
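
The two checks above can be sketched as a small helper. This is only an illustration under assumptions: `expected_chart_path` is a hypothetical name, not a function in pandasai or its test suite, and the prompt id would normally come from `smart_dataframe.last_prompt_id`.

```python
import os


def expected_chart_path(save_charts_path, prompt_id):
    """Create the charts directory if it is missing and return the expected PNG path.

    Hypothetical helper for illustration; mirrors the "charts/{last_prompt_id}.png"
    pattern mentioned in the review comment.
    """
    # exist_ok=True makes this safe to call when the directory already exists.
    os.makedirs(save_charts_path, exist_ok=True)
    return f"{save_charts_path}/{prompt_id}.png"
```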

pandasai/smart_datalake/__init__.py (1)

160-166: The code is trying to set the save_charts_path attribute of the _config object to the correct directory path. This is done by joining the project root path (obtained from find_project_root()) with the current save_charts_path. If find_project_root() raises a ValueError, it falls back to using the current working directory (os.getcwd()). This change is consistent with the pull request's goal of allowing the pandasai cache and charts to be stored in a different directory.
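
A minimal sketch of the fallback described in this comment, not the code from the PR. Both `resolve_charts_path` and the stand-in `find_project_root` below are written for this example; the stand-in only mimics a root lookup that may raise `ValueError`.

```python
import os


def find_project_root():
    """Stand-in for the real helper (assumption): return a root or raise ValueError."""
    workspace = os.environ.get("PANDASAI_WORKSPACE")
    if workspace is None:
        raise ValueError("project root could not be determined")
    return workspace


def resolve_charts_path(save_charts_path):
    """Join the project root with the configured charts path.

    Falls back to the current working directory when no root can be found.
    """
    try:
        root = find_project_root()
    except ValueError:
        root = os.getcwd()
    return os.path.join(root, save_charts_path)
```

With the variable unset, `resolve_charts_path("charts")` resolves relative to the current working directory.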

examples/using_workspace_env.py (1)

1-38: The code looks good overall. However, ensure that the directory path assigned to the PANDASAI_WORKSPACE environment variable exists and is writable. Also, replace "YOUR_API_KEY" with your actual API key. Be careful not to commit sensitive information like API keys to version control systems.
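
One way to act on both points before running the example is sketched below. The helper names are hypothetical, and the `OPENAI_API_KEY` variable name is an assumption for this sketch rather than something the PR defines.

```python
import os
import tempfile


def workspace_is_usable(path):
    """Return True if path is an existing directory the current process can write to."""
    return os.path.isdir(path) and os.access(path, os.W_OK)


def api_key_from_env(var_name="OPENAI_API_KEY"):
    """Read the API key from the environment instead of hard-coding it.

    The default variable name is an assumption for this sketch.
    """
    key = os.environ.get(var_name)
    if not key:
        raise RuntimeError(f"Set {var_name} before running the example")
    return key


# A temporary directory stands in for a real workspace here.
workspace = tempfile.mkdtemp()
os.environ["PANDASAI_WORKSPACE"] = workspace
workspace_ok = workspace_is_usable(os.environ["PANDASAI_WORKSPACE"])
```

Reading the key from the environment keeps it out of the committed example file.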

gventuri · 2023-10-31T20:40:25Z

pandasai/helpers/code_manager.py

+                logger=self._logger,
+                file_name="temp_chart",
+                save_charts_path_str=find_project_root(),
+            )


Do we really want this fallback in place? If `save_charts` is false, we just shouldn't save it. What do you think?

gventuri · 2023-10-31T20:41:18Z

pandasai/helpers/path.py

@@ -2,6 +2,9 @@


 def find_project_root(filename=None):
+    if "PANDASAI_WORKSPACE" in os.environ:
+        return os.environ["PANDASAI_WORKSPACE"]


Maybe we can check if it's a valid path, and raise an error otherwise?

coderabbitai

Review Status

Actionable comments generated: 0

Configuration used: CodeRabbit UI

Commits

Files that changed from the base of the PR and between 39b0bbe and 38e3f77.

Files selected for processing (4)

pandasai/exceptions.py (1 hunks)
pandasai/helpers/code_manager.py (2 hunks)
pandasai/helpers/path.py (2 hunks)
pandasai/smart_datalake/__init__.py (2 hunks)

Files skipped from review due to trivial changes (1)

pandasai/exceptions.py

Additional comments: 5

pandasai/helpers/code_manager.py (1)

236-246: The execute_code function has been updated to handle the case when self._config.save_charts is False. In this case, it temporarily saves the generated chart to a directory determined by the find_project_root function. This is a good practice as it allows for temporary storage of charts for display without affecting the main chart saving functionality.

pandasai/smart_datalake/__init__.py (2)

160-166: The code is trying to find the project root directory to save charts. If it fails, it falls back to the current working directory. This is a good practice as it provides a fallback mechanism in case the project root cannot be determined. However, it's important to ensure that the application has the necessary permissions to create directories and write files in these locations.

439-441: The break statement is used to exit the loop once the code is successfully executed. This is a good practice as it prevents unnecessary iterations of the loop.

pandasai/helpers/path.py (2)

1-21: The code checks if the "PANDASAI_WORKSPACE" environment variable is set and if it points to a valid directory. If not, it raises an InvalidWorkspacePathError. This is a good practice as it allows users to customize the workspace path and provides clear error messages when the path is invalid.
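
The check described here could look roughly like the sketch below. It is an approximation, not the merged code: the exception class is a local stand-in for the one added in `pandasai/exceptions.py` (its real base class is not shown in this thread), and `workspace_from_env` is a hypothetical name.

```python
import os


class InvalidWorkspacePathError(Exception):
    """Local stand-in for the exception added in pandasai/exceptions.py (assumption)."""


def workspace_from_env():
    """Return the PANDASAI_WORKSPACE directory, or None when the variable is unset.

    Raises InvalidWorkspacePathError when it is set but is not an existing directory.
    """
    workspace = os.environ.get("PANDASAI_WORKSPACE")
    if workspace is None:
        return None
    if not os.path.isdir(workspace):
        raise InvalidWorkspacePathError(
            f"PANDASAI_WORKSPACE does not point to a valid directory: {workspace!r}"
        )
    return workspace
```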

45-52: > Note: This review was outside of the patch, so it was mapped to the patch with the greatest overlap. Original lines [23-52]

The function find_project_root is used to find the root directory of the project. It starts from the current working directory and moves up the directory tree until it finds a directory containing one of the specified files ("pyproject.toml", "setup.py", "requirements.txt", "pandasai.json") or a custom filename if provided. If it doesn't find a project root, it returns the current working directory. This is a common and effective way to find the project root.
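
The upward search described above can be sketched as follows. This is an illustration of the technique, not the library's implementation: `find_root_by_markers` is a hypothetical name, and treating the custom filename as an extra marker alongside the defaults is an assumption.

```python
import os

# Marker files named in the review comment.
MARKER_FILES = ("pyproject.toml", "setup.py", "requirements.txt", "pandasai.json")


def find_root_by_markers(start_dir=None, filename=None):
    """Walk up from start_dir until a directory containing a marker file is found.

    Returns the starting directory when the filesystem root is reached without a match.
    """
    start_dir = os.path.abspath(start_dir or os.getcwd())
    # Assumption: a custom filename is checked in addition to the default markers.
    markers = MARKER_FILES + ((filename,) if filename else ())
    current = start_dir
    while True:
        if any(os.path.isfile(os.path.join(current, name)) for name in markers):
            return current
        parent = os.path.dirname(current)
        if parent == current:  # reached the filesystem root, nothing found
            return start_dir
        current = parent
```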

codecov-commenter · 2023-11-01T12:42:50Z

Codecov Report

Merging #717 (38e3f77) into main (d20204d) will decrease coverage by 0.17%.
Report is 3 commits behind head on main.
The diff coverage is 63.63%.

@@            Coverage Diff             @@
##             main     #717      +/-   ##
==========================================
- Coverage   85.39%   85.22%   -0.17%     
==========================================
  Files          73       73              
  Lines        3553     3567      +14     
==========================================
+ Hits         3034     3040       +6     
- Misses        519      527       +8
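
The percentages in the diff above follow from hits divided by lines. A short check, assuming the report truncates to two decimals rather than rounding (an inference from these numbers, not something stated in the report):

```python
import math


def coverage_percent(hits, lines):
    """Coverage as a percentage, truncated (not rounded) to two decimals.

    Truncation is an assumption inferred from the figures in the report.
    """
    return math.floor(hits / lines * 10000) / 100


base = coverage_percent(3034, 3553)  # main
head = coverage_percent(3040, 3567)  # this PR
delta = round(head - base, 2)
```

Hits plus misses equals the line count in both columns (3034 + 519 = 3553 and 3040 + 527 = 3567).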

| Files | Coverage Δ | |
| --- | --- | --- |
| pandasai/exceptions.py | `92.00% <100.00%> (+0.33%)` | ⬆️ |
| pandasai/helpers/code_manager.py | `91.66% <100.00%> (+0.06%)` | ⬆️ |
| pandasai/schemas/df_config.py | `100.00% <ø> (ø)` | |
| pandasai/smart_datalake/__init__.py | `93.66% <100.00%> (+0.01%)` | ⬆️ |
| pandasai/helpers/path.py | `73.33% <42.85%> (-10.00%)` | ⬇️ |

... and 10 files with indirect coverage changes

* feat(pipeline): Add pipeline to generate synthetic dataframe
* chore(pipeline): maintain documentation and other flows
* feat(Pipeline): test case for pipeline
* feat(cache): adding cache in pipeline context and fix leftovers
* chore(pipeline): rename and add dependency
* update poetry lock file
* refactor: minor changes from the code review
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
* chore: update pipeline_context.py
* chore: use PandasAI logger instead of default one
* refactor: prompt for synthetic data now accepts the amount params
* remove extra print statement
* 'Refactored by Sourcery' (#703)
Co-authored-by: Sourcery AI <>
* chore(pipeline): improve pipeline usage remove passing config to pipeline
* feat: config plot libraries (#705)
* In this commit, I introduced a new configuration parameter in our application settings that allows users to define their preferred data visualization library (matplotlib, seaborn, or plotly). With this update, I've eliminated the need for the user to specify in every prompt which library to use, thereby simplifying their interaction with the application and increasing its versatility.
* This commit adds a configuration parameter for users to set their preferred data visualization library (matplotlib, seaborn, or plotly), simplifying interactions and enhancing the application's versatility.
* viz_library_type' in test_generate_python_code_prompt.py, resolved failing tests
---------
Co-authored-by: sabatino.severino <qrxqfspfibrth6nxywai2qifza6jmskt222howzew43risnx4kva>
Co-authored-by: Gabriele Venturi <[email protected]>
* build: use ruff for formatting
* feat: add add_message method to the agent
* Release v1.4.3
* feat: workspace env (#717)
* fix(chart): charts to save to save_chart_path
* refactor sourcery changes
* 'Refactored by Sourcery'
* refactor chart save code
* fix: minor leftovers
* feat(workspace_env): add workspace env to store cache, temp chart and config
* add error handling and comments
---------
Co-authored-by: Sourcery AI <>
---------
Co-authored-by: Gabriele Venturi <[email protected]>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Co-authored-by: sourcery-ai[bot] <58596630+sourcery-ai[bot]@users.noreply.github.com>
Co-authored-by: Sab Severino <[email protected]>

* feat: config plot libraries (#705)
* In this commit, I introduced a new configuration parameter in our application settings that allows users to define their preferred data visualization library (matplotlib, seaborn, or plotly). With this update, I've eliminated the need for the user to specify in every prompt which library to use, thereby simplifying their interaction with the application and increasing its versatility.
* This commit adds a configuration parameter for users to set their preferred data visualization library (matplotlib, seaborn, or plotly), simplifying interactions and enhancing the application's versatility.
* viz_library_type' in test_generate_python_code_prompt.py, resolved failing tests
---------
Co-authored-by: sabatino.severino <qrxqfspfibrth6nxywai2qifza6jmskt222howzew43risnx4kva>
Co-authored-by: Gabriele Venturi <[email protected]>
* build: use ruff for formatting
* feat: add add_message method to the agent
* Release v1.4.3
* feat: workspace env (#717)
* fix(chart): charts to save to save_chart_path
* refactor sourcery changes
* 'Refactored by Sourcery'
* refactor chart save code
* fix: minor leftovers
* feat(workspace_env): add workspace env to store cache, temp chart and config
* add error handling and comments
---------
Co-authored-by: Sourcery AI <>
* fix: hallucinations was plotting when not asked
* Release v1.4.4
* feat(sqlConnector): add direct config run sql at runtime
* feat(DirectSqlConnector): add sql test cases
* fix: minor leftovers
* fix(orders): check examples of different tables
* 'Refactored by Sourcery'
* chore(sqlprompt): add description only when we have it
---------
Co-authored-by: Sab Severino <[email protected]>
Co-authored-by: Gabriele Venturi <[email protected]>
Co-authored-by: Sourcery AI <>

ArslanSaleem and others added 8 commits October 31, 2023 00:19

fix(chart): charts to save to save_chart_path

9de41a5

refactor sourcery changes

bd1786e

'Refactored by Sourcery'

d1452cf

Merge pull request #711 from gventuri/sourcery/fix/chart_save

2847ade

fix(chart): charts to save to save_chart_path (Sourcery refactored)

refactor chart save code

df4087b

fix: minor leftovers

3a5cd93

feat(workspace_env): add workspace env to store cache, temp chart and…

c5a8d83

… config

Merge branch 'main' into add_workspace_env

39b0bbe

coderabbitai bot reviewed Oct 31, 2023

ArslanSaleem requested a review from gventuri October 31, 2023 19:15

gventuri reviewed Nov 1, 2023

add error handling and comments

38e3f77

coderabbitai bot reviewed Nov 1, 2023

gventuri changed the title ~~PandasAI workspace env~~ feat: workspace env Nov 1, 2023

gventuri merged commit 451a843 into main Nov 1, 2023
