fixes/in release2.0 (#951)

* fix(output_type): handle errors for wrong output type (#866)
* fix(output_type): handle errors for output type
* fix: leftovers
* fix: test case to mock format-response
* fix: upgrade duckdb
* Release v1.5.15
* fix(sql): use only added tables of connector (#869)
* fix(sql): use only added tables of connector
* leftover file
* chore: rephrase the error message
  Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
* fix(sql): fix test cases and improve output error message
  ---------
  Co-authored-by: Gabriele Venturi <[email protected]>
  Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
* feat/integration_testing test cases created (#873)
* feat/integration_testing test cases created based Loan Payments data
* feat/integration_testing four more datasets added
* fix/moved csv datasets to integration folder
* fix/changed pytest command to run only in tests folder
* fix/changed pytest command to run only in tests folder
  ---------
  Co-authored-by: Milind Lalwani <[email protected]>
* feat(helpers): add gpt-3.5-turbo-1106 fine-tuned (#876)
* fix: rephrase query (#872)
* Fixed Agent rephrase_query Error
* Fixed Agent rephrase_query Error
* add encoding
  ---------
  Co-authored-by: Pranab Pathak <[email protected]>
* Release v1.5.16
* fix(code manager): parsing of called functions (#883)
* feat(helpers): add gpt-3.5-turbo-1106 fine-tuned
* fix(code manager): parsing of called functions
* fmt
* fix: open charts return on format plot (#881)
  Co-authored-by: Long Le <[email protected]>
* feat(project): add Makefile, re-lint project, restructure tests (#884)
* feat(project): add Makefile, re-lint project, restructure tests
* unused imports
* Release v1.5.17
* docs:pdate examples.md (#887)
  Minor fix of exmaples.md
* refactor: TypeVar for IResponseParser (#889) (#890)
* refactor: TypeVar for IResponseParser (#889)
* (refactor): introduce TypeVar for IResponseParser implementation in output_logic_unit.py
* (fix): add missing call of super().__init__() in ProcessOutput class
* refactor: TypeVar for IResponseParser (#889)
* (style): linter fail at output_logic_unit.py
* [fix] logging chart saving only if code contains chart (#897)
  Co-authored-by: Lorenzobattistela <[email protected]>
* fix(airtable): use personal access token instead of api key
  Api key has been deprecated: https://airtable.com/developers/web/api/authentication
* feat: add df summarization shortcut (#901)
* fix: badge for "Open in Colab" (#903)
* feat: add support Google Gemini API in LLMs (#902)
* Updating shortcuts to include df summarization
* Updating support for google gemini models
* Make google-generativeai package optional
* fix: upgrade google-ai
  ---------
  Co-authored-by: Gabriele Venturi <[email protected]>
* Release v1.5.18
* docs: update Google Colab Badge (#914)
  The existing badge's signature was somehow seems to be expired so just add Google Colab's officially provided svg badge.
* docs: Rectify code examples by adding missing statements (#915)
  Anyone who is quite new to python won't be able to simplify code errors when directly copied code demo from official website. Updated code example is taken from the root README.md and tested.
* feat: add support for modin (#907)
* feat: add support for modin
* fix(ci): dev deps
* update docs
* add some tests
* upate contributing guidelines
* fix helpers
* fix docs example
* update pandasai/smart_dataframe/__init__.py
* Release v1.5.19
* feat: update OpenAI pricing
* chore: add Flask openai example (#941)
* 'Refactored by Sourcery'
* chore: add Flask html example
* chore: add Flask openai example
* sourcery refactor integration tests
* fix: Flask package install set to optional
  ---------
  Co-authored-by: Sourcery AI <>
* chore: restore smart dataframe and smartdatalake functionalities
* fix ruff formatting
* fix: yahoo connector
* fix: file import sorting
* fix: import sorting
* fix: module import
* ignore integration_tests
* fix: ruff
* ruff fix
* remove integration folder
* fix: ci workflow
* fix: modin
* fix: ruff imports
* fix(plot): always pass one lib for plotting in updated prompt
* fix: query tracker track code execution
* fix: add skills in query tracker and rag to return one sample by default
* fix: function call check and query tracker tracking

---------

Co-authored-by: Gabriele Venturi <[email protected]>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Co-authored-by: milind-sinaptik <[email protected]>
Co-authored-by: Milind Lalwani <[email protected]>
Co-authored-by: Massimiliano Pronesti <[email protected]>
Co-authored-by: Pranab1011 <[email protected]>
Co-authored-by: Pranab Pathak <[email protected]>
Co-authored-by: Lh Long <[email protected]>
Co-authored-by: Long Le <[email protected]>
Co-authored-by: PVA <[email protected]>
Co-authored-by: Ihor <[email protected]>
Co-authored-by: Lorenzo Battistela <[email protected]>
Co-authored-by: Lorenzobattistela <[email protected]>
Co-authored-by: Sparsh Jain <[email protected]>
Co-authored-by: Devashish Datt Mamgain <[email protected]>
Co-authored-by: Hemant Sachdeva <[email protected]>
Co-authored-by: aloha-fim <[email protected]>
sinaptik-ai · Feb 21, 2024 · 4a2a9fe
1 parent f73138e
Showing 6 changed files with 56 additions and 20 deletions.
diff --git a/pandasai/helpers/code_manager.py b/pandasai/helpers/code_manager.py
@@ -46,6 +46,24 @@ def __init__(
         self.prompt_id = prompt_id
 
 
+class FunctionCallVisitor(ast.NodeVisitor):
+    """
+    Iterate over the code to find function calls
+    """
+
+    def __init__(self):
+        self.function_calls = []
+
+    def visit_Call(self, node):
+        if isinstance(node.func, ast.Name):
+            self.function_calls.append(node.func.id)
+        elif isinstance(node.func, ast.Attribute) and isinstance(
+            node.func.value, ast.Name
+        ):
+            self.function_calls.append(f"{node.func.value.id}.{node.func.attr}")
+        self.generic_visit(node)
+
+
 class CodeManager:
     _dfs: List
     _config: Union[Config, dict]
@@ -82,6 +100,7 @@ def __init__(
         self._dfs = dfs
         self._config = config
         self._logger = logger
+        self._function_call_vistor = FunctionCallVisitor()
 
     def _required_dfs(self, code: str) -> List[str]:
         """
@@ -318,14 +337,6 @@ def check_skill_func_def_exists(self, node: ast.AST, context: CodeExecutionConte
             node, ast.FunctionDef
         ) and context.skills_manager.skill_exists(node.name)
 
-    def check_direct_sql_func_usage_exists(self, node: ast.AST):
-        return (
-            self._validate_direct_sql(self._dfs)
-            and isinstance(node.value, ast.Call)
-            and isinstance(node.value.func, ast.Name)
-            and node.value.func.id == "execute_sql_query"
-        )
-
     def _validate_direct_sql(self, dfs: List[BaseConnector]) -> bool:
         """
         Raises error if they don't belong sqlconnector or have different credentials
@@ -384,6 +395,9 @@ def _clean_code(self, code: str, context: CodeExecutionContext) -> str:
         new_body = []
         execute_sql_query_used = False
 
+        # find function calls
+        self._function_call_vistor.visit(tree)
+
         for node in tree.body:
             if isinstance(node, (ast.Import, ast.ImportFrom)):
                 self._check_imports(node)
@@ -405,7 +419,10 @@ def _clean_code(self, code: str, context: CodeExecutionContext) -> str:
                 continue
 
             # if generated code contain execute_sql_query usage
-            if self.check_direct_sql_func_usage_exists(node):
+            if (
+                self._validate_direct_sql(self._dfs)
+                and "execute_sql_query" in self._function_call_vistor.function_calls
+            ):
                 execute_sql_query_used = True
 
             # Sanity for sql query the code should only use allowed tables
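
Review note on the hunks above: the removed `check_direct_sql_func_usage_exists` only inspected the direct value of a statement, so a call whose result is subscripted was not recognised. The sketch below reproduces the `FunctionCallVisitor` from this diff in isolation and compares both checks on the statement used in the new unit test. The variable names `old_check` and `new_check` are illustrative and do not appear in the codebase.

```python
import ast


class FunctionCallVisitor(ast.NodeVisitor):
    """Collect the name of every function called anywhere in a syntax tree."""

    def __init__(self):
        self.function_calls = []

    def visit_Call(self, node):
        if isinstance(node.func, ast.Name):
            # Plain call such as execute_sql_query(...)
            self.function_calls.append(node.func.id)
        elif isinstance(node.func, ast.Attribute) and isinstance(
            node.func.value, ast.Name
        ):
            # Attribute call on a simple name such as pd.read_csv(...)
            self.function_calls.append(f"{node.func.value.id}.{node.func.attr}")
        # Keep walking so nested calls are recorded too
        self.generic_visit(node)


code = "orders_count = execute_sql_query('SELECT COUNT(*) FROM orders')[0][0]"
tree = ast.parse(code)

# Old approach: look only at the assignment's direct value.
# Here that value is a Subscript wrapping the call, not the call itself.
assign = tree.body[0]
old_check = isinstance(assign.value, ast.Call)

# New approach: walk the whole tree and look the name up afterwards.
visitor = FunctionCallVisitor()
visitor.visit(tree)
new_check = "execute_sql_query" in visitor.function_calls

print(old_check, new_check)  # False True
```

One thing to keep in mind when reading the diff: the visitor is created once in `__init__` and, as far as these hunks show, `function_calls` is never cleared, so names found in one `_clean_code` call would still be present on the next call with the same `CodeManager`.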

diff --git a/pandasai/helpers/query_exec_tracker.py b/pandasai/helpers/query_exec_tracker.py
@@ -92,17 +92,11 @@ def add_step(self, step: dict) -> None:
         Args:
             step (dict): dictionary containing information
         """
-        # Exception step to store serializable output response from the generated code
-        if (
-            "type" in step
-            and step["type"] == "CodeExecution"
-            and step["data"] is not None
-            and step["data"]["content_type"] == "response"
-        ):
-            self._response = step["data"]["value"]
-
         self._steps.append(step)
 
+    def set_final_response(self, response: Any):
+        self._response = response
+
     def execute_func(self, function, *args, **kwargs) -> Any:
         """
         Tracks function executions, calculates execution time and prepare data
@@ -236,11 +230,11 @@ def publish(self) -> None:
         server_url = None
 
         if self._server_config is None:
-            server_url = os.environ.get("PANDASAI_API_URL")
+            server_url = os.environ.get("PANDASAI_API_URL", "https://api.domer.ai")
             api_key = os.environ.get("PANDASAI_API_KEY")
         else:
             server_url = self._server_config.get(
-                "server_url", os.environ.get("PANDASAI_API_URL")
+                "server_url", os.environ.get("PANDASAI_API_URL", "https://api.domer.ai")
             )
             api_key = self._server_config.get(
                 "api_key", os.environ.get("PANDASAI_API_KEY")

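Review note on the `publish` hunk above: the server URL now appears to resolve in three steps, an explicit `server_url` in the server config, then the `PANDASAI_API_URL` environment variable, then the hard-coded default. A minimal sketch of that lookup order follows; `resolve_server_url` and its `environ` parameter are hypothetical helpers written for this illustration, not part of the library.

```python
import os
from typing import Mapping, Optional

# Default taken from the diff above
DEFAULT_API_URL = "https://api.domer.ai"


def resolve_server_url(
    server_config: Optional[dict] = None,
    environ: Optional[Mapping[str, str]] = None,
) -> str:
    """Hypothetical helper mirroring the lookup order in publish()."""
    environ = os.environ if environ is None else environ
    # Environment variable first, falling back to the default URL
    env_url = environ.get("PANDASAI_API_URL", DEFAULT_API_URL)
    if server_config is None:
        return env_url
    # An explicit config value wins over the environment
    return server_config.get("server_url", env_url)


print(resolve_server_url(None, {}))
print(resolve_server_url(None, {"PANDASAI_API_URL": "http://localhost:8000"}))
print(resolve_server_url({"server_url": "https://example.test"}, {}))
```

Note that the diff adds a default only for the URL. `PANDASAI_API_KEY` still resolves to `None` when it is not set anywhere.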
diff --git a/pandasai/pipelines/chat/code_execution.py b/pandasai/pipelines/chat/code_execution.py
@@ -105,6 +105,7 @@ def execute(self, input: Any, **kwargs) -> Any:
             True,
             "Code Executed Successfully",
             {"content_type": "response", "value": ResponseSerializer.serialize(result)},
+            final_track_output=True,
         )
 
     def _retry_run_code(

diff --git a/pandasai/pipelines/logic_unit_output.py b/pandasai/pipelines/logic_unit_output.py
@@ -12,15 +12,18 @@ class LogicUnitOutput:
     message: str
     success: bool
     metadata: dict
+    final_track_output: bool
 
     def __init__(
         self,
         output: Any = None,
         success: bool = False,
         message: str = None,
         metadata: dict = None,
+        final_track_output: bool = False,
     ):
         self.output = output
         self.message = message
         self.metadata = metadata
         self.success = success
+        self.final_track_output = final_track_output
diff --git a/pandasai/pipelines/pipeline.py b/pandasai/pipelines/pipeline.py
@@ -119,6 +119,11 @@ def run(self, data: Any = None) -> Any:
                         }
                     )
 
+                    if step_output.final_track_output:
+                        self._query_exec_tracker.set_final_response(
+                            step_output.metadata
+                        )
+
                     data = step_output.output
                 else:
                     data = step_output
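
Review note on the tracker changes: instead of `add_step` sniffing the step dictionary for a `CodeExecution` response, a step now marks its own output with `final_track_output=True` and the pipeline forwards the metadata to `set_final_response`. The sketch below shows that flow end to end under simplifying assumptions: `TrackerStub`, `run_steps`, `generate`, `execute`, and the keys of the recorded step dictionary are illustrative stand-ins, and `LogicUnitOutput` is a reduced copy of the class in this diff.

```python
from typing import Any, Callable, List, Optional


class LogicUnitOutput:
    """Reduced copy of the class in the diff, kept only so the sketch runs."""

    def __init__(
        self,
        output: Any = None,
        success: bool = False,
        message: Optional[str] = None,
        metadata: Optional[dict] = None,
        final_track_output: bool = False,
    ):
        self.output = output
        self.message = message
        self.metadata = metadata
        self.success = success
        self.final_track_output = final_track_output


class TrackerStub:
    """Hypothetical stand-in for the query execution tracker."""

    def __init__(self):
        self.steps: List[dict] = []
        self.response: Any = None

    def add_step(self, step: dict) -> None:
        self.steps.append(step)

    def set_final_response(self, response: Any) -> None:
        self.response = response


def run_steps(steps: List[Callable[[Any], Any]], tracker: TrackerStub, data: Any = None) -> Any:
    """Simplified version of the pipeline loop touched by this diff."""
    for step in steps:
        step_output = step(data)
        if isinstance(step_output, LogicUnitOutput):
            # The step dictionary keys here are illustrative only
            tracker.add_step({"success": step_output.success, "data": step_output.metadata})
            if step_output.final_track_output:
                tracker.set_final_response(step_output.metadata)
            data = step_output.output
        else:
            data = step_output
    return data


def generate(_: Any) -> LogicUnitOutput:
    return LogicUnitOutput("result = 40 + 2", True, "Code Generated", {"content_type": "code"})


def execute(_: Any) -> LogicUnitOutput:
    return LogicUnitOutput(
        42,
        True,
        "Code Executed Successfully",
        {"content_type": "response", "value": 42},
        final_track_output=True,
    )


tracker = TrackerStub()
final = run_steps([generate, execute], tracker)
print(final)               # 42
print(tracker.response)    # {'content_type': 'response', 'value': 42}
print(len(tracker.steps))  # 2
```

Judging only from the hunks shown, the stored response is now the whole metadata dictionary, whereas the removed branch in `add_step` stored `step["data"]["value"]`. Consumers of the tracked response may need to account for that.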

diff --git a/tests/unit_tests/test_codemanager.py b/tests/unit_tests/test_codemanager.py
@@ -1,4 +1,5 @@
 """Unit tests for the CodeManager class"""
+
 import ast
 import uuid
 from typing import Optional
@@ -564,6 +565,21 @@ def test_clean_code_with_no_execute_sql_query_usage(
             "For Direct SQL set to true, execute_sql_query function must be used. Generating Error Prompt!!!"
         )
 
+    def test_clean_code_with_no_execute_sql_query_usage_script(
+        self,
+        pgsql_connector: PostgreSQLConnector,
+        exec_context: MagicMock,
+        config_with_direct_sql: Config,
+        logger: Logger,
+    ):
+        """Test that the correct sql table"""
+        code_manager = CodeManager([pgsql_connector], config_with_direct_sql, logger)
+        safe_code = (
+            """orders_count = execute_sql_query('SELECT COUNT(*) FROM orders')[0][0]"""
+        )
+
+        assert code_manager._clean_code(safe_code, exec_context) == safe_code
+
     def test_clean_code_using_incorrect_sql_table(
         self,
         pgsql_connector: PostgreSQLConnector,