
Commit: Further adjust the prompts (#1)

pramitchoudhary committed May 18, 2023
1 parent 7b4abc5 · commit 6ffd616

Showing 1 changed file with 21 additions and 23 deletions.

sidekick/configs/prompt_template.py (44 changes: 21 additions & 23 deletions)
@@ -1,16 +1,16 @@
 TASK_PROMPT = {
     "system_prompt": "Act as a Data Analyst",
     "user_prompt": """
-### For Table: {_table_name} Given an input *Question*, only return specific and informative tasks as an ordered itemized list for SQL generation that answer the question.
-Extract all of the proper nouns (generally capitalized, abbreviated) from the Samples section and add to Context section as Key, Value pair.
-Use the Context section and Samples section to establish relationship when tokens from Question does not match column names.
-If information is not found in Context or Samples section, attempt to reason for possible tasks but also ask questions for.
-Infer the return type of the Question. Do not generate final SQL response, only return tasks.
-# Data information: \n{_data_info}
-# Samples: \n{_sample_queries}
-# Context: {_context}
-# *Question*: {_question_str};
-# Output: Tasks: ordered list of tasks
+### Given an input *Question*, only return specific and informative tasks as an ordered numeric list for SQL generation that answer the question.
+Use the *History* and *Context* section for co-reference and to infer relationships.
+If the words in the *Question* do not match column names *Data* section; Search for them in *Context* section.
+Always use *Context* with highest similarity score with the *Question*.
+If no information related to the *Question* is found; attempt to predict and reason for possible tasks.
+Infer the return type of the Question. Do not generate final complete SQL response, only return tasks.
+# *Data:* \nFor table {_table_name} schema info is mentioned below,\n{_data_info}
+# *History*: \n{_sample_queries}
+# *Question*: For Table {_table_name}, {_question_str}, *Context*: {_context}
+# Output: Tasks: ordered numeric list of tasks
 """,
 }
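
A minimal usage sketch (not part of this commit) of how the reworked TASK_PROMPT can be rendered with str.format; the table name, schema string, and sample values below are hypothetical:

from sidekick.configs.prompt_template import TASK_PROMPT

# Hypothetical inputs; real values would come from the application's data loaders.
rendered = TASK_PROMPT["user_prompt"].format(
    _table_name="sleep_health",
    _data_info="id INT, occupation TEXT, sleep_duration FLOAT",
    _sample_queries="Q: average sleep duration?\nA: SELECT AVG(sleep_duration) FROM sleep_health",
    _question_str="What is the average sleep duration by occupation?",
    _context="occupation: column 'occupation'",
)
print(TASK_PROMPT["system_prompt"])  # "Act as a Data Analyst"
print(rendered)                      # task-generation prompt sent to the LLM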

@@ -21,23 +21,23 @@
 # Reference: https://arxiv.org/pdf/2005.14165.pdf
 QUERY_PROMPT = """
 ### System: Act as a SQL Expert
-# Given an input *Question*, only generate syntactically correct SQL queries using step by step reasoning from Tasks section.
-# Extract all of the proper nouns (generally capitalized, abbreviated) from the Examples section and add to Context section as Key, Value pair.
-# Use the context section to establish relationship when tokens from Question does not match column names.
+# Given an input *Question*, only generate syntactically correct SQL queries using step by step reasoning from *Tasks* section.
 # Pick the SQL query which has the highest average log probability of explaining the
-candidate question.
+candidate *Question*.
 ### {dialect} SQL tables
-Examples:\n{_sample_queries}
-### *Question*: {_question};
+### *History*:\n{_sample_queries}
+### *Question*: {_question}
 # SELECT 1
-### Tasks:\n{_tasks}
-### Context: {_context}
-### Suggestions:
-# Don't use aggregate and window function together;
-# Avoid COUNT(*) and prefer COUNT(1);
+### *Tasks*:\n{_tasks}
+### *Policies for SQL generation*:
+# Avoid overly complex SQL queries
+# Don't use aggregate and window function together
+# Use COUNT(1) instead of COUNT(*)
 # Return with LIMIT 100
-# Prefer NOT EXISTS to LEFT JOIN ON null id;
-# Avoid using the WITH statement;
+# Prefer NOT EXISTS to LEFT JOIN ON null id
+# Avoid using the WITH statement
 # When using DESC keep NULLs at the end
 # If JSONB format found in Table schema, do pattern matching on keywords from the question
 # Add explanation
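
A similar sketch for QUERY_PROMPT, assuming the placeholders visible in this hunk ({dialect}, {_sample_queries}, {_question}, {_tasks}) are the only ones; the template continues past the hunk, so additional fields may exist. The values are illustrative, and _tasks would normally come from the TASK_PROMPT step:

from sidekick.configs.prompt_template import QUERY_PROMPT

# Illustrative values; note that _context is no longer a field after this commit.
sql_prompt = QUERY_PROMPT.format(
    dialect="postgres",
    _sample_queries="Q: average sleep duration?\nA: SELECT AVG(sleep_duration) FROM sleep_health",
    _question="What is the average sleep duration by occupation?",
    _tasks="1. Group rows by occupation\n2. Compute AVG(sleep_duration) per group",
)
print(sql_prompt)  # complete SQL-generation prompt for the LLM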
