Commit
fix FE and move prompts to text file
noah-paige committed Oct 24, 2024
1 parent 9bca372 commit f6f12c1
Showing 13 changed files with 125 additions and 156 deletions.
21 changes: 12 additions & 9 deletions backend/dataall/modules/worksheets/aws/bedrock_client.py
Original file line number Diff line number Diff line change
@@ -3,11 +3,11 @@
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import PromptTemplate
from dataall.base.db import exceptions
from dataall.modules.worksheets.aws.bedrock_prompts import (
SQL_EXAMPLES,
TEXT_TO_SQL_PROMPT_TEMPLATE,
PROCESS_TEXT_PROMPT_TEMPLATE,
)
import os

TEXT_TO_SQL_EXAMPLES_PATH = os.path.join(os.path.dirname(__file__), 'bedrock_prompts', 'text_to_sql_examples.txt')
TEXT_TO_SQL_TEMPLATE_PATH = os.path.join(os.path.dirname(__file__), 'bedrock_prompts', 'test_to_sql_template.txt')
PROCESS_TEXT_TEMPLATE_PATH = os.path.join(os.path.dirname(__file__), 'bedrock_prompts', 'process_text_template.txt')


class BedrockClient:
@@ -29,16 +29,19 @@ def __init__(self):
)

def invoke_model_text_to_sql(self, prompt: str, metadata: str):
prompt_template = PromptTemplate.from_template(TEXT_TO_SQL_PROMPT_TEMPLATE)

prompt_template = PromptTemplate.from_file(TEXT_TO_SQL_TEMPLATE_PATH)
chain = prompt_template | self._model | StrOutputParser()
response = chain.invoke({'prompt': prompt, 'context': metadata, 'examples': SQL_EXAMPLES})

with open(TEXT_TO_SQL_EXAMPLES_PATH, 'r') as f:
examples = f.read()

response = chain.invoke({'prompt': prompt, 'context': metadata, 'examples': examples})
if response.startswith('Error:'):
raise exceptions.ModelGuardrailException(response)
return response

def invoke_model_process_text(self, prompt: str, content: str):
prompt_template = PromptTemplate.from_template(PROCESS_TEXT_PROMPT_TEMPLATE)
prompt_template = PromptTemplate.from_file(PROCESS_TEXT_TEMPLATE_PATH)

chain = prompt_template | self._model | StrOutputParser()
response = chain.invoke({'prompt': prompt, 'content': content})
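The hunk above opens and reads `text_to_sql_examples.txt` on every call to `invoke_model_text_to_sql`. As a minimal sketch, assuming the prompt files do not change while the process is running, the read could be cached. The helper name `load_prompt_text` is hypothetical and not part of this commit, and the temporary file only stands in for the real examples file.

```python
import os
import tempfile
from functools import lru_cache


@lru_cache(maxsize=None)
def load_prompt_text(path: str) -> str:
    """Read a prompt file once; later calls with the same path reuse the result.

    Hypothetical helper, not part of this commit.
    """
    with open(path, 'r', encoding='utf-8') as f:
        return f.read()


# Throwaway file standing in for text_to_sql_examples.txt.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, 'examples.txt')
    with open(demo_path, 'w', encoding='utf-8') as f:
        f.write('Example 1.\n')

    first = load_prompt_text(demo_path)   # reads from disk
    second = load_prompt_text(demo_path)  # served from the cache
    print(first == second)
```

The trade-off is that an edit to the file at runtime would not be picked up until the process restarts, which may or may not be acceptable here.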
104 changes: 0 additions & 104 deletions backend/dataall/modules/worksheets/aws/bedrock_prompts.py

This file was deleted.

Original file line number Diff line number Diff line change
@@ -0,0 +1,14 @@
You are an AI assistant tasked with analyzing and processing text content. Your goal is to provide accurate and helpful responses based on the given content and user prompt.
You must follow these steps:

1. Determine if the document has the information needed to answer the question. If not, respond with "Error: The Document does not provide the information needed to answer your question"
2. I want you to answer the question based on the information in the document.
3. At the bottom I want you to provide the sources (the parts of the document where you found the results). The sources should be listed in order


Content to analyze:
{content}

User prompt: {prompt}

Please provide a response that addresses the user's prompt in the context of the given content. Be thorough, accurate, and helpful in your analysis.
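The template above has two placeholders, `{content}` and `{prompt}`, and instructs the model to begin its reply with `Error:` when the document cannot answer the question. The following is a rough sketch of that round trip using only the standard library. It uses `str.format` in place of LangChain's `PromptTemplate`, a shortened stand-in template, and a hypothetical `ModelGuardrailError` class standing in for `exceptions.ModelGuardrailException`.

```python
# Shortened stand-in for process_text_template.txt (assumption: same placeholders).
PROCESS_TEXT_TEMPLATE = (
    'Content to analyze:\n'
    '{content}\n'
    '\n'
    'User prompt: {prompt}\n'
)


class ModelGuardrailError(Exception):
    """Hypothetical stand-in for dataall's ModelGuardrailException."""


def render_prompt(template: str, content: str, prompt: str) -> str:
    # Fill the {content} and {prompt} placeholders of the template.
    return template.format(content=content, prompt=prompt)


def check_response(response: str) -> str:
    # The template tells the model to prefix refusals with "Error:",
    # so the caller only needs a prefix test to detect them.
    if response.startswith('Error:'):
        raise ModelGuardrailError(response)
    return response


print(render_prompt(PROCESS_TEXT_TEMPLATE, 'Quarterly report text', 'Summarize the report'))
```

Because the caller only tests the `Error:` prefix, the wording after the prefix can change in the template without touching the Python code.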
Original file line number Diff line number Diff line change
@@ -0,0 +1,38 @@
You will be given the name of an AWS Glue Database, metadata from one or more AWS Glue Table(s) and a user prompt.

Based on this information your job is to turn the prompt into a SQL query that will be sent to query the data within the tables in Amazon Athena.

Take the following points into consideration. It is crucial that you follow them:

- I only want you to return the SQL needed (NO EXPLANATION or anything else).

- Tables are referenced in the form 'database_name.table_name' (for example 'SELECT * FROM database_name.table_name ...' and not 'SELECT * FROM table_name ...'), because the table name is not a global variable and cannot be accessed directly.

- Take relations between tables into consideration, for example if you have a table with columns that might reference the other tables, you would need to join them in the query.

- The generated SQL statement MUST be read-only (no WRITE, INSERT, ALTER or DELETE keywords).

- Answer in the same form as the examples given below.

Examples:
{examples}


I want you to follow the following steps when generating the SQL statement:

Step 1: Determine if the given tables' columns are suitable to answer the question.
If not, respond with "Error: The tables provided do not give enough information"

Step 2: Determine if the user wants to perform any mutations, if so return "Error: Only READ queries are allowed"

Step 3: Determine if joins will be needed.

Step 4: Generate the SQL in order to solve the problem.


Based on the following metadata:
{context}


User prompt: {prompt}
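The template asks the model itself to refuse mutations (Step 2). A caller could also apply a defensive check to the generated SQL before sending it to Athena. The sketch below is an assumption, not something this commit implements: the function name `looks_read_only` and the keyword list are hypothetical, and the check is deliberately conservative, so it may reject a valid query whose string literal contains a semicolon or one of the keywords.

```python
import re

# Keywords treated as mutations in this sketch; the list is an assumption.
FORBIDDEN_KEYWORDS = (
    'INSERT', 'UPDATE', 'DELETE', 'ALTER', 'DROP', 'CREATE', 'MERGE', 'TRUNCATE',
)
_FORBIDDEN_RE = re.compile(
    r'\b(' + '|'.join(FORBIDDEN_KEYWORDS) + r')\b', re.IGNORECASE
)
_READ_START_RE = re.compile(r'(SELECT|WITH)\b', re.IGNORECASE)


def looks_read_only(sql: str) -> bool:
    """Return True if the statement looks like a single read-only query."""
    statement = sql.strip().rstrip(';').strip()
    if ';' in statement:
        return False  # more than one statement (or a ';' inside a literal)
    if not _READ_START_RE.match(statement):
        return False  # must start with SELECT or WITH
    return _FORBIDDEN_RE.search(statement) is None


print(looks_read_only('SELECT AVG(price) FROM dataall_homes.listings;'))
print(looks_read_only('DELETE FROM dataall_homes.listings'))
```

Word boundaries keep column names such as `created_at` or `update_time` from being mistaken for keywords.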

Original file line number Diff line number Diff line change
@@ -0,0 +1,49 @@
Example 1.
User prompt: I want to get the average area of all listings

Context: Based on the following metadata
Database Name : dataall_homes_11p3uu8f
Table Name: listings
Column Metadata: [{'Name': 'price', 'Type': 'bigint'}, {'Name': 'area', 'Type': 'bigint'}, {'Name': 'bedrooms', 'Type': 'bigint'}, {'Name': 'bathrooms', 'Type': 'bigint'}, {'Name': 'stories', 'Type': 'bigint'}, {'Name': 'mainroad', 'Type': 'string'}, {'Name': 'guestroom', 'Type': 'string'}, {'Name': 'basement', 'Type': 'string'}, {'Name': 'hotwaterheating', 'Type': 'string'}, {'Name': 'airconditioning', 'Type': 'string'}, {'Name': 'parking', 'Type': 'bigint'}, {'Name': 'prefarea', 'Type': 'string'}, {'Name': 'furnishingstatus', 'Type': 'string'}, {'Name': 'passengerid', 'Type': 'bigint'}, {'Name': 'survived', 'Type': 'bigint'}, {'Name': 'pclass', 'Type': 'bigint'}, {'Name': 'name', 'Type': 'string'}, {'Name': 'sex', 'Type': 'string'}, {'Name': 'age', 'Type': 'double'}, {'Name': 'sibsp', 'Type': 'bigint'}, {'Name': 'parch', 'Type': 'bigint'}, {'Name': 'ticket', 'Type': 'string'}, {'Name': 'fare', 'Type': 'double'}, {'Name': 'cabin', 'Type': 'string'}, {'Name': 'embarked', 'Type': 'string'}]
Partition Metadata: []

Response: SELECT AVG(CAST(area AS DOUBLE)) FROM dataall_homes_11p3uu8f.listings WHERE area IS NOT NULL;


Example 2.
User prompt: I want to get the average of the 3 most expensive listings with less than 3 bedrooms

Context: Based on the following metadata
Database Name : dataall_homes_11p3uu8f
Table Name: listings
Column Metadata: [{'Name': 'price', 'Type': 'bigint'}, {'Name': 'area', 'Type': 'bigint'}, {'Name': 'bedrooms', 'Type': 'bigint'}, {'Name': 'bathrooms', 'Type': 'bigint'}, {'Name': 'stories', 'Type': 'bigint'}, {'Name': 'mainroad', 'Type': 'string'}, {'Name': 'guestroom', 'Type': 'string'}, {'Name': 'basement', 'Type': 'string'}, {'Name': 'hotwaterheating', 'Type': 'string'}, {'Name': 'airconditioning', 'Type': 'string'}, {'Name': 'parking', 'Type': 'bigint'}, {'Name': 'prefarea', 'Type': 'string'}, {'Name': 'furnishingstatus', 'Type': 'string'}, {'Name': 'passengerid', 'Type': 'bigint'}, {'Name': 'survived', 'Type': 'bigint'}, {'Name': 'pclass', 'Type': 'bigint'}, {'Name': 'name', 'Type': 'string'}, {'Name': 'sex', 'Type': 'string'}, {'Name': 'age', 'Type': 'double'}, {'Name': 'sibsp', 'Type': 'bigint'}, {'Name': 'parch', 'Type': 'bigint'}, {'Name': 'ticket', 'Type': 'string'}, {'Name': 'fare', 'Type': 'double'}, {'Name': 'cabin', 'Type': 'string'}, {'Name': 'embarked', 'Type': 'string'}]
Partition Metadata: []

Response: SELECT AVG(price) AS average_price FROM (SELECT price FROM dataall_homes_11p3uu8f.listings WHERE bedrooms < 3 ORDER BY price DESC LIMIT 3);


Example 3.
User prompt: I want to see if any letter has been sent from 900 Somerville Avenue to 2 Finnigan Street and what is the content

Context: Based on the following metadata
Database Name : dataall_packages_omf768qq
Table name: packages
Column Metadata: [{'Name': 'id', 'Type': 'bigint'}, {'Name': 'contents', 'Type': 'string'}, {'Name': 'from_address_id', 'Type': 'bigint'}, {'Name': 'to_address_id', 'Type': 'bigint'}]
Partition Metadata: []

Database Name : dataall_packages_omf768qq
Table name: addresses
Column Metadata: [{'Name': 'id', 'Type': 'bigint'}, {'Name': 'address', 'Type': 'string'}, {'Name': 'type', 'Type': 'string'}]
Partition Metadata: []

Database Name : dataall_packages_omf768qq
Table name: drivers
Column Metadata: [{'Name': 'id', 'Type': 'bigint'}, {'Name': 'name', 'Type': 'string'}]
Partition Metadata: []

Database Name : dataall_packages_omf768qq
Table name: scans
Column Metadata: [{'Name': 'id', 'Type': 'bigint'}, {'Name': 'driver_id', 'Type': 'bigint'}, {'Name': 'package_id', 'Type': 'bigint'}, {'Name': 'address_id', 'Type': 'bigint'}, {'Name': 'action', 'Type': 'string'}, {'Name': 'timestamp', 'Type': 'string'}]
Partition Metadata: []

Response: SELECT p.contents FROM dataall_packages_omf768qq.packages p JOIN dataall_packages_omf768qq.addresses a1 ON p.from_address_id = a1.id JOIN dataall_packages_omf768qq.addresses a2 ON p.to_address_id = a2.id WHERE a1.address = '900 Somerville Avenue' AND a2.address = '2 Finnigan Street';
4 changes: 2 additions & 2 deletions backend/dataall/modules/worksheets/aws/glue_client.py
Original file line number Diff line number Diff line change
@@ -25,8 +25,8 @@ def get_table_metadata(self, database, table_name):
column_metadata = table_metadata['Table']['StorageDescriptor']['Columns']
partition_metadata = table_metadata['Table']['PartitionKeys']
meta_data = f"""
Database name: {database}
Table name: {table_name}
Database Name: {database}
Table Name: {table_name}
Column Metadata: {column_metadata}
Partition Metadata: {partition_metadata}
"""
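The label change in this hunk ("Database Name", "Table Name") appears intended to match the casing used in the few-shot examples file. As a sketch of what the resulting metadata string looks like, the function below mirrors the f-string from `get_table_metadata` but takes the Glue response as a plain dictionary. The function name `format_table_metadata` and the sample response are illustrative assumptions, and only the keys the hunk itself reads are included.

```python
def format_table_metadata(database: str, table_name: str, table_metadata: dict) -> str:
    # Same keys the hunk reads from the Glue get_table response.
    column_metadata = table_metadata['Table']['StorageDescriptor']['Columns']
    partition_metadata = table_metadata['Table']['PartitionKeys']
    return (
        f'Database Name: {database}\n'
        f'Table Name: {table_name}\n'
        f'Column Metadata: {column_metadata}\n'
        f'Partition Metadata: {partition_metadata}\n'
    )


# Illustrative response fragment, not real Glue output.
sample_response = {
    'Table': {
        'StorageDescriptor': {'Columns': [{'Name': 'price', 'Type': 'bigint'}]},
        'PartitionKeys': [],
    }
}

print(format_table_metadata('dataall_homes', 'listings', sample_response))
```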
Original file line number Diff line number Diff line change
@@ -9,9 +9,8 @@ import {
Typography
} from '@mui/material';
import PropTypes from 'prop-types';
import { Label } from 'design';
import { Label, UserModal } from 'design';
import { isFeatureEnabled } from 'utils';
import { UserModal } from 'design';
import { useState } from 'react';

export const DatasetGovernance = (props) => {
Original file line number Diff line number Diff line change
@@ -1,8 +1,7 @@
import React, { useState } from 'react';
import { Box, Grid } from '@mui/material';
import PropTypes from 'prop-types';
import { ObjectBrief, ObjectMetadata } from 'design';
import { UserModal } from 'design';
import { ObjectBrief, ObjectMetadata, UserModal } from 'design';
import { EnvironmentConsoleAccess } from './EnvironmentConsoleAccess';
import { EnvironmentFeatures } from './EnvironmentFeatures';

Original file line number Diff line number Diff line change
@@ -1,8 +1,7 @@
import React, { useState } from 'react';
import { Box, Grid } from '@mui/material';
import PropTypes from 'prop-types';
import { ObjectBrief, ObjectMetadata } from 'design';
import { UserModal } from 'design';
import { ObjectBrief, ObjectMetadata, UserModal } from 'design';

export const OrganizationOverview = (props) => {
const { organization, ...other } = props;
33 changes: 0 additions & 33 deletions frontend/src/modules/Worksheets/components/TextDisplay.js

This file was deleted.

Original file line number Diff line number Diff line change
@@ -210,6 +210,12 @@ export const WorksheetTextToSQLEditor = ({
</Box>
<Box sx={{ p: 2 }}>
<LoadingButton
disabled={
!currentEnv ||
!selectedDatabase ||
!selectedTables.length ||
!prompt
}
loading={invoking}
variant="contained"
onClick={handleSubmit}
1 change: 0 additions & 1 deletion frontend/src/modules/Worksheets/components/index.js
Original file line number Diff line number Diff line change
@@ -2,7 +2,6 @@ export * from './SQLQueryEditor';
export * from './WorksheetEditFormModal';
export * from './WorksheetListItem';
export * from './WorksheetResult';
export * from './TextDisplay';
export * from './WorksheetSQLEditor';
export * from './WorksheetTextToSQLEditor';
export * from './WorksheetDocAnalyzer';
2 changes: 1 addition & 1 deletion frontend/src/modules/Worksheets/views/WorksheetView.js
Original file line number Diff line number Diff line change
@@ -57,7 +57,7 @@ const tabs = [
active: true
},
{
label: 'TextToSQL Editor',
label: 'Text-To-SQL Editor',
value: 'TextToSQL',
active: isFeatureEnabled('worksheets', 'nlq')
},
