Feature/prompt to generate tests #16 (Draft)
sungchun12 wants to merge 22 commits into main from feature/prompt-to-generate-tests

Commits (22):
3a69477 first cut (sungchun12)
3112cea full parity (sungchun12)
d79883c remove old file (sungchun12)
98409c9 fst start works (sungchun12)
c2a0421 Merge branch 'main' of https://github.com/sungchun12/fst into refacto… (sungchun12)
1afb04c pollingobserver refactor (sungchun12)
9b6344b better debounce (sungchun12)
aa5fee6 simple fst start only (sungchun12)
d8f3db0 cleaner imports (sungchun12)
bec6391 add path option (sungchun12)
00e0542 rename file and imports (sungchun12)
1a809ab less words (sungchun12)
a452ee8 remove unused stuff (sungchun12)
9b3ad3e robust test runs (sungchun12)
353b0a7 robust path flagging (sungchun12)
ca1dddf remove buffer (sungchun12)
21706c3 update logs (sungchun12)
bafdcd6 Add note on running dbt build to README (xtomflo)
a0e3933 Add test finding method and add prompt to ask if tests should be gene… (xtomflo)
8ad5666 Add Flag setting for running tests (xtomflo)
a141f18 Fix duplicating .yml definitions for models. When schema.yml has des… (xtomflo)
0286c1b Add model name getter (xtomflo)
New file, likely fst/config_defaults.py (the scrape dropped the file names; this one is inferred from the `from fst.config_defaults import …` statements in the other diffs):

```python
import os
import yaml

CURRENT_WORKING_DIR = os.getcwd()
DISABLE_TESTS = False

# Load profiles.yml only once
profiles_path = os.path.join(CURRENT_WORKING_DIR, "profiles.yml")
with open(profiles_path, "r") as file:
    PROFILES = yaml.safe_load(file)
```
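The `PROFILES` constant is consumed by `get_duckdb_file_path` further down, which indexes `PROFILES["jaffle_shop"]["outputs"][target]["path"]`. That implies a profiles.yml of roughly this shape (an illustrative sketch; the concrete values are assumptions, not taken from the PR):

```yaml
# Illustrative only: shape assumed by get_duckdb_file_path
jaffle_shop:
  target: dev
  outputs:
    dev:
      type: duckdb
      path: jaffle_shop.duckdb
```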
New file, likely fst/db_utils.py (inferred from the `from fst.db_utils import get_project_name` statement in the later diff):

```python
import duckdb
import os
from functools import lru_cache
from fst.config_defaults import PROFILES
import logging

logger = logging.getLogger(__name__)

@lru_cache
def execute_query(query: str, db_file: str):
    connection = duckdb.connect(database=db_file, read_only=False)
    result = connection.execute(query).fetchmany(5)
    column_names = [desc[0] for desc in connection.description]
    connection.close()
    return result, column_names

@lru_cache
def get_duckdb_file_path():
    target = PROFILES["jaffle_shop"]["target"]
    db_path = PROFILES["jaffle_shop"]["outputs"][target]["path"]
    return db_path

@lru_cache
def get_project_name():
    project_name = list(PROFILES.keys())[0]
    logger.info(f"project_name: {project_name}")
    return project_name
```
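One caveat worth flagging in review: `@lru_cache` on `execute_query` keys results only on the query string and file path, so repeated calls return the first result even after the DuckDB file changes on disk. A minimal stdlib sketch of that behaviour and the `cache_clear` escape hatch (the function name here is hypothetical, not from the PR):

```python
from functools import lru_cache

calls = []

@lru_cache
def cached_query(sql: str) -> str:
    # Stand-in for execute_query: record every real invocation.
    calls.append(sql)
    return f"rows for {sql}"

cached_query("select 1")
cached_query("select 1")  # served from the cache; no second invocation
assert calls == ["select 1"]

# If the underlying database file changes, the cache must be reset by hand.
cached_query.cache_clear()
cached_query("select 1")
assert calls == ["select 1", "select 1"]
```

The bare `@lru_cache` form (no parentheses) requires Python 3.8 or newer.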
New file (the scrape dropped the file name; a polling file-watcher module):

```python
import os
import time
from watchdog.observers.polling import PollingObserver
import logging

logger = logging.getLogger(__name__)

observer = None

def watch_directory(event_handler, file_path: str):
    global observer
    logger.info(f"Started watching directory: {file_path}")
    observer = PollingObserver()
    observer.schedule(event_handler, path=file_path, recursive=False)
    observer.start()

    try:
        while True:
            time.sleep(1)
    except KeyboardInterrupt:
        observer.stop()
        observer.join()
        logger.info(f"Stopped watching directory: {file_path}")
```
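The commit history mentions a "better debounce", though that code is not shown in this chunk. A leading-edge, time-based debounce for a polling watcher can be sketched with the stdlib alone (all names below are hypothetical illustrations, not the PR's implementation):

```python
import time

def debounced(fn, wait_seconds: float):
    """Wrap fn so calls arriving within wait_seconds of the last
    accepted call are dropped (leading-edge debounce)."""
    last_accepted = [float("-inf")]  # mutable cell captured by the closure

    def wrapper(*args, **kwargs):
        now = time.monotonic()
        if now - last_accepted[0] >= wait_seconds:
            last_accepted[0] = now
            return fn(*args, **kwargs)
        return None  # call suppressed

    return wrapper

events = []
on_change = debounced(events.append, 0.05)
on_change("modified")   # accepted
on_change("modified")   # dropped: inside the 50 ms window
time.sleep(0.06)
on_change("modified")   # accepted again after the window elapses
assert len(events) == 2
```

This suppresses the bursts of duplicate filesystem events that polling observers typically emit for a single save.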
New file (the scrape dropped the file name; model and schema utilities). Path helpers first:

```python
import os
import yaml
import logging
import re
from fst.config_defaults import CURRENT_WORKING_DIR
from fst.db_utils import get_project_name

logger = logging.getLogger(__name__)

def get_active_file(file_path: str):
    if file_path and file_path.endswith(".sql"):
        return file_path
    else:
        logger.warning("No active SQL file found.")
        return None

def find_compiled_sql_file(file_path):
    active_file = get_active_file(file_path)
    if not active_file:
        return None
    project_directory = CURRENT_WORKING_DIR
    project_name = get_project_name()
    relative_file_path = os.path.relpath(active_file, project_directory)
    compiled_directory = os.path.join(
        project_directory, "target", "compiled", project_name
    )
    compiled_file_path = os.path.join(compiled_directory, relative_file_path)
    return compiled_file_path if os.path.exists(compiled_file_path) else None
```
The test-discovery helper. Note two fixes from the diff as scraped: `tests_data` must be a list (the code calls `.append` on it), and the default for `columns` should be a list, not a dict:

```python
def find_tests_for_model(model_name, directory='models'):
    """
    Check if tests are generated for a given model in a dbt project.

    Args:
        model_name (str): The name of the model to search for tests.
        directory (str, optional): The root directory to start the search. Defaults to 'models'.

    Returns:
        list: One dict per tested column, holding the schema file path,
        the column name, and the tests found.
    """
    tests_data = []

    for root, _, files in os.walk(directory):
        for file in files:
            if file.endswith(('.schema.yml', '.yml')):
                filepath = os.path.join(root, file)
                with open(filepath, 'r') as f:
                    schema_data = yaml.safe_load(f)

                # Guard against empty YAML files, which load as None
                for model in (schema_data or {}).get('models', []):
                    if model['name'] == model_name:
                        columns = model.get('columns', [])
                        for column_data in columns:
                            column_name = column_data['name']
                            tests = column_data.get('tests', [])
                            if tests:
                                tests_data.append({'file': filepath, 'column': column_name, 'tests': tests})

    return tests_data
```
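For readers of the later functions: `generate_test_yaml` filters `tests_data` by the `'column'` key, so the list built above is expected to look like this (file path and values are illustrative, not from the PR):

```python
# Illustrative shape of find_tests_for_model's return value
tests_data = [
    {"file": "models/staging/schema.yml", "column": "customer_id", "tests": ["unique", "not_null"]},
]

# The per-column lookup generate_test_yaml performs:
existing = [d for d in tests_data if d["column"] == "customer_id"]
assert existing and existing[0]["tests"] == ["unique", "not_null"]

# Side note: the '.yml' suffix check above already covers '.schema.yml'
assert "models/stg.schema.yml".endswith((".schema.yml", ".yml"))
```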
The model-name getter (the diff as scraped repeats `import yaml`, `import re`, and `import os` here mid-file; they are already imported at the top and the duplicates are dropped):

```python
def get_model_name_from_file(file_path: str):
    project_directory = CURRENT_WORKING_DIR
    models_directory = os.path.join(project_directory, "models")
    relative_file_path = os.path.relpath(file_path, models_directory)
    model_name, _ = os.path.splitext(relative_file_path)
    return model_name.replace(os.sep, ".")
```
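`get_model_name_from_file` is pure path manipulation, so its behaviour is easy to pin down with a self-contained copy (the example paths are made up):

```python
import os

def model_name_from_path(file_path: str, models_directory: str) -> str:
    # Mirrors get_model_name_from_file: strip the models/ prefix and the
    # .sql extension, then join nested directories with dots.
    relative = os.path.relpath(file_path, models_directory)
    name, _ = os.path.splitext(relative)
    return name.replace(os.sep, ".")

assert model_name_from_path("/proj/models/staging/stg_orders.sql", "/proj/models") == "staging.stg_orders"
assert model_name_from_path("/proj/models/customers.sql", "/proj/models") == "customers"
```

Nested models thus come back dotted (`staging.stg_orders`), which callers comparing against plain dbt model names may need to account for.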
The YAML generator. As scraped, the diff returned `schema_yml_path` from inside the column loop (so only the first column was ever processed), never wrote the merged `yaml_files` updates back to disk, and could raise `StopIteration` when no tests were found; the version below defers all writes and returns until after the loop:

```python
def generate_test_yaml(model_name, column_names, active_file_path, tests_data):
    yaml_files = {}
    schema_yml_path = os.path.join(os.path.dirname(active_file_path), "schema.yml")
    schema_yml_data = None

    for column in column_names:
        tests_to_add = []
        if re.search(r"(_id|_ID)$", column):
            tests_to_add = ["unique", "not_null"]

        # Check if tests for this column already exist
        existing_tests = [data for data in tests_data if data['column'] == column]

        if existing_tests:
            # Merge the new tests into the already-loaded YAML data
            for test_data in existing_tests:
                yaml_file = test_data['file']
                if yaml_file not in yaml_files:
                    with open(yaml_file, 'r') as f:
                        yaml_files[yaml_file] = yaml.safe_load(f)

                models = yaml_files[yaml_file].get('models', [])
                for model in models:
                    if model['name'] == model_name:
                        columns = model.get('columns', [])
                        for existing_column in columns:
                            if existing_column['name'] == column:
                                tests = existing_column.get('tests', [])
                                for test in tests_to_add:
                                    if test not in tests:
                                        tests.append(test)
                                existing_column['tests'] = tests
        else:
            # If no tests exist, add the column to the schema.yml data
            if schema_yml_data is None and os.path.exists(schema_yml_path):
                with open(schema_yml_path, "r") as f:
                    schema_yml_data = yaml.safe_load(f)

            if schema_yml_data is not None:
                for model in schema_yml_data.get("models", []):
                    if model["name"] == model_name:
                        if "columns" not in model:
                            model["columns"] = []
                        new_column = {
                            "name": column,
                            "description": f"A placeholder description for {column}",
                            "tests": tests_to_add,
                        }
                        model["columns"].append(new_column)
                        break

    # Persist any merged tests back to their schema files
    for yaml_file, data in yaml_files.items():
        with open(yaml_file, "w") as f:
            yaml.dump(data, f)

    # Write schema.yml once, after every column has been processed
    if schema_yml_data is not None:
        with open(schema_yml_path, "w") as f:
            yaml.dump(schema_yml_data, f)
        return schema_yml_path

    # Otherwise return the first file path where tests were found, if any
    return next(iter(yaml_files), None)
```
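The column heuristic hangs on the regex `(_id|_ID)$`, which is worth poking at directly: it matches all-lower and all-upper suffixes but misses mixed case such as `_Id` (a standalone sketch; `default_tests_for` is a hypothetical helper, not in the PR):

```python
import re

def default_tests_for(column: str) -> list:
    # Same heuristic as generate_test_yaml: id-like columns get
    # unique + not_null tests by default.
    if re.search(r"(_id|_ID)$", column):
        return ["unique", "not_null"]
    return []

assert default_tests_for("customer_id") == ["unique", "not_null"]
assert default_tests_for("ORDER_ID") == ["unique", "not_null"]
assert default_tests_for("amount") == []
assert default_tests_for("order_Id") == []  # mixed case slips through
```

If mixed-case suffixes should also qualify, `re.search(r"_id$", column, re.IGNORECASE)` would be the tighter pattern.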
Project-layout helpers (the scraped diff called `dbt_project.get('model-paths')[0]`, which raises a `TypeError` when the key is absent; a default is supplied below):

```python
def get_model_paths():
    with open("dbt_project.yml", "r") as file:
        dbt_project = yaml.safe_load(file)
        model_paths = dbt_project.get("model-paths", [])
        return [
            os.path.join(os.getcwd(), path) for path in model_paths
        ]

def get_models_directory(project_dir):
    dbt_project_file = os.path.join(project_dir, 'dbt_project.yml')
    with open(dbt_project_file, 'r') as file:
        dbt_project = yaml.safe_load(file)
        models_subdir = dbt_project.get('model-paths', ['models'])[0]  # dbt defaults model-paths to ['models']
        return os.path.join(project_dir, models_subdir)
```
Review discussion:

- "There are more efficient ways to do this. Example code from GPT."
- "That's a good idea to separate the methods, indeed! I've been a bit wary of overusing GPT's code directly, since it tends to be spaghetti-complicated, but I think we'll always be able to handle it when there are issues."