This repository has been archived by the owner on May 17, 2024. It is now read-only.
Describe the bug
I'm trying to get JSON output from a --dbt run that uses a state file. The run works fine without the --json flag, but when I add --json it gets stuck and the process never finishes.
Make sure to include the following (minus sensitive information):
The command or code you used
data-diff --dbt --state prod-run-artifacts/manifest.json --json -d
The run output + error you're getting. (including stack trace)
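For anyone trying to reproduce the hang without blocking a terminal or CI job indefinitely, here is a minimal sketch that runs a command under a time limit and reports whether it finished. The helper name `run_with_timeout` and the choice of timeout are my own assumptions, not part of data-diff; only Python's standard `subprocess` module is used.

```python
import subprocess


def run_with_timeout(cmd, timeout_s):
    """Run cmd and return (finished, output).

    finished is False if the process was still running after timeout_s
    seconds; in that case subprocess.run kills it before raising.
    """
    try:
        proc = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # logs and JSON end up in one stream
            text=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired as exc:
        out = exc.stdout or ""
        # Depending on the Python version, partial output may arrive as bytes.
        if isinstance(out, bytes):
            out = out.decode(errors="replace")
        return False, out
    return True, proc.stdout
```

With data-diff installed, the command from this report could be passed as `["data-diff", "--dbt", "--state", "prod-run-artifacts/manifest.json", "--json", "-d"]`; a `False` first element would then suggest the hang described here, though a slow but healthy run would look the same if the timeout is too short.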
Running with data-diff=0.11.1
15:44:30 INFO Parsing file dbt_project.yml dbt_parser.py:287
INFO Parsing file /dbt_project/target/manifest.json dbt_parser.py:280
INFO Parsing file prod-run-artifacts/manifest.json dbt_parser.py:280
INFO Parsing file target/run_results.json dbt_parser.py:253
INFO config: prod_database=None prod_schema=None prod_custom_schema=None datasource_id=None dbt_parser.py:159
INFO Parsing file /dbt_project/profiles.yml dbt_parser.py:294
DEBUG Found no PKs dbt_parser.py:465
{"status": "failed", "model": "model.dbt.bi_dagster_asset", "dataset1": ["data-prod", "prod_observability", "bi_dagster_asset"], "dataset2": ["data-prod", "dbt_pr_test_ci_observability", "bi_dagster_asset"], "error": "No primary key found. Add uniqueness tests, meta, or tags.", "version": "1.0.0"}
DEBUG Found PKs via Uniqueness tests [fct_tbl_info]: {'col_id'} dbt_parser.py:459
DEBUG Found PKs via Uniqueness tests [int_table]: {'col_id'} dbt_parser.py:459
DEBUG Found no PKs dbt_parser.py:465
{"status": "failed", "model": "model.dbt.dim_latest_email_table", "dataset1": ["data-prod", "prod_schema", "dim_latest_email_table"], "dataset2": ["data-prod", "dbt_pr_test_ci_schema", "dim_latest_email_table"], "error": "No primary key found. Add uniqueness tests, meta, or tags.", "version": "1.0.0"}
DEBUG Database 'BigQuery(default_schema='dev', _interactive=False, is_closed=False, _dialect=Dialect(_prevent_overflow_when_concat=False), project='data-dev', dataset='dev', _client=<google.cloud.bigquery.client.Client object at 0x10xxxx>)' does not allow setting timezone. We recommend making sure it's set to 'UTC'. _connect.py:300
DEBUG Database 'BigQuery(default_schema='dev', _interactive=False, is_closed=False, _dialect=Dialect(_prevent_overflow_when_concat=False), project='data-dev', dataset='dev', _client=<google.cloud.bigquery.client.Client object at 0x12xxxx>)' does not allow setting timezone. We recommend making sure it's set to 'UTC'. _connect.py:300
DEBUG Running SQL (BigQuery): ('data-prod', 'prod_schema', 'fct_tbl_info') base.py:980
SELECT column_name, data_type, 6 as datetime_precision, 38 as numeric_precision, 9 as numeric_scale FROM `data-prod`.`prod_schema`.INFORMATION_SCHEMA.COLUMNS WHERE table_name = 'fct_tbl_info' AND table_schema = 'prod_schema'
DEBUG Running SQL (BigQuery): ('data-prod', 'prod_schema', 'int_table') base.py:980
SELECT column_name, data_type, 6 as datetime_precision, 38 as numeric_precision, 9 as numeric_scale FROM `data-prod`.`prod_schema`.INFORMATION_SCHEMA.COLUMNS WHERE table_name = 'int_table' AND table_schema = 'prod_schema'
15:44:32 DEBUG Running SQL (BigQuery): ('data-prod', 'dbt_pr_test_ci_schema', 'int_table') base.py:980
SELECT column_name, data_type, 6 as datetime_precision, 38 as numeric_precision, 9 as numeric_scale FROM `data-prod`.`dbt_pr_test_ci_schema`.INFORMATION_SCHEMA.COLUMNS WHERE table_name = 'int_table' AND table_schema = 'dbt_pr_test_ci_schema'
DEBUG Running SQL (BigQuery): ('data-prod', 'dbt_pr_test_ci_schema', 'fct_tbl_info') base.py:980
SELECT column_name, data_type, 6 as datetime_precision, 38 as numeric_precision, 9 as numeric_scale FROM `data-prod`.`dbt_pr_test_ci_schema`.INFORMATION_SCHEMA.COLUMNS WHERE table_name = 'fct_tbl_info' AND table_schema = 'dbt_pr_test_ci_schema'
15:44:33 DEBUG Running SQL (BigQuery): ('data-prod', 'prod_schema', 'int_table') base.py:980
SELECT column_name, data_type, 6 as datetime_precision, 38 as numeric_precision, 9 as numeric_scale FROM `data-prod`.`prod_schema`.INFORMATION_SCHEMA.COLUMNS WHERE table_name = 'int_table' AND table_schema = 'prod_schema'
DEBUG Running SQL (BigQuery): ('data-prod', 'prod_schema', 'fct_tbl_info') base.py:980
SELECT column_name, data_type, 6 as datetime_precision, 38 as numeric_precision, 9 as numeric_scale FROM `data-prod`.`prod_schema`.INFORMATION_SCHEMA.COLUMNS WHERE table_name = 'fct_tbl_info' AND table_schema = 'prod_schema'
DEBUG Running SQL (BigQuery): ('data-prod', 'prod_schema', 'int_table') base.py:980
SELECT * FROM (SELECT TRIM(`sf_id`), TRIM(`col_name`), TRIM(`col_type`), TRIM(`col_mtd`), TRIM(`col_pl`) FROM `data-prod`.`prod_schema`.`int_table`) AS LIMITED_SELECT LIMIT 64
15:44:34 DEBUG Running SQL (BigQuery): ('data-prod', 'prod_schema', 'fct_tbl_info') base.py:980
SELECT * FROM (SELECT TRIM(`unit`) FROM `data-prod`.`prod_schema`.`fct_tbl_info`) AS LIMITED_SELECT LIMIT 64
..... Cut because text gets too long ....
DEBUG Done collecting stats for table #2: ('data-prod', 'dbt_pr_test_ci_schema', 'int_table') joindiff_tables.py:306
DEBUG Testing for null keys: ('data-prod', 'prod_schema', 'int_table') <> ('data-prod', 'dbt_pr_test_ci_schema', 'int_table') joindiff_tables.py:252
DEBUG Running SQL (BigQuery): ('data-prod', 'prod_schema', 'int_table') base.py:980
SELECT `col_id` FROM `data-prod`.`prod_schema`.`int_table` WHERE (`col_id` IS NULL)
DEBUG Done collecting stats for table #2: ('data-prod', 'dbt_pr_test_ci_schema', 'fct_tbl_info') joindiff_tables.py:306
DEBUG Testing for null keys: ('data-prod', 'prod_schema', 'fct_tbl_info') <> ('data-prod', 'dbt_pr_test_ci_schema', 'fct_tbl_info') joindiff_tables.py:252
DEBUG Running SQL (BigQuery): ('data-prod', 'prod_schema', 'fct_tbl_info') base.py:980
SELECT `col_id` FROM `data-prod`.`prod_schema`.`fct_tbl_info` WHERE (`col_id` IS NULL)
15:44:38 DEBUG Running SQL (BigQuery): ('data-prod', 'dbt_pr_test_ci_schema', 'int_table') base.py:980
SELECT `col_id` FROM `data-prod`.`dbt_pr_test_ci_schema`.`int_table` WHERE (`col_id` IS NULL)
DEBUG Running SQL (BigQuery): ('data-prod', 'dbt_pr_test_ci_schema', 'fct_tbl_info') base.py:980
SELECT `col_id` FROM `data-prod`.`dbt_pr_test_ci_schema`.`fct_tbl_info` WHERE (`col_id` IS NULL)
15:44:39 DEBUG Counting exclusive rows: ('data-prod', 'prod_schema', 'int_table') <> ('data-prod', 'dbt_pr_test_ci_schema', 'int_table') joindiff_tables.py:372
DEBUG Running SQL (BigQuery): ('data-prod', 'prod_schema', 'int_table') <> ('data-prod', 'dbt_pr_test_ci_schema', 'int_table') base.py:980
SELECT count(*) FROM (SELECT * FROM (SELECT (`tmp2`.`col_id` IS NULL) AS `is_exclusive_a`, (`tmp1`.`col_id` IS NULL) AS `is_exclusive_b`, CASE WHEN `tmp1`.`col_id` is distinct from `tmp2`.`col_id` THEN 1 ELSE 0 END AS `is_diff_col_id`, CASE WHEN `tmp1`.`col_value` is distinct from `tmp2`.`col_value` THEN 1 ELSE 0 END AS `is_diff_col_value`, CASE WHEN `tmp1`.`org_id` is distinct from `tmp2`.`org_id` THEN 1 ELSE 0
END AS `is_diff_org_id`, CASE WHEN `tmp1`.`col_combined_value` is distinct from `tmp2`.`col_combined_value` THEN 1 ELSE 0 END AS `is_diff_col_combined_value`, CASE WHEN `tmp1`.`col_mtd` is distinct from `tmp2`.`col_mtd` THEN 1 ELSE 0 END AS `is_diff_col_mtd`, CASE WHEN `tmp1`.`sf_id` is distinct from `tmp2`.`sf_id` THEN 1 ELSE 0 END AS `is_diff_sf_id`, CASE WHEN
`tmp1`.`col_ch_prob` is distinct from `tmp2`.`col_ch_prob` THEN 1 ELSE 0 END AS `is_diff_col_ch_prob`, CASE WHEN `tmp1`.`col_name` is distinct from `tmp2`.`col_name` THEN 1 ELSE 0 END AS `is_diff_col_name`, CASE WHEN `tmp1`.`col_pl` is distinct from `tmp2`.`col_pl` THEN 1 ELSE 0 END AS `is_diff_col_pl`, CASE WHEN `tmp1`.`col_type` is distinct from `tmp2`.`col_type` THEN 1 ELSE 0 END AS
`is_diff_col_type`, cast(`tmp1`.`col_id` as string) AS `col_id_a`, cast(`tmp2`.`col_id` as string) AS `col_id_b`, format('%.11f', `tmp1`.`col_value`) AS `col_value_a`, format('%.11f', `tmp2`.`col_value`) AS `col_value_b`, cast(`tmp1`.`org_id` as string) AS `org_id_a`, cast(`tmp2`.`org_id` as string) AS `org_id_b`, format('%.11f', `tmp1`.`col_combined_value`) AS `col_combined_value_a`,
format('%.11f', `tmp2`.`col_combined_value`) AS `col_combined_value_b`, cast(`tmp1`.`col_mtd` as string) AS `col_mtd_a`, cast(`tmp2`.`col_mtd` as string) AS `col_mtd_b`, cast(`tmp1`.`sf_id` as string) AS `sf_id_a`, cast(`tmp2`.`sf_id` as string) AS `sf_id_b`, format('%.11f', `tmp1`.`col_ch_prob`) AS `col_ch_prob_a`, format('%.11f', `tmp2`.`col_ch_prob`) AS
`col_ch_prob_b`, cast(`tmp1`.`col_name` as string) AS `col_name_a`, cast(`tmp2`.`col_name` as string) AS `col_name_b`, cast(`tmp1`.`col_pl` as string) AS `col_pl_a`, cast(`tmp2`.`col_pl` as string) AS `col_pl_b`, cast(`tmp1`.`col_type` as string) AS `col_type_a`, cast(`tmp2`.`col_type` as string) AS `col_type_b` FROM
`data-prod`.`prod_schema`.`int_table``tmp1` FULL OUTER JOIN `data-prod`.`dbt_pr_test_ci_schema`.`int_table``tmp2` ON (`tmp1`.`col_id` = `tmp2`.`col_id`)) tmp3 WHERE ((`is_diff_col_id` =1) OR (`is_diff_col_value` =1) OR (`is_diff_org_id` =1) OR (`is_diff_col_combined_value` =1) OR (`is_diff_col_mtd` =1) OR (`is_diff_sf_id` = 1) OR (`is_diff_col_ch_prob` = 1) OR (`is_diff_col_name` = 1) OR (`is_diff_col_pl` = 1) OR (`is_diff_col_type` = 1)) AND (`is_exclusive_a` OR `is_exclusive_b`)) tmp4
DEBUG Counting exclusive rows: ('data-prod', 'prod_schema', 'fct_tbl_info') <> ('data-prod', 'dbt_pr_test_ci_schema', 'fct_tbl_info') joindiff_tables.py:372
DEBUG Running SQL (BigQuery): ('data-prod', 'prod_schema', 'fct_tbl_info') <> ('data-prod', 'dbt_pr_test_ci_schema', 'fct_tbl_info') base.py:980
SELECT count(*) FROM (SELECT * FROM (SELECT (`tmp2`.`col_id` IS NULL) AS `is_exclusive_a`, (`tmp1`.`col_id` IS NULL) AS `is_exclusive_b`, CASE WHEN `tmp1`.`col_id` is distinct from `tmp2`.`col_id` THEN 1 ELSE 0 END AS `is_diff_col_id`, CASE WHEN `tmp1`.`org_id` is distinct from `tmp2`.`org_id` THEN 1 ELSE 0 END AS `is_diff_org_id`, CASE WHEN `tmp1`.`is_error` is distinct from `tmp2`.`is_error` THEN 1 ELSE 0 END AS
`is_diff_is_error`, CASE WHEN `tmp1`.`is_inc_error` is distinct from `tmp2`.`is_inc_error` THEN 1 ELSE 0 END AS `is_diff_is_inc_error`, CASE WHEN `tmp1`.`is_non_inc_error` is distinct from `tmp2`.`is_non_inc_error` THEN 1 ELSE 0 END AS `is_diff_is_non_inc_error`, CASE WHEN `tmp1`.`created_at` is distinct from `tmp2`.`created_at` THEN 1 ELSE 0 END AS `is_diff_created_at`, CASE WHEN `tmp1`.`is_x` is distinct from
`tmp2`.`is_x` THEN 1 ELSE 0 END AS `is_diff_is_x`, CASE WHEN `tmp1`.`is_inc` is distinct from `tmp2`.`is_inc` THEN 1 ELSE 0 END AS `is_diff_is_inc`, CASE WHEN `tmp1`.`is_x_error` is distinct from `tmp2`.`is_x_error` THEN 1 ELSE 0 END AS `is_diff_is_x_error`, CASE WHEN `tmp1`.`unit` is distinct from `tmp2`.`unit` THEN 1 ELSE 0 END AS `is_diff_unit`, CASE WHEN
`tmp1`.`is_missing_t` is distinct from `tmp2`.`is_missing_t` THEN 1 ELSE 0 END AS `is_diff_is_missing_t`, cast(`tmp1`.`col_id` as string) AS `col_id_a`, cast(`tmp2`.`col_id` as string) AS `col_id_b`, cast(`tmp1`.`org_id` as string) AS `org_id_a`, cast(`tmp2`.`org_id` as string) AS `org_id_b`, cast(cast(`tmp1`.`is_error` as int) as string) AS `is_error_a`, cast(cast(`tmp2`.`is_error` as
int) as string) AS `is_error_b`, cast(cast(`tmp1`.`is_inc_error` as int) as string) AS `is_inc_error_a`, cast(cast(`tmp2`.`is_inc_error` as int) as string) AS `is_inc_error_b`, cast(cast(`tmp1`.`is_non_inc_error` as int) as string) AS `is_non_inc_error_a`, cast(cast(`tmp2`.`is_non_inc_error` as int) as string) AS `is_non_inc_error_b`, FORMAT_TIMESTAMP('%F %H:%M:%E6S', `tmp1`.`created_at`) AS `created_at_a`,
FORMAT_TIMESTAMP('%F %H:%M:%E6S', `tmp2`.`created_at`) AS `created_at_b`, cast(cast(`tmp1`.`is_x` as int) as string) AS `is_x_a`, cast(cast(`tmp2`.`is_x` as int) as string) AS `is_x_b`, cast(cast(`tmp1`.`is_inc` as int) as string) AS `is_inc_a`, cast(cast(`tmp2`.`is_inc` as int) as string) AS `is_inc_b`, cast(cast(`tmp1`.`is_x_error` as int) as string) AS
`is_x_error_a`, cast(cast(`tmp2`.`is_x_error` as int) as string) AS `is_x_error_b`, cast(`tmp1`.`unit` as string) AS `unit_a`, cast(`tmp2`.`unit` as string) AS `unit_b`, cast(cast(`tmp1`.`is_missing_t` as int) as string) AS `is_missing_t_a`, cast(cast(`tmp2`.`is_missing_t` as int) as string) AS `is_missing_t_b` FROM
`data-prod`.`prod_schema`.`fct_tbl_info``tmp1` FULL OUTER JOIN `data-prod`.`dbt_pr_test_ci_schema`.`fct_tbl_info``tmp2` ON (`tmp1`.`col_id` = `tmp2`.`col_id`)) tmp3 WHERE ((`is_diff_col_id` =1) OR (`is_diff_org_id` =1) OR (`is_diff_is_error` =1) OR (`is_diff_is_inc_error` =1) OR (`is_diff_is_non_inc_error` =1) OR (`is_diff_created_at` =1) OR (`is_diff_is_x` = 1) OR (`is_diff_is_inc` = 1) OR (`is_diff_is_x_error` = 1) OR (`is_diff_unit` = 1) OR (`is_diff_is_missing_t` = 1)) AND (`is_exclusive_a` OR `is_exclusive_b`)) tmp4
15:44:40 DEBUG Counting differences per column: ('data-prod', 'prod_schema', 'int_table') <> ('data-prod', 'dbt_pr_test_ci_schema', 'int_table') joindiff_tables.py:346
DEBUG Running SQL (BigQuery): ('data-prod', 'prod_schema', 'int_table') <> ('data-prod', 'dbt_pr_test_ci_schema', 'int_table') base.py:980
SELECT sum(`is_diff_col_id`), sum(`is_diff_col_value`), sum(`is_diff_org_id`), sum(`is_diff_col_combined_value`), sum(`is_diff_col_mtd`), sum(`is_diff_sf_id`), sum(`is_diff_col_ch_prob`), sum(`is_diff_col_name`), sum(`is_diff_col_pl`), sum(`is_diff_col_type`) FROM (SELECT (`tmp2`.`col_id` IS NULL) AS `is_exclusive_a`, (`tmp1`.`col_id` IS NULL) AS `is_exclusive_b`, CASE WHEN
`tmp1`.`col_id` is distinct from `tmp2`.`col_id` THEN 1 ELSE 0 END AS `is_diff_col_id`, CASE WHEN `tmp1`.`col_value` is distinct from `tmp2`.`col_value` THEN 1 ELSE 0 END AS `is_diff_col_value`, CASE WHEN `tmp1`.`org_id` is distinct from `tmp2`.`org_id` THEN 1 ELSE 0 END AS `is_diff_org_id`, CASE WHEN `tmp1`.`col_combined_value` is distinct from `tmp2`.`col_combined_value` THEN 1 ELSE 0 END AS
`is_diff_col_combined_value`, CASE WHEN `tmp1`.`col_mtd` is distinct from `tmp2`.`col_mtd` THEN 1 ELSE 0 END AS `is_diff_col_mtd`, CASE WHEN `tmp1`.`sf_id` is distinct from `tmp2`.`sf_id` THEN 1 ELSE 0 END AS `is_diff_sf_id`, CASE WHEN `tmp1`.`col_ch_prob` is distinct from `tmp2`.`col_ch_prob` THEN 1 ELSE 0 END AS `is_diff_col_ch_prob`, CASE WHEN `tmp1`.`col_name` is distinct
from `tmp2`.`col_name` THEN 1 ELSE 0 END AS `is_diff_col_name`, CASE WHEN `tmp1`.`col_pl` is distinct from `tmp2`.`col_pl` THEN 1 ELSE 0 END AS `is_diff_col_pl`, CASE WHEN `tmp1`.`col_type` is distinct from `tmp2`.`col_type` THEN 1 ELSE 0 END AS `is_diff_col_type`, cast(`tmp1`.`col_id` as string) AS `col_id_a`, cast(`tmp2`.`col_id` as string) AS `col_id_b`, format('%.11f', `tmp1`.`col_value`) AS
`col_value_a`, format('%.11f', `tmp2`.`col_value`) AS `col_value_b`, cast(`tmp1`.`org_id` as string) AS `org_id_a`, cast(`tmp2`.`org_id` as string) AS `org_id_b`, format('%.11f', `tmp1`.`col_combined_value`) AS `col_combined_value_a`, format('%.11f', `tmp2`.`col_combined_value`) AS `col_combined_value_b`, cast(`tmp1`.`col_mtd` as string) AS `col_mtd_a`,
cast(`tmp2`.`col_mtd` as string) AS `col_mtd_b`, cast(`tmp1`.`sf_id` as string) AS `sf_id_a`, cast(`tmp2`.`sf_id` as string) AS `sf_id_b`, format('%.11f', `tmp1`.`col_ch_prob`) AS `col_ch_prob_a`, format('%.11f', `tmp2`.`col_ch_prob`) AS `col_ch_prob_b`, cast(`tmp1`.`col_name` as string) AS `col_name_a`, cast(`tmp2`.`col_name` as string) AS `col_name_b`,
cast(`tmp1`.`col_pl` as string) AS `col_pl_a`, cast(`tmp2`.`col_pl` as string) AS `col_pl_b`, cast(`tmp1`.`col_type` as string) AS `col_type_a`, cast(`tmp2`.`col_type` as string) AS `col_type_b` FROM `data-prod`.`prod_schema`.`int_table``tmp1` FULL OUTER JOIN
`data-prod`.`dbt_pr_test_ci_schema`.`int_table``tmp2` ON (`tmp1`.`col_id` = `tmp2`.`col_id`)) tmp3 WHERE ((`is_diff_col_id` =1) OR (`is_diff_col_value` =1) OR (`is_diff_org_id` =1) OR (`is_diff_col_combined_value` =1) OR (`is_diff_col_mtd` =1) OR (`is_diff_sf_id` =1) OR (`is_diff_col_ch_prob` =1) OR (`is_diff_col_name` =1) OR (`is_diff_col_pl` = 1) OR (`is_diff_col_type` = 1))
DEBUG Counting differences per column: ('data-prod', 'prod_schema', 'fct_tbl_info') <> ('data-prod', 'dbt_pr_test_ci_schema', 'fct_tbl_info') joindiff_tables.py:346
DEBUG Running SQL (BigQuery): ('data-prod', 'prod_schema', 'fct_tbl_info') <> ('data-prod', 'dbt_pr_test_ci_schema', 'fct_tbl_info') base.py:980
SELECT sum(`is_diff_col_id`), sum(`is_diff_org_id`), sum(`is_diff_is_error`), sum(`is_diff_is_inc_error`), sum(`is_diff_is_non_inc_error`), sum(`is_diff_created_at`), sum(`is_diff_is_x`), sum(`is_diff_is_inc`), sum(`is_diff_is_x_error`), sum(`is_diff_unit`), sum(`is_diff_is_missing_t`) FROM (SELECT (`tmp2`.`col_id` IS NULL) AS `is_exclusive_a`, (`tmp1`.`col_id` IS NULL) AS
`is_exclusive_b`, CASE WHEN `tmp1`.`col_id` is distinct from `tmp2`.`col_id` THEN 1 ELSE 0 END AS `is_diff_col_id`, CASE WHEN `tmp1`.`org_id` is distinct from `tmp2`.`org_id` THEN 1 ELSE 0 END AS `is_diff_org_id`, CASE WHEN `tmp1`.`is_error` is distinct from `tmp2`.`is_error` THEN 1 ELSE 0 END AS `is_diff_is_error`, CASE WHEN `tmp1`.`is_inc_error` is distinct from `tmp2`.`is_inc_error` THEN 1 ELSE 0 END AS
`is_diff_is_inc_error`, CASE WHEN `tmp1`.`is_non_inc_error` is distinct from `tmp2`.`is_non_inc_error` THEN 1 ELSE 0 END AS `is_diff_is_non_inc_error`, CASE WHEN `tmp1`.`created_at` is distinct from `tmp2`.`created_at` THEN 1 ELSE 0 END AS `is_diff_created_at`, CASE WHEN `tmp1`.`is_x` is distinct from `tmp2`.`is_x` THEN 1 ELSE 0 END AS `is_diff_is_x`, CASE WHEN `tmp1`.`is_inc` is distinct from
`tmp2`.`is_inc` THEN 1 ELSE 0 END AS `is_diff_is_inc`, CASE WHEN `tmp1`.`is_x_error` is distinct from `tmp2`.`is_x_error` THEN 1 ELSE 0 END AS `is_diff_is_x_error`, CASE WHEN `tmp1`.`unit` is distinct from `tmp2`.`unit` THEN 1 ELSE 0 END AS `is_diff_unit`, CASE WHEN `tmp1`.`is_missing_t` is distinct from `tmp2`.`is_missing_t` THEN 1 ELSE 0 END AS
`is_diff_is_missing_t`, cast(`tmp1`.`col_id` as string) AS `col_id_a`, cast(`tmp2`.`col_id` as string) AS `col_id_b`, cast(`tmp1`.`org_id` as string) AS `org_id_a`, cast(`tmp2`.`org_id` as string) AS `org_id_b`, cast(cast(`tmp1`.`is_error` as int) as string) AS `is_error_a`, cast(cast(`tmp2`.`is_error` as int) as string) AS `is_error_b`, cast(cast(`tmp1`.`is_inc_error` as int) as string) AS
`is_inc_error_a`, cast(cast(`tmp2`.`is_inc_error` as int) as string) AS `is_inc_error_b`, cast(cast(`tmp1`.`is_non_inc_error` as int) as string) AS `is_non_inc_error_a`, cast(cast(`tmp2`.`is_non_inc_error` as int) as string) AS `is_non_inc_error_b`, FORMAT_TIMESTAMP('%F %H:%M:%E6S', `tmp1`.`created_at`) AS `created_at_a`, FORMAT_TIMESTAMP('%F %H:%M:%E6S', `tmp2`.`created_at`) AS `created_at_b`,
cast(cast(`tmp1`.`is_x` as int) as string) AS `is_x_a`, cast(cast(`tmp2`.`is_x` as int) as string) AS `is_x_b`, cast(cast(`tmp1`.`is_inc` as int) as string) AS `is_inc_a`, cast(cast(`tmp2`.`is_inc` as int) as string) AS `is_inc_b`, cast(cast(`tmp1`.`is_x_error` as int) as string) AS `is_x_error_a`, cast(cast(`tmp2`.`is_x_error` as int) as string) AS
`is_x_error_b`, cast(`tmp1`.`unit` as string) AS `unit_a`, cast(`tmp2`.`unit` as string) AS `unit_b`, cast(cast(`tmp1`.`is_missing_t` as int) as string) AS `is_missing_t_a`, cast(cast(`tmp2`.`is_missing_t` as int) as string) AS `is_missing_t_b` FROM `data-prod`.`prod_schema`.`fct_tbl_info``tmp1` FULL OUTER JOIN
`data-prod`.`dbt_pr_test_ci_schema`.`fct_tbl_info``tmp2` ON (`tmp1`.`col_id` = `tmp2`.`col_id`)) tmp3 WHERE ((`is_diff_col_id` =1) OR (`is_diff_org_id` =1) OR (`is_diff_is_error` =1) OR (`is_diff_is_inc_error` =1) OR (`is_diff_is_non_inc_error` =1) OR (`is_diff_created_at` =1) OR (`is_diff_is_x` =1) OR (`is_diff_is_inc` =1) OR (`is_diff_is_x_error` =1) OR (`is_diff_unit` = 1) OR (`is_diff_is_missing_t` = 1))
15:44:41 INFO Diffing complete: ('data-prod', 'prod_schema', 'int_table') <> ('data-prod', 'dbt_pr_test_ci_schema', 'int_table') joindiff_tables.py:165
{"status": "success", "result": "identical", "model": "model.dbt.int_table", "dataset1": ["data-prod", "prod_schema", "int_table"], "dataset2": ["data-prod", "dbt_pr_test_ci_schema", "int_table"], "rows": {"exclusive": {"dataset1": [], "dataset2": []}, "diff": []}, "summary": {"rows": {"total": {"dataset1": 27346, "dataset2": 27346},
"exclusive": {"dataset1": 0, "dataset2": 0}, "updated": 0, "unchanged": 27346}, "stats": {"diffCounts": {"col_value": 0, "org_id": 0, "col_combined_value": 0, "col_mtd": 0, "sf_id": 0, "col_ch_prob": 0, "col_name": 0, "col_pl": 0, "col_type": 0}}}, "columns": {"dataset1": [{"name": "col_id", "type": "INT64", "kind": "integer"}, {"name": "org_id", "type": "INT64", "kind": "integer"}, {"name": "sf_id", "type":
"STRING", "kind": "unsupported"}, {"name": "col_name", "type": "STRING", "kind": "unsupported"}, {"name": "col_type", "type": "STRING", "kind": "unsupported"}, {"name": "col_value", "type": "FLOAT64", "kind": "float"}, {"name": "col_combined_value", "type": "FLOAT64", "kind": "float"}, {"name": "col_mtd", "type": "STRING", "kind": "unsupported"}, {"name": "col_pl", "type": "STRING", "kind": "unsupported"}, {"name": "col_ch_prob", "type": "FLOAT64",
"kind": "float"}], "dataset2": [{"name": "col_id", "type": "INT64", "kind": "integer"}, {"name": "org_id", "type": "INT64", "kind": "integer"}, {"name": "sf_id", "type": "STRING", "kind": "unsupported"}, {"name": "col_name", "type": "STRING", "kind": "unsupported"}, {"name": "col_type", "type": "STRING", "kind": "unsupported"}, {"name": "col_value", "type": "FLOAT64", "kind": "float"}, {"name": "col_combined_value", "type": "FLOAT64", "kind": "float"},
{"name": "col_mtd", "type": "STRING", "kind": "unsupported"}, {"name": "col_pl", "type": "STRING", "kind": "unsupported"}, {"name": "col_ch_prob", "type": "FLOAT64", "kind": "float"}], "primaryKey": ["col_id"], "exclusive": {"dataset1": [], "dataset2": []}, "typeChanged": []}, "version": "1.1.0"}
15:45:22 INFO Diffing complete: ('data-prod', 'prod_schema', 'fct_tbl_info') <> ('data-prod', 'dbt_pr_test_ci_schema', 'fct_tbl_info') joindiff_tables.py:165
⠼
In Progress fct_tbl_info
In Progress int_table
Diffing complete: ('data-prod', 'prod_schema', 'fct_tbl
The last line of the output is also truncated: it is cut off before the table name is complete.
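Since the output above interleaves log lines, SQL text, spinner frames, and JSON records, a small filter can help check which JSON records were actually emitted before the run stalled. This is only a sketch under the assumption that each JSON record sits on a single line starting with `{` (in the paste above, long records are wrapped across lines, so they would need to be rejoined first). The function name `extract_json_records` is hypothetical.

```python
import json


def extract_json_records(lines):
    """Return the dicts parsed from lines that hold a complete JSON object."""
    records = []
    for line in lines:
        text = line.strip()
        if not text.startswith("{"):
            continue  # log line, spinner frame, or SQL text
        try:
            value = json.loads(text)
        except json.JSONDecodeError:
            continue  # truncated or wrapped record; skipped in this sketch
        if isinstance(value, dict):
            records.append(value)
    return records
```

Applied to a captured run, this would show a record for int_table and the two "No primary key found" failures, which makes it easier to see that no record for fct_tbl_info appears in the pasted output.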
Describe the environment
I'm using macOS 14.3.1
This issue has been marked as stale because it has been open for 60 days with no activity. If you would like the issue to remain open, please comment on the issue and it will be added to the triage queue. Otherwise, it will be closed in 7 days.
I'm sorry for the delay in responding. Thank you for trying out data-diff and for opening this issue!
We made a hard decision to sunset the data-diff package and won't provide further development or support. Diffing functionality will continue to be available in Datafold Cloud. Feel free to take it for a trial or contact us at [email protected] if you have any questions.