Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug] Changing Incremental Model Partitions on Bigquery #539

Open
2 tasks done
jennlw opened this issue Sep 20, 2024 · 1 comment
Open
2 tasks done

[Bug] Changing Incremental Model Partitions on Bigquery #539

jennlw opened this issue Sep 20, 2024 · 1 comment
Assignees
Labels
feature:incremental Issues related to incremental materializations help_wanted Extra attention is needed pkg:dbt-bigquery Issue affects dbt-bigquery type:bug Something isn't working as documented

Comments

@jennlw
Copy link

jennlw commented Sep 20, 2024

Is this a new bug in dbt-bigquery?

  • I believe this is a new bug in dbt-bigquery
  • I have searched the existing issues, and I could not find an existing issue for this bug

Current Behavior

Recently ran into an issue with Bigquery and incremental model partitions.

When the partition of an existing incremental model needs to be changed, the dbt full refresh command runs a create or replace statement against the model, which fails with the error:

Incompatible table partitioning specification. Expected partitioning specification interval(type:,field:<field_1>), but input partitioning specification is interval(type:,field:<field_2>)

This indicates that despite the replace statement, Bigquery expects the same partitioning specification as the original table.
We found that running a drop statement followed by a create statement succeeds with no errors.

Ideally dbt would be able to generate these statements on full refresh.

Please let us know your thoughts!

Expected Behavior

Full refresh should be able to generate a drop/create on full refresh.

Steps To Reproduce

Have incremental model partitions
When the partition of this model needs to be changed, get error when dbt full refresh command runs create/replace against the model.

Relevant log output

This is reported by the customer. I will ask them to subscribe to this Bug so that they can add comments if additional questions/input are needed.

Environment

- OS:
- Python:
- dbt-core:
- dbt-bigquery:

Additional Context

No response

@jennlw jennlw added type:bug Something isn't working as documented triage:product In Product's queue labels Sep 20, 2024
@amychen1776 amychen1776 added feature:incremental Issues related to incremental materializations help_wanted Extra attention is needed and removed triage:product In Product's queue labels Oct 1, 2024
@mikealfare mikealfare added the pkg:dbt-bigquery Issue affects dbt-bigquery label Jan 14, 2025
@mikealfare mikealfare transferred this issue from dbt-labs/dbt-bigquery Jan 14, 2025
@adrianburusdbt
Copy link
Contributor

Hey,
I am trying to replicate the error but so far I have not been able to. Here is are the steps I've been doing, please let me know if something is not right

  1. Created an incremental model using event_id as partition field called model1:
{{
    config(
        materialized='incremental',
        partition_by={
            "field": "event_date",
            "data_type": "date"
        },
        unique_key='event_id'
    )
}}

WITH sample_data AS (
    SELECT
        1 AS event_id,
        'user_123' AS user_id,
        TIMESTAMP('2024-01-30 12:00:00') AS event_timestamp,
        DATE(TIMESTAMP('2024-01-30 12:00:00')) AS event_date,
        '{"action": "login"}' AS event_data
    UNION ALL
    SELECT
        2,
        'user_456',
        TIMESTAMP('2024-01-30 13:15:00'),
        DATE(TIMESTAMP('2024-01-30 13:15:00')),
        '{"action": "purchase", "amount": 50}'
    UNION ALL
    SELECT
        3,
        'user_789',
        TIMESTAMP('2024-01-30 14:30:00'),
        DATE(TIMESTAMP('2024-01-30 14:30:00')),
        '{"action": "logout"}'
)

SELECT
    event_id,
    user_id,
    event_timestamp,
    DATE(event_timestamp) AS event_date,
    event_data
FROM sample_data

{% if is_incremental() %}
WHERE event_date > COALESCE((SELECT MAX(event_date) FROM {{ this }}), DATE('1900-01-01'))
{% endif %}
  1. Run dbt run --full-refresh --select model1 -> this generated the model1 table correctly in BQ with event_date as partition field
  2. Change model1 to use event_timestamp as partition field
  3. Run dbt run --full-refresh --select model1 again -> the model1 table in BQ is updated to use event_timestamp as partition field

Can you please confirm that the steps above are the same as yours? Thanks!

@adrianburusdbt adrianburusdbt self-assigned this Jan 30, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature:incremental Issues related to incremental materializations help_wanted Extra attention is needed pkg:dbt-bigquery Issue affects dbt-bigquery type:bug Something isn't working as documented
Projects
None yet
Development

No branches or pull requests

4 participants