Destination S3 Data Lake: Force schema change in truncate sync #53216

Merged: edgao merged 18 commits into master from edgao/iceberg_force_schema_update_on_truncate on Feb 12, 2025

edgao (Contributor) commented Feb 7, 2025

closes https://github.com/airbytehq/airbyte-internal-issues/issues/11611

old behavior:

  • the sync crashes on any incompatible schema change
  • schema changes are applied at the start of the sync

new behavior:

  • in a single-generation truncate sync (i.e. a "truncate refresh" or a "clear"):
    • instead of finding a supertype, use the new type as the schema
    • write data using the new schema
    • don't commit the schema change until the end of the sync
  • in a non-truncate sync, keep the old behavior
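
The branching described above can be sketched with a small, self-contained model. This is not the connector's code: `WIDENINGS`, `supertype`, `target_schema`, `run_sync`, and `Table` are hypothetical names, the type lattice is a toy, and keeping columns that are absent from the incoming schema in the non-truncate path is an assumption made only for illustration.

```python
from dataclasses import dataclass

# Hypothetical, simplified widening rules: each type maps to the types it can become.
WIDENINGS = {
    "int": {"int", "long"},
    "long": {"long"},
    "float": {"float", "double"},
    "double": {"double"},
    "string": {"string"},
}


class IncompatibleSchemaChange(Exception):
    """Raised when no common supertype exists for an old and a new column type."""


def supertype(old, new):
    """Return a type able to hold both old and new values, or raise."""
    if new in WIDENINGS[old]:
        return new
    if old in WIDENINGS[new]:
        return old
    raise IncompatibleSchemaChange(f"{old} -> {new}")


def target_schema(existing, incoming, is_truncate):
    """Pick the schema to write with."""
    if is_truncate:
        # Truncate: old data is discarded, so the incoming schema is used as-is.
        return dict(incoming)
    # Non-truncate: merge column by column, failing on incompatible changes.
    merged = dict(existing)  # assumption: columns missing from `incoming` are kept
    for name, new_type in incoming.items():
        merged[name] = supertype(existing[name], new_type) if name in existing else new_type
    return merged


@dataclass
class Table:
    schema: dict  # stands in for the committed table schema


def run_sync(table, incoming, is_truncate, write):
    """Write with the target schema; commit it at the start or at the end of the sync."""
    schema = target_schema(table.schema, incoming, is_truncate)
    if not is_truncate:
        table.schema = schema  # non-truncate: schema change applied up front
    write(schema)
    if is_truncate:
        table.schema = schema  # truncate: schema change committed only at the end
```

In this model, a truncate sync that changes `id` from `int` to `string` succeeds and leaves the committed schema untouched while data is being written, whereas the same change in a non-truncate sync raises `IncompatibleSchemaChange` before anything is committed.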

edgao force-pushed the edgao/iceberg_force_schema_update_on_truncate branch 2 times, most recently from 2eca3fa to cee47ac, on February 10, 2025 22:33
edgao requested a review from frifriSF59 on February 11, 2025 17:30
edgao marked this pull request as ready for review on February 11, 2025 17:30
edgao requested a review from a team as a code owner on February 11, 2025 17:30

return table.schema()
// `apply` just validates that the schema change is valid, it doesn't actually commit().
// It returns the schema that the table _would_ have after committing.

Thanks for this comment. This is confusing
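
The distinction drawn in the code comment above (`apply` validates and returns the schema the table would have, while only `commit` changes the table) can be illustrated with a toy model. `ToyTable` and `PendingSchemaUpdate` are hypothetical classes written for this sketch; they are not the library's API and only mimic the apply/commit split that the comment describes.

```python
class ToyTable:
    """Stand-in for a table with a committed schema (column name -> type)."""

    def __init__(self, schema):
        self.schema = dict(schema)


class PendingSchemaUpdate:
    """Collects column changes without touching the table until commit()."""

    def __init__(self, table):
        self._table = table
        self._changes = {}

    def set_column(self, name, type_):
        self._changes[name] = type_
        return self  # allow chaining

    def apply(self):
        # Compute the schema the table WOULD have; the table itself is not modified.
        return {**self._table.schema, **self._changes}

    def commit(self):
        # Only here does the table's committed schema actually change.
        self._table.schema = self.apply()


table = ToyTable({"id": "int"})
update = PendingSchemaUpdate(table).set_column("id", "string")

preview = update.apply()      # would-be schema
unchanged = dict(table.schema)  # committed schema is still the old one
update.commit()               # now the change is committed
```

Keeping the pending update around and calling `commit()` only at the end of the sync is what lets a writer use the previewed schema while the committed one stays untouched.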

edgao enabled auto-merge (squash) on February 12, 2025 18:44
octavia-squidington-iii added the area/documentation label on Feb 12, 2025
edgao merged commit 3985e69 into master on Feb 12, 2025
29 checks passed
edgao deleted the edgao/iceberg_force_schema_update_on_truncate branch on February 12, 2025 18:57
Labels: area/connectors, area/documentation, CDK, connectors/destination/s3-data-lake
3 participants