Is your idea related to a problem? Please describe.
We currently use wrangler.s3.to_parquet(mode="overwrite_partitions") to write our data to Glue tables. We run daily pipelines that each write to a daily partition, and when we want to backfill the table with new columns we overwrite those partitions.
We want to start using Iceberg as our table format for its schema evolution, but we cannot achieve the same overwrite behavior with wrangler.athena.to_iceberg().
Describe the solution you'd like
wrangler.athena.to_iceberg(mode="overwrite_partition" or "update") would be nice, or maybe some other solution we can use now. I have been thinking about executing an "INSERT OVERWRITE" statement and filtering the overwrite to the partition I want to replace.
EDIT
After looking at the source code I saw the merge_cols option in to_iceberg. This seems like a good fit, but it lacks the ability to delete: it either updates or inserts. What happens when there are rows in the original table that are not in the new data? We want those rows deleted.
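To make the request concrete, here is a rough sketch of the workaround I have in mind: delete the target partitions with an Athena DELETE statement, then append the new data with to_iceberg. This is not an existing wrangler feature. The helper names (build_partition_delete_sql, overwrite_iceberg_partitions) are hypothetical, and the sketch assumes the partition column holds datetime.date values and that Athena accepts DELETE on the Iceberg table. As far as I know, INSERT OVERWRITE itself is not available for Iceberg tables in Athena, which is why the sketch uses DELETE instead. The two steps are not atomic, so a reader could see the partition empty in between.

```python
from datetime import date
from typing import Any, Iterable


def build_partition_delete_sql(table: str, partition_col: str, values: Iterable[date]) -> str:
    """Build a DELETE statement that clears the given daily partitions (hypothetical helper)."""
    # De-duplicate and sort so the generated SQL is deterministic.
    literals = ", ".join(f"DATE '{v.isoformat()}'" for v in sorted(set(values)))
    if not literals:
        raise ValueError("no partition values given")
    return f'DELETE FROM "{table}" WHERE "{partition_col}" IN ({literals})'


def overwrite_iceberg_partitions(
    df: Any, database: str, table: str, partition_col: str, temp_path: str
) -> None:
    """Delete the partitions present in df, then append df (hypothetical helper, not atomic)."""
    import awswrangler as wr  # imported lazily so the SQL builder can be used on its own

    # Assumption: df[partition_col] contains datetime.date objects.
    sql = build_partition_delete_sql(table, partition_col, set(df[partition_col]))

    # Step 1: remove every existing row in the partitions we are about to rewrite.
    wr.athena.start_query_execution(sql=sql, database=database, wait=True)

    # Step 2: append the new rows; without merge_cols this is a plain insert.
    wr.athena.to_iceberg(
        df=df,
        database=database,
        table=table,
        temp_path=temp_path,
        partition_cols=[partition_col],
    )
```

Because the whole partition is cleared first, rows that exist in the table but not in the new data disappear, which is the part merge_cols does not cover.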