Explicitly set Delta table props to accommodate for different defaults [databricks] #11970
@@ -1,4 +1,4 @@
-# Copyright (c) 2023, NVIDIA CORPORATION.
+# Copyright (c) 2023-2025, NVIDIA CORPORATION.
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -157,12 +157,15 @@ def setup_delta_dest_table(spark, path, dest_table_func, use_cdf, partition_colu
     dest_df = dest_table_func(spark)
     writer = dest_df.write.format("delta")
     ddl = schema_to_ddl(spark, dest_df.schema)
-    table_properties = {}
-    if use_cdf:
-        table_properties['delta.enableChangeDataFeed'] = 'true'
-    if enable_deletion_vectors:
-        table_properties['delta.enableDeletionVectors'] = 'true'
-    if len(table_properties) > 0:
+    table_properties = {
+        'delta.enableChangeDataFeed': str(use_cdf).lower(),
+    }
+    # prevent on 11.3: Unknown configuration was specified: delta.enableDeletionVectors
+    # TODO OSS Delta 2.3+
+    if is_databricks122_or_later():
+        table_properties['delta.enableDeletionVectors'] = str(enable_deletion_vectors).lower()
+
+    if use_cdf or enable_deletion_vectors:
         # if any table properties are specified then we need to use SQL to define the table
         sql_text = "CREATE TABLE delta.`{path}` ({ddl}) USING DELTA".format(path=path, ddl=ddl)
         if partition_columns:
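For context, a minimal sketch of why pinning both properties to an explicit `'true'`/`'false'` string matters: with the old code a `False` flag produced no property at all, so the table silently inherited whatever default the runtime ships, while the new code fixes the value at table-creation time. The `build_table_properties` and `render_tblproperties` helpers below are illustrative only; the real helper builds the `sql_text` shown in the diff and presumably appends a TBLPROPERTIES clause from the dict.

```python
# Illustrative sketch only -- not the integration test's real code path.
# The property keys and the Databricks 12.2 gate come from the diff above;
# everything else here is a hypothetical stand-in for demonstration.
def build_table_properties(use_cdf, enable_deletion_vectors, databricks_122_or_later):
    # New behavior: CDF is always pinned to an explicit 'true'/'false' so the
    # table cannot silently pick up a different runtime default.
    props = {'delta.enableChangeDataFeed': str(use_cdf).lower()}
    # Deletion vectors are only a recognized property on newer runtimes
    # (Databricks 12.2+, OSS Delta 2.3+ per the TODO), so only pin it there.
    if databricks_122_or_later:
        props['delta.enableDeletionVectors'] = str(enable_deletion_vectors).lower()
    return props

def render_tblproperties(props):
    # Hypothetical rendering of the dict into a CREATE TABLE clause.
    pairs = ", ".join("'{}' = '{}'".format(k, v) for k, v in sorted(props.items()))
    return "TBLPROPERTIES ({})".format(pairs)

# The old logic emitted nothing for use_cdf=False / enable_deletion_vectors=False,
# leaving the runtime default in charge; the new logic pins both explicitly:
print(render_tblproperties(build_table_properties(False, False, True)))
# TBLPROPERTIES ('delta.enableChangeDataFeed' = 'false', 'delta.enableDeletionVectors' = 'false')
```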
Inline review discussion on the hunk above:

Reviewer: What happens if we enable deletion_vectors and use_cdf is False, but we are not on Databricks 12.2 or later? I know that this didn't really change from the previous code. It is just confusing to me and I would like to understand better what is happening. nit: Also why is […]

Author: We cannot enable_deletion_vectors before 12.2, hence all tests where enable_deletion_vectors may be True are […]

re: write.mode("append"). I did not want to introduce changes beyond the absolute minimum.

Reviewer: This is much cleaner.
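On the question above: before Databricks 12.2 the new code simply never sets `delta.enableDeletionVectors`, so the table falls back to the runtime default; the (truncated) reply suggests that test parameterizations which actually turn deletion vectors on are restricted to runtimes that support them. Below is a hedged sketch of what such gating can look like in pytest; the test name, parameter wiring, and the local `is_databricks122_or_later()` stand-in are illustrative, not the repo's code.

```python
import pytest

def is_databricks122_or_later():
    # Stand-in for the helper referenced in the diff; the real one lives in the
    # integration-test utilities and inspects the Databricks runtime version.
    return False

# Offer True as a parameter value only where the runtime understands the property.
deletion_vector_params = [False] + ([True] if is_databricks122_or_later() else [])

@pytest.mark.parametrize("enable_deletion_vectors", deletion_vector_params)
def test_delta_write_round_trip(enable_deletion_vectors, tmp_path):
    # Placeholder body: the real tests would call setup_delta_dest_table(...)
    # with these flags and compare CPU vs GPU results.
    assert enable_deletion_vectors in (True, False)
```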