Explicitly set Delta table props to accommodate for different defaults [databricks] #11970

Merged: 11 commits, Jan 31, 2025
17 changes: 9 additions & 8 deletions integration_tests/src/main/python/delta_lake_utils.py
@@ -1,4 +1,4 @@
-# Copyright (c) 2023, NVIDIA CORPORATION.
+# Copyright (c) 2023-2025, NVIDIA CORPORATION.
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -16,7 +16,7 @@
 import os.path
 import re

-from spark_session import is_databricks122_or_later
+from spark_session import is_databricks122_or_later, supports_delta_lake_deletion_vectors

 delta_meta_allow = [
     "DeserializeToObjectExec",
@@ -157,12 +157,13 @@ def setup_delta_dest_table(spark, path, dest_table_func, use_cdf, partition_colu
     dest_df = dest_table_func(spark)
     writer = dest_df.write.format("delta")
     ddl = schema_to_ddl(spark, dest_df.schema)
-    table_properties = {}
-    if use_cdf:
-        table_properties['delta.enableChangeDataFeed'] = 'true'
-    if enable_deletion_vectors:
-        table_properties['delta.enableDeletionVectors'] = 'true'
-    if len(table_properties) > 0:
+    table_properties = {
+        'delta.enableChangeDataFeed': str(use_cdf).lower(),
+    }
Collaborator

This is much cleaner.

+    if supports_delta_lake_deletion_vectors():
+        table_properties['delta.enableDeletionVectors'] = str(enable_deletion_vectors).lower()

gerashegalov marked this conversation as resolved.
+    if supports_delta_lake_deletion_vectors():
         # if any table properties are specified then we need to use SQL to define the table
         sql_text = "CREATE TABLE delta.`{path}` ({ddl}) USING DELTA".format(path=path, ddl=ddl)
         if partition_columns:
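
For readers skimming the hunk, here is a minimal standalone sketch of the idea behind the change: every supported property is written out as an explicit `'true'` or `'false'`, so the resulting table does not depend on whatever default the platform applies. The helper names `build_table_properties` and `render_create_table` are hypothetical and are not part of `delta_lake_utils.py`; the boolean `supports_deletion_vectors` stands in for the real `supports_delta_lake_deletion_vectors()` check, and the `TBLPROPERTIES` rendering is an assumption, since that part of the function is collapsed in the diff.

```python
def build_table_properties(use_cdf, enable_deletion_vectors, supports_deletion_vectors):
    """Hypothetical helper: always emit explicit 'true'/'false' values.

    `supports_deletion_vectors` is a plain bool standing in for the
    supports_delta_lake_deletion_vectors() check used in the PR.
    """
    props = {'delta.enableChangeDataFeed': str(use_cdf).lower()}
    if supports_deletion_vectors:
        # Only set the property on runtimes that understand it.
        props['delta.enableDeletionVectors'] = str(enable_deletion_vectors).lower()
    return props


def render_create_table(path, ddl, props):
    """Hypothetical helper: render the CREATE TABLE statement.

    The TBLPROPERTIES clause below is an assumed rendering, not copied
    from the PR.
    """
    sql_text = "CREATE TABLE delta.`{path}` ({ddl}) USING DELTA".format(path=path, ddl=ddl)
    if props:
        rendered = ", ".join("{} = {}".format(k, v) for k, v in props.items())
        sql_text += " TBLPROPERTIES ({})".format(rendered)
    return sql_text


if __name__ == "__main__":
    props = build_table_properties(use_cdf=False,
                                   enable_deletion_vectors=False,
                                   supports_deletion_vectors=True)
    print(render_create_table("/tmp/delta_dest", "a INT, b STRING", props))
```

Compared with the removed code, which only added a key when a flag was truthy, this version also states the `false` case explicitly. That matters when an environment turns a feature on by default.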