feat: add support for S3 Extra arguments in Seed file upload to s3 (#397)

Co-authored-by: Julien Kervizic <[email protected]>
Co-authored-by: Serhii Dimchenko <[email protected]>
3 people authored Sep 11, 2023
1 parent b1c44e9 commit 9d029bb
Showing 6 changed files with 13 additions and 1 deletion.
3 changes: 3 additions & 0 deletions README.md
@@ -88,6 +88,7 @@ A dbt profile can be configured to run against AWS Athena using the following co
| aws_profile_name | Profile to use from your AWS shared credentials file. | Optional | `my-profile` |
| work_group | Identifier of Athena workgroup | Optional | `my-custom-workgroup` |
| num_retries | Number of times to retry a failing query | Optional | `3` |
| seed_s3_upload_args | Dictionary containing boto3 ExtraArgs when uploading to S3 | Optional | `{"ACL": "bucket-owner-full-control"}` |
| lf_tags_database | Default LF tags for new database if it's created by dbt | Optional | `tag_key: tag_value` |

**Example profiles.yml entry:**
@@ -105,6 +106,8 @@ athena:
database: awsdatacatalog
aws_profile_name: my-profile
work_group: my-workgroup
seed_s3_upload_args:
ACL: bucket-owner-full-control
```
_Additional information_
2 changes: 2 additions & 0 deletions dbt/adapters/athena/connections.py
@@ -59,6 +59,7 @@ class AthenaCredentials(Credentials):
num_retries: Optional[int] = 5
s3_data_dir: Optional[str] = None
s3_data_naming: Optional[str] = "schema_table_unique"
seed_s3_upload_args: Optional[Dict[str, Any]] = None
# Unfortunately we cannot just use dict; it must be Dict, because otherwise we get the following error:
# Credentials in profile "athena", target "athena" invalid: Unable to create schema for 'dict'
lf_tags_database: Optional[Dict[str, str]] = None
@@ -86,6 +87,7 @@ def _connection_keys(self) -> Tuple[str, ...]:
"s3_data_dir",
"s3_data_naming",
"debug_query_state",
"seed_s3_upload_args",
"lf_tags_database",
)

3 changes: 2 additions & 1 deletion dbt/adapters/athena/impl.py
@@ -314,6 +314,7 @@ def upload_seed_to_s3(
s3_data_dir: Optional[str] = None,
s3_data_naming: Optional[str] = None,
external_location: Optional[str] = None,
seed_s3_upload_args: Optional[Dict[str, Any]] = None,
) -> str:
conn = self.connections.get_thread_connection()
client = conn.handle
@@ -332,7 +333,7 @@ def upload_seed_to_s3(
# This ensures cross-platform support, tempfile.NamedTemporaryFile does not
tmpfile = os.path.join(tempfile.gettempdir(), os.urandom(24).hex())
table.to_csv(tmpfile, quoting=csv.QUOTE_NONNUMERIC)
- s3_client.upload_file(tmpfile, bucket, object_name)
+ s3_client.upload_file(tmpfile, bucket, object_name, ExtraArgs=seed_s3_upload_args)
os.remove(tmpfile)

return str(s3_location)
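
To see the effect of this hunk in isolation, here is a minimal sketch rather than the adapter's actual code. `RecordingS3Client`, `upload_csv`, the bucket name, and the object key are hypothetical names used only for illustration. The sketch assumes just that the client exposes `upload_file(Filename, Bucket, Key, ExtraArgs=None)`, as the boto3 S3 client does, and that passing `ExtraArgs=None` is equivalent to omitting it. It also wraps the temp-file cleanup in `try/finally`, which the commit itself does not do.

```python
import os
import tempfile
from typing import Any, Dict, List, Optional, Tuple


class RecordingS3Client:
    """Hypothetical stand-in for a boto3 S3 client; it only records calls."""

    def __init__(self) -> None:
        self.calls: List[Tuple[str, str, Optional[Dict[str, Any]]]] = []

    def upload_file(
        self,
        Filename: str,
        Bucket: str,
        Key: str,
        ExtraArgs: Optional[Dict[str, Any]] = None,
    ) -> None:
        # Same keyword names as boto3; no network access happens here.
        assert os.path.exists(Filename)
        self.calls.append((Bucket, Key, ExtraArgs))


def upload_csv(
    client: RecordingS3Client,
    csv_text: str,
    bucket: str,
    key: str,
    seed_s3_upload_args: Optional[Dict[str, Any]] = None,
) -> str:
    """Write csv_text to a temp file and upload it, forwarding the extra args."""
    tmpfile = os.path.join(tempfile.gettempdir(), os.urandom(24).hex())
    try:
        with open(tmpfile, "w", newline="") as fh:
            fh.write(csv_text)
        # The one-line change in the commit: forward the dict as ExtraArgs.
        client.upload_file(tmpfile, bucket, key, ExtraArgs=seed_s3_upload_args)
    finally:
        # Assumption of this sketch: always clean up, even if the upload fails.
        if os.path.exists(tmpfile):
            os.remove(tmpfile)
    return f"s3://{bucket}/{key}"


client = RecordingS3Client()
location = upload_csv(
    client,
    "id,name\n1,a\n",
    "my-bucket",
    "seeds/example.csv",
    {"ACL": "bucket-owner-full-control"},
)
```

With a real boto3 client, the same dictionary would reach S3 as upload parameters, so a value such as `{"ACL": "bucket-owner-full-control"}` applies to every seed file uploaded this way.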
2 changes: 2 additions & 0 deletions dbt/include/athena/macros/materializations/seeds/helpers.sql
@@ -96,6 +96,7 @@
{%- set s3_data_dir = config.get('s3_data_dir', default=target.s3_data_dir) -%}
{%- set s3_data_naming = config.get('s3_data_naming', target.s3_data_naming) -%}
{%- set external_location = config.get('external_location', default=none) -%}
{%- set seed_s3_upload_args = config.get('seed_s3_upload_args', default=target.seed_s3_upload_args) -%}

{%- set tmp_relation = api.Relation.create(
identifier=identifier + "__dbt_tmp",
@@ -110,6 +111,7 @@
s3_data_dir,
s3_data_naming,
external_location,
seed_s3_upload_args=seed_s3_upload_args
) -%}

-- create target relation
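
The lookup added to this macro, `config.get('seed_s3_upload_args', default=target.seed_s3_upload_args)`, gives a seed-level setting precedence over the profile-level one. The Python sketch below illustrates that precedence with plain dictionaries. `resolve_seed_s3_upload_args` is a hypothetical helper, not part of dbt or the adapter, and it only models the case where the key is missing from the seed config. How dbt treats an explicit null is not covered here.

```python
from typing import Any, Dict, Optional


def resolve_seed_s3_upload_args(
    seed_config: Dict[str, Any], target: Dict[str, Any]
) -> Optional[Dict[str, Any]]:
    """Return the seed-level value if present, else the profile-level value."""
    # No merging takes place: a seed-level dict replaces the profile-level dict.
    return seed_config.get("seed_s3_upload_args", target.get("seed_s3_upload_args"))


profile_target = {"seed_s3_upload_args": {"ACL": "bucket-owner-full-control"}}

# Seed without its own setting: falls back to the profile.
from_profile = resolve_seed_s3_upload_args({}, profile_target)

# Seed with its own setting: the profile value is ignored entirely.
from_seed = resolve_seed_s3_upload_args(
    {"seed_s3_upload_args": {"ServerSideEncryption": "aws:kms"}}, profile_target
)
```

Because the seed-level value replaces the profile-level dictionary instead of being merged into it, a seed that overrides one argument would need to restate any profile-level arguments it still wants.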
3 changes: 3 additions & 0 deletions dbt/include/athena/profile_template.yml
@@ -17,6 +17,9 @@ prompts:
hint: Specify the database (Data catalog) to build models into (lowercase only)
default: awsdatacatalog

seed_s3_upload_args:
hint: Specify any extra arguments to use in the S3 Upload, e.g. ACL, SSEKMSKeyId

threads:
hint: '1 or more'
type: 'int'
1 change: 1 addition & 0 deletions dev-requirements.txt
@@ -1,5 +1,6 @@
autoflake~=1.7
black~=23.9
boto3-stubs[s3]~=1.28
dbt-tests-adapter~=1.6.2
flake8~=6.1
Flake8-pyproject~=1.2