Credentials do not renew automatically #26

Open
TheEdgeOfRage opened this issue Feb 2, 2024 · 6 comments
@TheEdgeOfRage

TheEdgeOfRage commented Feb 2, 2024

We use k8s ServiceAccounts that assume IAM roles using OIDC and run DuckDB inside a k8s pod with the proper service account set. On startup, the following commands are sent to DuckDB:

INSTALL 'httpfs'
LOAD 'httpfs'
INSTALL 'aws'
LOAD 'aws'
CALL load_aws_credentials()

Our pod then serves requests and fetches data from S3 on demand, so it may be asked to read from S3 long after pod startup. By then the STS token has expired, so DuckDB fails to get the data from S3. What's the best approach to renew these credentials? Ideally, could this plugin be updated to renew them automatically once they expire?

@samansmink
Collaborator

Thanks for reporting. Token refresh is not yet supported, and it is definitely something to consider.
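
Until refresh is supported, one possible client-side workaround is to re-run the same `CALL load_aws_credentials()` statement from the startup sequence whenever the loaded credentials are older than some threshold. The sketch below is only an illustration, not part of the extension: `RefreshingConnection`, `max_age_seconds`, and the 15-minute default are hypothetical names and values, and it assumes that calling `load_aws_credentials()` again picks up a fresh token from the credential chain.

```python
import time


class RefreshingConnection:
    """Hypothetical wrapper: reloads AWS credentials before a query when they are stale."""

    def __init__(self, con, max_age_seconds=900.0, clock=time.monotonic):
        self._con = con                  # any object with an execute(query, *args) method
        self._max_age = max_age_seconds  # arbitrary default; pick less than the token lifetime
        self._clock = clock              # injectable so the behaviour can be tested
        self._loaded_at = None           # None means credentials were never loaded

    def _refresh_if_stale(self):
        now = self._clock()
        if self._loaded_at is None or now - self._loaded_at >= self._max_age:
            # Same statement the issue runs at startup. Assumption: re-running it
            # replaces the expired credentials with freshly resolved ones.
            self._con.execute("CALL load_aws_credentials()")
            self._loaded_at = now

    def execute(self, query, *args):
        """Refresh credentials if needed, then forward the query unchanged."""
        self._refresh_if_stale()
        return self._con.execute(query, *args)
```

With the real library this would presumably wrap the object returned by `duckdb.connect()`, after the `httpfs` and `aws` extensions are loaded. The wrapper does no locking, so sharing one instance across threads would need extra care.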

@CrashLaker

CrashLaker commented Mar 11, 2024

Hi @samansmink,
Could you please explain in a bit more detail how DuckDB's threads use the credentials when running on a single machine?
This relates to the question I asked in duckdb/duckdb#10996.

There, I have one machine spawning many threads to read many S3 objects. My problem is that this breaks after some time with duckdb.duckdb.HTTPException: HTTP Error: HTTP GET error on 'https://bucket/prefix/filename.json' (HTTP 500).

My question is: when each thread starts, does it fetch the credentials only once to then load the files? Or could I, for example, run another process in the background that updates the credentials in ~/.aws/config?

Regards, c.

@samansmink
Collaborator

Hi @CrashLaker, sure!

does it fetch the credentials only once to

Yes, it does this once, on secret creation.

I'm not sure what the 500 error is; the AWS docs are not very clear: https://repost.aws/knowledge-center/http-5xx-errors-s3. Could it maybe be a throttling thing?

You could try adjusting the http_retries settings here: https://duckdb.org/docs/configuration/overview#configuration-reference
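
As a rough illustration of the retry settings mentioned above, the sketch below builds `SET` statements for `http_retries`, `http_retry_wait_ms`, and `http_retry_backoff`, and estimates the resulting wait times. The helper names and the chosen values are hypothetical, and the schedule assumes each retry multiplies the previous wait by the backoff factor, which may not match DuckDB's exact behaviour.

```python
def retry_settings_sql(retries=5, wait_ms=200, backoff=2.0):
    """Build SET statements for DuckDB's HTTP retry settings (values are examples)."""
    return [
        f"SET http_retries = {int(retries)}",
        f"SET http_retry_wait_ms = {int(wait_ms)}",
        f"SET http_retry_backoff = {float(backoff)}",
    ]


def backoff_schedule(retries, wait_ms, backoff):
    """Estimated wait in ms before each retry, assuming geometric growth."""
    return [wait_ms * backoff ** attempt for attempt in range(retries)]
```

Each returned statement could then be passed to the connection's `execute` method before issuing the S3 reads.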

@CrashLaker

Hi @samansmink,

Thank you for your reply.
After I sent you the message, I realized that the 500 status code wasn't actually due to expiration (that would be a 403), so I started looking for ways to tweak the exponential backoff setting.

You've pointed me to the right place in the docs. Thank you so much! I'll try it.

Regards, c.

@valerydluski

@samansmink Are there any updates on this issue?

@samansmink
Collaborator

No, sorry, I can't give a timeline for automatic credential refreshing at this point.
