Vine: Manage Expiring Credentials #3777

Open
dthain opened this issue Apr 23, 2024 · 3 comments

Comments

@dthain
Member

dthain commented Apr 23, 2024

Background:
In the prototype analysis facility, tasks need a credential, stored in a file, to present to XRootD. When a user logs into Coffea-CASA, the token is created automatically and then renewed automatically every ~30min. The surrounding environment rewrites the file containing the credential on each renewal.

Current Behavior: TaskVine assumes files are immutable and complains when one changes unexpectedly. At the moment, workers that already hold the old file keep it cached, while new workers get the new file. As a result, tasks running on workers with the old credential start to fail.

Assumptions:

  • Individual task runtimes are less than the credential lifetime. (Is this a good assumption?)

Outcome Desired:

  • Every time a task is dispatched, it should get the most recent version of the credential.
  • A running task should not need an existing credential updated.
  • It is tolerable if a running task fails due to an expired credential, as long as we can discern that reason and retry it.
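The last point implies that a credential failure can be told apart from other failures. A minimal sketch of such a retry loop follows. `CredentialExpired`, `run_with_retry`, and the callables passed to it are hypothetical names for illustration, not TaskVine APIs, and how a real task would report an expired credential is left open.

```python
class CredentialExpired(Exception):
    """Hypothetical marker for a task that failed because its credential expired."""


def run_with_retry(run_task, read_credential, max_attempts=3):
    """Re-read the credential before every attempt; retry only on expiry."""
    for attempt in range(1, max_attempts + 1):
        credential = read_credential()  # always pick up the latest version
        try:
            return run_task(credential), attempt
        except CredentialExpired:
            continue  # retryable: the next attempt re-reads the credential
    raise RuntimeError(f"credential still expired after {max_attempts} attempts")


# Stand-ins: the first read returns a stale token, later reads a renewed one.
tokens = iter(["stale-token", "fresh-token", "fresh-token"])


def fake_task(credential):
    if credential == "stale-token":
        raise CredentialExpired(credential)
    return "ok:" + credential


result, attempts = run_with_retry(fake_task, lambda: next(tokens))
```

Any other exception propagates unchanged, so only the expiry case is retried.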

Questions

  • What is the actual lifetime of the credential?
  • How frequently is the credential renewed?
  • Is the credential needed throughout a running task, or only at startup/connection?
  • How big are credentials? (less than 1K)

Global Issues

  • Throughout, TaskVine assumes that files are immutable and have unique cachenames, and those files can be shared by concurrent tasks at a given worker. So, any update to the content of a file must be done by delivering a new cachename.
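To illustrate why an in-place update conflicts with this model, here is a small sketch in which the cachename is derived from the file content. The helper `cachename_for` and the hashing scheme are assumptions made for the example; they are not TaskVine's actual naming scheme.

```python
import hashlib
import os
import tempfile


def cachename_for(path):
    """Hypothetical helper: derive a cache name from the file's content."""
    with open(path, "rb") as f:
        return "file-" + hashlib.sha256(f.read()).hexdigest()


with tempfile.TemporaryDirectory() as d:
    token = os.path.join(d, "token")  # stand-in for the credential file

    with open(token, "w") as f:
        f.write("credential-v1")
    name_v1 = cachename_for(token)

    # The surrounding environment renews the credential in place.
    with open(token, "w") as f:
        f.write("credential-v2")
    name_v2 = cachename_for(token)

# Same path, different content, therefore a different cachename.
names_differ = name_v1 != name_v2
```

A worker that cached the file under `name_v1` has no way to learn that the path now corresponds to `name_v2` unless the manager delivers the new name.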

Possible Solution Concepts:
0 - Credential files should simply not be cached; send them to every task. Maybe have a flag that says "don't complain on changes". But our current interpretation of "no cache" also means "no sharing", so we would need a mode in which every use of the credential file gets a new cachename. This gets complicated fast.
1 - Allow the contents of a (local) file to change, which results in a new cachename and a new binding to replicas. So we need to look for that change and update replicas as needed when it happens.
2 - Define a "credential" as some bits attached to a task: it is transmitted with the task, deposited in a file, and perhaps has an environment variable associated with it. It is not really treated as a file. Easy to implement; the only downside is that it adds some overhead for tasks that may be small.
3 - Define a "credential" as a sort of factory for a file object that is resolved at dispatch time. The credential object causes us to look at the filesystem and then construct the appropriate file object. Downside: corner cases arise from adding a file object that the user didn't ask for.
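Concept 3 could look roughly like the sketch below. `CredentialSource`, `resolve`, and `dispatch` are hypothetical names invented for this illustration and do not exist in TaskVine; the point is only that the file is read and named at dispatch time rather than at declaration time.

```python
import hashlib
import os
import tempfile


class CredentialSource:
    """Hypothetical stand-in for a 'credential' declared once by the user."""

    def __init__(self, path):
        self.path = path

    def resolve(self):
        """Read the file now and return (cachename, content) for this dispatch."""
        with open(self.path, "rb") as f:
            data = f.read()
        return "cred-" + hashlib.sha256(data).hexdigest()[:16], data


def dispatch(task_name, credential):
    """Hypothetical dispatch step: resolve the credential at the last moment."""
    cachename, data = credential.resolve()
    return {"task": task_name, "cachename": cachename, "payload": data}


with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "token")
    with open(path, "wb") as f:
        f.write(b"credential-v1")
    cred = CredentialSource(path)
    first = dispatch("task-1", cred)

    with open(path, "wb") as f:
        f.write(b"credential-v2")  # renewal by the surrounding environment
    second = dispatch("task-2", cred)
```

Each dispatch carries whatever the file held at that moment, under a cachename that matches it, so tasks sharing a worker can still share a given version.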

@dthain
Member Author

dthain commented Apr 23, 2024

Strawman solution:

At the point where a file is about to be sent to the worker, check whether it has changed:
https://github.com/cooperative-computing-lab/cctools/blob/d76a90034ddd1071d172b8c1b4a78e5173e7bf56/taskvine/src/manager/vine_manager_put.c#L425C1-L426C1

If it has, then emit a warning, recompute the cachename and metadata, and then continue as normal.

The result is that the new file will be sent, the old one will remain, and the only big downside is that stale replicas of the file will not have any reason to be deleted.
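A rough sketch of that check, written in Python rather than in the C of the manager: `FileRecord`, `make_record`, and `refresh_before_send` are hypothetical names, and comparing size and modification time is an assumed way to detect a change, not necessarily what the linked code does.

```python
import hashlib
import os
import tempfile
import warnings
from dataclasses import dataclass


def content_cachename(path):
    """Hypothetical naming scheme: hash of the file content."""
    with open(path, "rb") as f:
        return "file-" + hashlib.sha256(f.read()).hexdigest()


@dataclass
class FileRecord:
    path: str
    size: int
    mtime_ns: int
    cachename: str


def make_record(path):
    st = os.stat(path)
    return FileRecord(path, st.st_size, st.st_mtime_ns, content_cachename(path))


def refresh_before_send(rec):
    """Just before sending: if the file changed, warn and recompute its metadata."""
    st = os.stat(rec.path)
    if (st.st_size, st.st_mtime_ns) == (rec.size, rec.mtime_ns):
        return False
    warnings.warn(f"{rec.path} changed since it was declared; assigning a new cachename")
    rec.size, rec.mtime_ns = st.st_size, st.st_mtime_ns
    rec.cachename = content_cachename(rec.path)
    return True


with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "token")
    with open(path, "w") as f:
        f.write("v1")
    rec = make_record(path)
    old_name = rec.cachename

    with open(path, "w") as f:
        f.write("renewed-v2")  # different size, so the change is detectable
    changed = refresh_before_send(rec)
    changed_again = refresh_before_send(rec)
    new_name = rec.cachename
```

Nothing in this sketch removes the replica stored under `old_name`, which is the stale-replica downside noted above.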

@dthain
Member Author

dthain commented Apr 23, 2024

Only allow this if the user gives an explicit flag, VINE_MAYCHANGE?

@btovar btovar self-assigned this Apr 23, 2024
@btovar
Member

btovar commented Apr 24, 2024

Something that would work for the present case: if the file has the VINE_MAYCHANGE flag, we can copy it into the sandbox rather than link it. That way it can be sent again to the workers without worrying about running tasks. The two credentials are less than 1.5K together.
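The difference between linking and copying can be seen in a small stand-alone sketch. It assumes a POSIX filesystem and uses a symbolic link to represent the linked sandbox entry; the file names are made up, and the kind of link TaskVine actually uses is not specified in this thread.

```python
import os
import shutil
import tempfile

with tempfile.TemporaryDirectory() as d:
    cached = os.path.join(d, "cache-token")           # file in the worker cache
    linked = os.path.join(d, "sandbox-linked-token")  # sandbox entry, linked
    copied = os.path.join(d, "sandbox-copied-token")  # sandbox entry, copied

    with open(cached, "w") as f:
        f.write("credential-v1")

    os.symlink(cached, linked)       # points back into the cache
    shutil.copyfile(cached, copied)  # private bytes owned by the task

    # A renewed credential arrives and overwrites the cached file.
    with open(cached, "w") as f:
        f.write("credential-v2")

    with open(linked) as f:
        seen_via_link = f.read()
    with open(copied) as f:
        seen_via_copy = f.read()
```

The linked entry follows the cache and sees the new content, while the copied entry keeps the version the task started with, so the cache can be refreshed without touching running tasks.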
