More complete HDFS / WASB style path structure #153

Closed
isaacabraham opened this issue Nov 21, 2015 · 4 comments
Comments

@isaacabraham
Contributor

We need to be able to specify the storage account as part of the path, e.g.

"customerAccount@container/folder/folder/file.txt"
|> CloudFlow.ofFileByLine

etc. etc.

I've raised this as its own issue as it's probably an enabler for a number of scenarios.

@eiriktsarpalis
Member

MBrace.Core and, by extension, MBrace.Flow do not themselves perform any parsing of paths. That job is delegated to the ICloudFileStore abstraction that the current runtime happens to be using.

So I think this really is an MBrace.Azure issue: we should consider whether BlobStore, the concrete implementation of ICloudFileStore, should support multiple storage accounts and recognise WASB-style paths.
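As a rough illustration of what recognising such paths might involve, the sketch below parses the `account@container/path` form from the opening comment. The `BlobPath` record and `tryParseBlobPath` function are hypothetical names, not part of MBrace.Core or MBrace.Azure. Note that actual WASB URIs (`wasb://container@account.blob.core.windows.net/path`) order their parts differently, so this follows the form proposed in this issue rather than the WASB scheme itself.

```fsharp
/// Hypothetical type for illustration only; not an MBrace API.
type BlobPath =
    { Account : string option   // None means "use the cluster's default account"
      Container : string
      Blob : string }           // empty when the path names only a container

/// Parses "account@container/folder/file.txt" or "container/folder/file.txt".
/// Returns None for empty input or when the account or container is empty.
let tryParseBlobPath (path : string) : BlobPath option =
    if System.String.IsNullOrWhiteSpace path then None
    else
        // Split off an optional "account@" prefix.
        let account, rest =
            match path.IndexOf('@') with
            | -1 -> None, path
            | i -> Some (path.Substring(0, i)), path.Substring(i + 1)
        // The first segment is the container; the remainder is the blob name.
        let container, blob =
            match rest.IndexOf('/') with
            | -1 -> rest, ""
            | i -> rest.Substring(0, i), rest.Substring(i + 1)
        match account, container with
        | (Some "", _) | (_, "") -> None
        | _ -> Some { Account = account; Container = container; Blob = blob }
```

A path without an `@` falls back to `Account = None`, which a store implementation could interpret as the cluster's primary storage account, so existing paths would keep their current meaning.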

If we decide to go for this approach, there are a few ramifications that might be worth considering:

  1. How will the cluster handle key management? By design, the current implementation never encapsulates connection strings in serialized storage objects; instead, connection strings are expected to be specified in the configuration of each node. This avoids inadvertently leaking connection strings into exported serializations of object graphs, which can happen very easily. Should the user decide to introduce a new connection string from the client side, how will that key be distributed across the cluster without risking such leaks?
  2. Issues of cluster identity: at the moment every MBrace cluster is uniquely identified by the pair of storage and service bus accounts that it uses. How could we design frictionless introduction of secondary keys without potentially blurring this identity? And how can we be sure that those secondary keys are recoverable in cases where all worker instances have died?
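To make the first point concrete, here is a minimal sketch of the separation described above: serialized objects carry only an account name, and each node resolves that name against its own configuration. `AccountRef` and `AccountRegistry` are hypothetical names invented for this example, not MBrace types, and the sketch deliberately leaves open how a new connection string would reach every node.

```fsharp
/// Hypothetical: the only account information that travels inside serialized
/// object graphs. It names an account but never holds the connection string.
type AccountRef = AccountRef of string

/// Hypothetical node-local registry, populated from each node's own
/// configuration and never serialized or sent across the cluster.
type AccountRegistry (connectionStrings : Map<string, string>) =
    /// Looks up the connection string for a named account on this node.
    member this.TryResolve (AccountRef name) : string option =
        Map.tryFind name connectionStrings

// Usage: the placeholder stands in for a value read from node configuration.
let registry =
    AccountRegistry (Map.ofList [ "customerAccount", "<connection string from node config>" ])

match registry.TryResolve (AccountRef "customerAccount") with
| Some _ -> printfn "resolved on this node"
| None -> printfn "account not configured on this node"
```

Under this arrangement an exported object graph can reveal at most an account name. The open questions in the list remain: how a client-supplied key gets into each node's registry, and how it is recovered once all workers have died.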

@eiriktsarpalis
Member

There are quite a few ways we could address these concerns. One would be to maintain an "accounts" table in the master storage account containing all secondary connection strings. I do feel, though, that this may violate the security expectations users may have.

@eiriktsarpalis
Member

Another would be to use the service bus to broadcast additional auth data to workers.

@dsyme
Contributor

dsyme commented Jun 8, 2017

See mbraceproject/MBrace.Azure#161, which I think covers this enough for these purposes (an MBrace.Core PR may follow out of that).

@dsyme closed this as completed Jun 8, 2017