Data cloud #2

Open
emphasis87 opened this issue Mar 16, 2017 · 0 comments
emphasis87 commented Mar 16, 2017

Decide which techniques to use for transferring the data that jobs require between clients. BitTorrent seems promising, but I am not sure how fast initialization is, or how costly peer discovery and communication overhead are. Also check how Hadoop does it. How saturated should the mesh be? There should be some mechanism for labeling data by rarity, reproducibility, and cost; this could perhaps be derived, to some extent, from the requirements of the jobs that are run. How should data be addressed and catalogued? Also, NSQ uses snappy or deflate for message compression, but zstd may perform better; alternatively, the algorithm could be chosen based on how frequently the data is used.
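
To make the addressing and labeling questions more concrete, here is a rough Python sketch of one possible direction. Nothing in it is decided: `CatalogEntry`, `make_entry`, `replication_priority`, and `pack` are hypothetical names, the priority formula is an untuned assumption, and deflate (via the standard `zlib` module) only stands in for whichever codec ends up being chosen. It addresses data by its SHA-256 digest, attaches rarity, reproducibility, and cost labels, and picks a compression level from usage frequency.

```python
import hashlib
import zlib
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogEntry:
    """Hypothetical catalogue record; field names are illustrative only."""
    address: str        # SHA-256 hex digest of the raw bytes (content address)
    size: int           # uncompressed size in bytes
    replicas: int       # number of clients known to hold a copy (rarity)
    reproducible: bool  # True if a job could regenerate the data
    cost: float         # assumed cost to regenerate or refetch, arbitrary units


def make_entry(data: bytes, replicas: int, reproducible: bool, cost: float) -> CatalogEntry:
    # Identical bytes always map to the same address, wherever they are stored.
    address = hashlib.sha256(data).hexdigest()
    return CatalogEntry(address, len(data), replicas, reproducible, cost)


def replication_priority(entry: CatalogEntry) -> float:
    """Higher means 'replicate sooner'. The weights are assumptions, not a tuned policy."""
    rarity = 1.0 / max(entry.replicas, 1)
    penalty = 1.0 if entry.reproducible else 10.0
    return rarity * penalty * entry.cost


def pack(data: bytes, hot: bool) -> bytes:
    # Frequently used data gets a fast deflate level, rarely used data a dense one.
    return zlib.compress(data, 1 if hot else 9)


blob = b"job input " * 1000
rare = make_entry(blob, replicas=1, reproducible=False, cost=2.0)
common = make_entry(blob, replicas=8, reproducible=True, cost=2.0)
print(rare.address == common.address)
print(replication_priority(rare) > replication_priority(common))
```

With content addressing, the catalogue key doubles as an integrity check after transfer, which is also how BitTorrent verifies pieces. Whether the labels should live in the catalogue or be computed from job requirements is still an open question.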
