What are your thoughts about integrating distributed stores into Nix? #113

Open
lucasew opened this issue Oct 2, 2024 · 5 comments
Labels
question Further information is requested

Comments

@lucasew

lucasew commented Oct 2, 2024

Question

AFAIK the best-known implementation is ipfs-nix from Obsidian Systems, which aims to integrate IPFS into Nix.

At least in the current state, I am a bit skeptical about adoption. A full bump of my system would pull tens of GBs of small artifacts, and if you have ever downloaded the closure of the LaTeX packages, you know that lots of small operations == slow. I tried to write an RFC about exposing IPFS-compatible hashes in a narinfo, so users could alternatively use IPFS to download from binary caches, but I basically abandoned the idea because I didn't think it would make much sense myself. The process would be slow, on top of the overhead of keeping everything in both the Nix store and the IPFS store. I would only use something like this if I didn't need to duplicate data.
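
To make the narinfo idea concrete, here is a minimal sketch of how a client might read such a file and list its download options. The `IpfsCid` field name, the sample values, and the cache URL are all hypothetical; the narinfo format has no such field today, and the sample is trimmed to a few of the real `Key: value` fields.

```python
# Sketch only: "IpfsCid" is a hypothetical field, not part of the narinfo format.
# The store path hash and CID below are placeholders, not real artifacts.
SAMPLE_NARINFO = """\
StorePath: /nix/store/00000000000000000000000000000000-example-1.0
URL: nar/example.nar.xz
Compression: xz
NarSize: 123456
IpfsCid: bafy-example-placeholder
"""


def parse_narinfo(text):
    """Parse the 'Key: value' lines of a narinfo file into a dict."""
    fields = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        key, _, value = line.partition(": ")
        fields[key] = value
    return fields


def download_sources(fields, cache_url):
    """List candidate sources: IPFS first if advertised, plain HTTP always last."""
    sources = []
    cid = fields.get("IpfsCid")  # hypothetical field, absent in real narinfos
    if cid:
        sources.append(("ipfs", cid))
    sources.append(("http", f"{cache_url.rstrip('/')}/{fields['URL']}"))
    return sources
```

Keeping the HTTP URL as the final entry means a client that ignores the extra field behaves exactly as before, which is the "alternatively" part of the idea. It does nothing about the duplication problem: the data would still live in both stores.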

What are your thoughts about these projects and initiatives?

lucasew added the question label Oct 2, 2024
@numinit

numinit commented Oct 3, 2024

Some friends and I got part of the way through an implementation of this for doing mesh binary caches this year at DEF CON. It's really interesting stuff (that's been on hold recently due to life happening, but stay tuned). Setting aside the head-of-line blocking involved in checking multiple stores in a mesh configuration, we decided that a distributed store would probably have to assist the existing unicast HTTP stores for this exact reason, but it could still alleviate some bandwidth issues.

@tomberek

tomberek commented Oct 6, 2024

The Store layer of Nix is a quite powerful concept and I'd love to see more exploration of additional variations. I suspect there are some niche use-cases and some that would be generally useful. Specifically about ipfs-nix, I think we need to see a "killer app" to really get it adopted.

@nyabinary
Contributor

I am quite interested in exploring the possibilities of integrating distributed stores like IPFS into Nix. While I understand some concerns regarding performance and redundancy, I believe IPFS could still play a valuable role in increasing decentralization and resilience within Nix's infrastructure. However, careful consideration would be needed to address the challenges raised, such as system overhead and implementation complexity.

@proofconstruction
Contributor

I think distributed stores are a great idea for several reasons, but here are my top two:

  • it could help take some of the burden off the S3 cache by moving less-used artifacts out to the community actually using them, and this would in turn allow us to reduce spending considerably
  • in an era of rapidly intensifying climate change, decentralization will be ever more important; distributing (parts of) the cache will make it possible for isolated communities (e.g. Nix users in Appalachia right now) to continue working even without connectivity to the wider internet.

@mschwaig
Member

mschwaig commented Oct 7, 2024

I think it depends on what issue we are trying to solve.

If we want to speed up downloads and reduce bandwidth costs, that makes me look at BitTorrent, which has been widely and successfully used for these purposes for a long time. So I have a strong feeling that on a technical level this should be very possible. We may not want to rely on it for small outputs or for determining cache hits, and we would have to analyze the security implications of some aspects of it. The gains could still be significant, especially when we think about local networks.
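
As a rough illustration of that split, the sketch below routes only large outputs to a peer-to-peer transport and leaves everything else on the HTTP cache. The 16 MiB cutoff, the `Output` type, and the function names are assumptions made for this example; the thread gives no numbers and no design.

```python
from dataclasses import dataclass

# Assumed cutoff for this sketch; the discussion does not specify one.
MIN_P2P_BYTES = 16 * 1024 * 1024


@dataclass
class Output:
    name: str
    nar_size: int  # size of the NAR in bytes


def choose_transport(output, peers_available, min_p2p_bytes=MIN_P2P_BYTES):
    """Pick 'p2p' only for large outputs when peers exist, otherwise 'http'."""
    if peers_available and output.nar_size >= min_p2p_bytes:
        return "p2p"
    return "http"


def plan_downloads(outputs, peers_available):
    """Map each output name to a transport.

    Cache-hit checks are not modelled here: in this sketch they would
    always go to the HTTP cache, never to peers.
    """
    return {o.name: choose_transport(o, peers_available) for o in outputs}
```

For example, `plan_downloads([Output("texlive-docs", 900_000_000), Output("hello", 50_000)], peers_available=True)` would send the first output over the peer-to-peer path and the second over HTTP. The security questions raised above are outside what this sketch covers.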

An actual distributed store, which we would also rely on for persistent storage, is more difficult, I think. IPFS may or may not be the right technical basis for this, based on how the protocols work and how it is implemented today. I can see how the current IPFS implementation, which leads us to store things twice, is a huge issue.

NixOS locked as resolved and limited conversation to collaborators Oct 7, 2024