nix-prefetch-url doesn't use the fetcher cache #10881

Open
fricklerhandwerk opened this issue Jun 10, 2024 · 13 comments
Labels
bug
cli: The old and/or new command line interface
fetching: Networking with the outside (non-Nix) world, input locking

Comments

@fricklerhandwerk
Contributor

Describe the bug

nix-prefetch-url re-downloads the same remote object every time, because it doesn't use the fetcher cache.

Steps To Reproduce

nix-prefetch-url https://github.com/NixOS/nix/archive/refs/heads/master.zip
nix-prefetch-url https://github.com/NixOS/nix/archive/refs/heads/master.zip

Expected behavior

The second invocation should not require a download, but it currently does.

Additional context

This is an issue for tools like npins: andir/npins#57

Priorities

Add 👍 to issues you find important.

@fricklerhandwerk added the bug, cli, and fetching labels Jun 10, 2024
@nixos-discourse

This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/lock-files-without-flakes-what-is-the-status/58297/7

@roberth roberth added this to Nix team Jan 5, 2025
@github-project-automation github-project-automation bot moved this to To triage in Nix team Jan 5, 2025
@roberth
Member

roberth commented Jan 5, 2025

nix-prefetch-url does add your example URL to the store, unlike what is said in andir/npins#57.

What if master changed between the two calls? I suppose it could use the tarball-ttl setting to cache it, but I'd prefer for it to use the etag header, or perhaps last-modified. (GitHub seems to provide etag in this case)

@eclairevoyant
Contributor

eclairevoyant commented Jan 5, 2025

What if master changed between the two calls?

It didn't. And even if it did, the whole point of npins is to, well. Pin. So that shouldn't matter in that tool's scenario.

@roberth
Member

roberth commented Jan 5, 2025

nix-prefetch-url https://github.com/NixOS/nix/archive/refs/heads/master.zip
nix-prefetch-url https://github.com/NixOS/nix/archive/refs/heads/master.zip

These aren't URLs pointing to immutable data though, so doesn't that mean that github.com should be queried again, and certainly so after tarball-ttl expired?

Suppose you lock your pins, and then you realize you forgot to push something, you push it, and then lock your pins again to pick up the update. For this scenario to work, you'd need nix-prefetch-url to perform at least a HEAD request to the URL.

I'm not arguing against using the fetcher cache; just against relying on tarball-ttl, because that would fail to update your pin in that scenario.
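The etag-based revalidation suggested above could look roughly like the following. This is a minimal sketch that assumes curl is available; `etag_from_headers`, `remote_etag`, and `needs_refetch` are hypothetical helper names and are not part of Nix or npins.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for illustration; not part of Nix or npins.

# Extract the last ETag value from HTTP response headers on stdin.
# When redirects are followed, several header blocks appear; the last one wins.
etag_from_headers() {
  tr -d '\r' | awk 'tolower($1) == "etag:" { v = $2 } END { print v }'
}

# Ask the server for headers only (a HEAD request), following redirects.
remote_etag() {
  curl --silent --head --location "$1" | etag_from_headers
}

# Succeed (exit 0) when the cached copy has to be downloaded again:
# either etag is missing, or the two differ.
needs_refetch() {
  local cached="$1" current="$2"
  [ -z "$cached" ] || [ -z "$current" ] || [ "$cached" != "$current" ]
}
```

A caller would store the etag next to the cached download and run `needs_refetch "$stored" "$(remote_etag "$url")"` before deciding to download again. A missing etag is treated as "refetch", which is the conservative choice for mutable URLs such as a branch archive.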

@eclairevoyant
Contributor

eclairevoyant commented Jan 6, 2025

To be fair, were I to file a report, I'd have written it a bit differently. tarball-ttl aside, ultimately the url in the npins lockfile points at a fixed URL.

For example, I mentioned a pin of google/fonts, which in my case points at https://github.com/google/fonts/archive/cc8f3b4d3645c04030dfdccafa6d8514a0b8ba2c.tar.gz. The full pin looks like

    "google-fonts": {
      "type": "Git",
      "repository": {
        "type": "GitHub",
        "owner": "google",
        "repo": "fonts"
      },
      "branch": "main",
      "revision": "cc8f3b4d3645c04030dfdccafa6d8514a0b8ba2c",
      "url": "https://github.com/google/fonts/archive/cc8f3b4d3645c04030dfdccafa6d8514a0b8ba2c.tar.gz",
      "hash": "0bamh7hj4vplxwqw5ph0vff4ylc5dfm233f6alwk8w59sqikp0ag",
      "frozen": true
    },

When pinning, npins looks for the tarball (assuming I'm not using submodules, which I'm not):
https://github.com/andir/npins/blob/4dbde2a195aa86a8833a81ec5395ebfc7b7428e6/src/git.rs#L146-L155

It then, as far as I see, makes a call to nix-prefetch-url, of the form nix-prefetch-url --unpack --type sha256 <url>:

https://github.com/andir/npins/blob/4dbde2a195aa86a8833a81ec5395ebfc7b7428e6/src/nix.rs#L9-L18

Finally it generates npins/default.nix containing an equivalent fetcher call, using fetchTarball:

https://github.com/andir/npins/blob/4dbde2a195aa86a8833a81ec5395ebfc7b7428e6/src/default.nix#L69-L75

    # At the moment, either it is a plain git repository (which has an url), or it is a GitHub/GitLab repository
    # In the latter case, there we will always be an url to the tarball
    if url != null && !submodules then
      builtins.fetchTarball {
        inherit url;
        sha256 = hash; # FIXME: check nix version & use SRI hashes
      }

Naturally this URL should not change between the time that npins update completes and me building a config that depends on that. Therefore, nix-prefetch-url should ultimately cache the source.

If nix is in fact caching, as you claim, then it seems odd to me that both nix-prefetch-url and fetchTarball should result in the same NAR hash, yet the tarball is fetched again when fetchTarball is called after nix-prefetch-url with no changes to the source in between. Although I really don't know how builtin fetchers work (I'm only familiar with the nixpkgs fetchers), I assumed that, like FODs, any existing store object would not get re-fetched. (And I also assumed, based on "prefetch" in the name, that it would be doing more or less that.)

@fricklerhandwerk
Contributor Author

fricklerhandwerk commented Jan 6, 2025

nix-prefetch-url merely takes a different code path that downloads contents directly, while the builtin fetchers first check the fetcher cache.

Yes, in the bug report I use a somewhat arbitrary URL for an example, but as @eclairevoyant points out it doesn't make a difference for the underlying problem.

@roberth
Member

roberth commented Jan 6, 2025

nix-prefetch-url + fetchTarball is locking and then fetching, whereas
nix-prefetch-url + nix-prefetch-url is locking and then locking again. (hence the question)

Judging by @fricklerhandwerk's response, we're not actually off-topic when discussing the former.
It seems that the problem is with fetching, not locking, and that makes the fetcher cache largely irrelevant.

@eclairevoyant do the store path names match? Are the contents of the store path created by nix-prefetch-url and fetchTarball identical?

@eclairevoyant
Contributor

eclairevoyant commented Jan 6, 2025

$ nix-prefetch-url --unpack --type sha256 'https://github.com/google/fonts/archive/cc8f3b4d3645c04030dfdccafa6d8514a0b8ba2c.tar.gz'
path is '/nix/store/myc8qavfqzgip57vzd65zdfgflk0iykp-cc8f3b4d3645c04030dfdccafa6d8514a0b8ba2c.tar.gz'
0bamh7hj4vplxwqw5ph0vff4ylc5dfm233f6alwk8w59sqikp0ag

$ nix-instantiate --eval --expr 'builtins.fetchTarball { url = "https://github.com/google/fonts/archive/cc8f3b4d3645c04030dfdccafa6d8514a0b8ba2c.tar.gz"; sha256 = "0bamh7hj4vplxwqw5ph0vff4ylc5dfm233f6alwk8w59sqikp0ag"; }'
"/nix/store/d4r7alk6d06csxkdxja5z9js76xqc98v-source"

So (by default) they are different paths despite resulting in the same hash, hence the cache miss... though they are identical - diffoscope /nix/store/myc8qavfqzgip57vzd65zdfgflk0iykp-cc8f3b4d3645c04030dfdccafa6d8514a0b8ba2c.tar.gz /nix/store/d4r7alk6d06csxkdxja5z9js76xqc98v-source results in no output.

I guess the simple fix here would be on the npins side to explicitly set name.

EDIT: Though, it's a little surprising to me that they used different formats?
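Setting the name explicitly could look like the sketch below. It relies on the `--name` option of nix-prefetch-url and on "source" being the name that shows up in the fetchTarball store path above. `prefetch_as_source` and the `NIX_PREFETCH_URL` override are hypothetical names introduced only for this example, and the matching store path is an expectation that has not been verified here.

```shell
#!/usr/bin/env bash
# Hypothetical helper; not part of Nix or npins.
# Prefetch and unpack a tarball under the store name "source", so that the
# name matches the one seen in the fetchTarball store path.
# NIX_PREFETCH_URL is an assumed override hook, e.g. for dry runs.
prefetch_as_source() {
  "${NIX_PREFETCH_URL:-nix-prefetch-url}" --unpack --name source --type sha256 "$1"
}
```

Calling `prefetch_as_source <url>` with the google/fonts URL should print the same hash as before. Since the name and the hash would then both agree with what fetchTarball uses, the store path is expected to match as well.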

@eclairevoyant
Contributor

According to the nix reference manual:

Since fetchurl uses the same name for the downloaded file as nix-prefetch-url, the redundant download can be avoided.

So I tested without unpacking, and they do in fact match:

$ nix-prefetch-url --type sha256 'https://github.com/google/fonts/archive/cc8f3b4d3645c04030dfdccafa6d8514a0b8ba2c.tar.gz' 
path is '/nix/store/bfgypqhy1n64rrz66wgjf38y4kic663y-cc8f3b4d3645c04030dfdccafa6d8514a0b8ba2c.tar.gz'
1z9z4rplivhdk09dv82h8791f8drlxlszlg7j2kbrc3vdk9cvx0b

$ nix-instantiate --eval --expr 'builtins.fetchurl { url = "https://github.com/google/fonts/archive/cc8f3b4d3645c04030dfdccafa6d8514a0b8ba2c.tar.gz"; sha256 = "1z9z4rplivhdk09dv82h8791f8drlxlszlg7j2kbrc3vdk9cvx0b"; }'
"/nix/store/bfgypqhy1n64rrz66wgjf38y4kic663y-cc8f3b4d3645c04030dfdccafa6d8514a0b8ba2c.tar.gz"

I can't think of a good reason for fetchTarball (and possibly other fetchers) to break this pattern.

@roberth
Member

roberth commented Jan 6, 2025

So the mismatched store path name is the problem here.
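The mismatch is easy to see by stripping the hash prefix from the two store paths reported earlier in the thread. A small sketch, where `store_path_name` is a hypothetical helper and not a Nix command:

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the name part of a store path,
# i.e. everything after "<hash>-" in the final path component.
store_path_name() {
  local base="${1##*/}"        # drop the /nix/store/ directory prefix
  printf '%s\n' "${base#*-}"   # drop the hash up to the first "-"
}
```

For `/nix/store/myc8qavfqzgip57vzd65zdfgflk0iykp-cc8f3b4d3645c04030dfdccafa6d8514a0b8ba2c.tar.gz` this prints `cc8f3b4d3645c04030dfdccafa6d8514a0b8ba2c.tar.gz`, and for `/nix/store/d4r7alk6d06csxkdxja5z9js76xqc98v-source` it prints `source`.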

reason to break pattern

"source" has become a convention for at least the following reasons.

  • ensure that we get a consistent store path regardless of how the tree is fetched
    65b5f17

  • In that case, the name will be the name of the checkout on the current machine, which is not necessarily reproducible across different settings
    cleanSourceWith: don't use baseNameOf nixpkgs#83201

Unfortunately we can't get rid of the old behavior because of Nix's promise to evaluate reproducibly.
The best we could do to avoid this confusion, I think, is to require an explicit --auto-name or --name flag when run interactively, when --unpack is passed, or perhaps always.

@eclairevoyant
Contributor

eclairevoyant commented Jan 6, 2025

"source" has become a convention for at least the following reasons.

  • ensure that we get a consistent store path regardless of how the tree is fetched

Sure, I've seen that before, but somehow the name would have to change but have the output hash stay fixed. I've never seen this play out in practice, at least in nixpkgs. And if that was really a valuable proposition then fetchurl would use source too, and so would all the prefetchers.

Unfortunately we can't get rid of the old behavior because of Nix's promise to evaluate reproducibly.

Of course, I was only speaking in retrospect that the design seemed odd. I'm definitely not suggesting a fetchTarball2 here 😁

@Mic92
Member

Mic92 commented Jan 8, 2025

@fricklerhandwerk did the explanations of @roberth solve your issue?

@nixos-discourse

This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/2025-01-08-nix-team-meeting-minutes-207/58523/1
