When using advanced storage strategy, why copy the peer data file into the same dir as the output file? #1683

embroede · 2022-09-16T18:09:46Z

Per the documentation, when using the io.d7y.storage.v2.advance storage strategy, the peer data file is copied into the same directory as the output file.

After running dfget <url> -O /tmp/eddie_test, I have observed that there are actually 3 hard links to the file. They are:

/tmp/eddie_test
/tmp/.eddie_test.dfget.cache.<req.PeerID>
<dataDir>/<req.TaskID>/<req.PeerID>/data

Why not just copy the file to the dataDir, and link from there?


gaius-qi · 2022-09-19T04:21:52Z

To avoid copying the daemon cache across filesystems to the specified directory.

jim3ma · 2022-09-26T01:10:13Z

A hard link is faster than copying the file. The io.d7y.storage.v2.simple storage strategy will copy the file.

embroede · 2022-09-26T21:13:56Z

Yep I definitely like the hard link approach. But I don't see why links need to exist in <dataDir> and in the output path.

If the <dataDir> is on a different filesystem, I believe we could use a symlink (and I see there is code already to do this).

So just download straight to <dataDir>, and then either hard link or symlink to the output path?

jim3ma · 2022-09-27T14:40:30Z

The io.d7y.storage.v2.simple strategy will create a symlink if the files are on different filesystems.

embroede · 2022-09-27T15:00:42Z

It appears that in https://github.com/dragonflyoss/Dragonfly2/blob/main/client/daemon/storage/storage_manager.go#L454 the symlink is done as a fallback if the hard link fails, when using io.d7y.storage.v2.advance.

embroede · 2022-10-31T19:40:11Z

I just updated my comment above, as my <data_dir> wasn't in backticks, so was being hidden.

embroede · 2023-03-23T04:43:58Z

@jim3ma @gaius-qi To clarify, what I'd like to know is: Why is it not sufficient to download the file to the dataDir, and then link (either hardlink or symlink) to it?

Why do we need /tmp/.eddie_test.dfget.cache.<req.PeerID>?
