Added multithreading support for mkcomposefs #269
73f7dc3
to
c79e986
Compare
Thank you so much for working on this! This may take some time to review, as multithreading and C are tricky.
Can you please run clang-format? Basically, let's get past the superficial bits before we dig into the threading.
Marking requested-changes to start just for formatting/style
30dc189
to
f2db67a
Compare
Thank you for checking. I have run clang-format on my changes via the VS Code formatter. Is that what you meant?
Sorry for the inconvenience; one header file was missed during formatting. I have corrected that as well. Hopefully this resolves all the formatting issues in my changeset.
I'll have a look at the code, but can you please rebase this and squash the fixups so the end result is a minimal set of independent changes? Say, one for the library changes and one for the mkcomposefs use of it.
So, lots of comments. Make sure you read them all first, because some may make others unnecessary. In particular, the proposed change to lcfs_node_set_from_content() will make most of the library API changes unnecessary.
d833f84
to
a9ed30c
Compare
Thank you for the comments. I have addressed them and also squashed the commits down to only 3 separate ones, as you suggested. As I don't know who should mark the comments as resolved, I just added a reply to each.
Thank you for reviewing it. I have addressed those and pushed the changes.
Some more minor feedback, but I would like @cgwalters or @giuseppe to also review this.
Also, if you're able to talk about it, I'd love to hear what you're using composefs for. It's good to know what your users are doing.
c419ec2
to
9cdc556
Compare
Can you give some details, @r0l1? He is the right person to comment on this.
Left a comment. Could you please sign the commits with your real name?
My real name is Divin and I have signed off the commits with it. Did you mean including the initials (Divin OA) or even the initials expanded (Divin Ookken Athappan)?
Yes, please use the expanded version.
Yes, separate PRs for things like this that are logically separate please.
… Thank you for the links. Reading through https://lwn.net/Articles/846403/, I think it is safer to roll back `copy_file_range` to the older `read/write`, or perhaps address this in a separate PR, as it has no direct dependency on the original multi-threading change. What do you say? I am very new to Linux and I was looking for better APIs to copy a file without a user-mode buffer.
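For reference, the older `read/write` approach mentioned above could be sketched roughly as follows. This is only an illustration, not the code from this PR: the helper name `copy_fd_read_write` and the 64 KiB buffer size are assumptions chosen for the example.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper (not part of composefs): copy everything from fd_in
 * to fd_out through a user-space buffer. Returns 0 on success, -1 on error
 * with errno set by the failing read() or write(). */
static int copy_fd_read_write(int fd_in, int fd_out)
{
	char buf[64 * 1024]; /* buffer size is an arbitrary choice */

	for (;;) {
		ssize_t n = read(fd_in, buf, sizeof(buf));
		if (n == 0)
			return 0; /* end of input */
		if (n < 0) {
			if (errno == EINTR)
				continue; /* interrupted, retry the read */
			return -1;
		}

		/* write() may be partial, so loop until the chunk is flushed */
		char *p = buf;
		while (n > 0) {
			ssize_t w = write(fd_out, p, (size_t)n);
			if (w < 0) {
				if (errno == EINTR)
					continue; /* interrupted, retry the write */
				return -1;
			}
			p += w;
			n -= w;
		}
	}
}
```

The trade-off is the one raised in the comment: every byte passes through a user-space buffer, but the loop relies only on plain `read()` and `write()`, so it behaves the same way regardless of which filesystems the two descriptors live on.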
I will create a separate PR for it. Can I also squash the current commits in this PR down to just 2? One for the library and the other for mkcomposefs. It was done before, but then another set of 4 was added as part of the review comments.
Yes, I think one or two commits here is best.
I'm assuming your goal is faster … The downside is of course that if something happens to mutate a file in the cache, you get incorrect checksums. But this is pretty easily mitigated, and actually a great thing with composefs is that such a failure should be detected at runtime. (These are complementary approaches, to be clear; threading also makes sense.)
…mposefs Signed-off-by: Divin Ookken Athappan <[email protected]>
Signed-off-by: Divin Ookken Athappan <[email protected]>
9cdc556
to
b49499b
Compare
Squashed the commits to just 2 (one for the library and one for mkcomposefs) and signed them off with my full name. Added an issue #274 for
@alexlarsson we are using composefs for our custom OS and as a Docker container management layer. We have a small Go initramfs that includes composefs, and we created a simple A/B OS which powers our devices at Wahtari and nLine. The devices are interlinked with a simple mesh-like, end-to-end encrypted network, and this enables us to push updates (composefs store diff). I have some ideas on how to improve the composefs store management and would like to discuss them soon. I really appreciate your work and have followed your ideas since Glick2. @cgwalters yes, this PR is for an internal speedup of the OS build process. Thanks for pointing out the cache idea.
@r0l1 That sounds cool! Nice that it works for you.
Thank you for these alternative approaches. My objective was to make mkcomposefs run fast on any machine as part of our build process. It was simple and straightforward for me to add threads in mkcomposefs, which does not require any facilitation from the end user. I am new to Linux tooling and would need some guidance here to fully understand these alternative approaches.
This only works, though, if your image compose produces hardlinked copies of files you previously committed (otherwise the device/inode of the new files will not match anything old in the cache). This happens regularly in ostree because of how it is set up, but that isn't necessarily always true, and it causes some extra pain to enforce when you build using ostree (with things like rofiles-fuse, etc.). Not saying it is bad, but it's not always applicable. But anyway, if you do use a setup like that, where you end up with hardlinked files, then the best way to store and quickly access the cached digest is probably to just enable fs-verity on these files. In other words, we should probably have lcfs_node_set_from_content() start by asking the kernel for the fs-verity digest in case it is set, rather than computing it ourselves.
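As a rough sketch of what "asking the kernel for the fs-verity digest" could look like: the Linux `FS_IOC_MEASURE_VERITY` ioctl from `<linux/fsverity.h>` returns the digest of a file that already has fs-verity enabled. The helper name `query_fsverity_digest` and the 64-byte size cap are assumptions for this example; this is not the actual `lcfs_node_set_from_content()` code.

```c
#include <errno.h>
#include <linux/fsverity.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/ioctl.h>

#define EXAMPLE_MAX_DIGEST_SIZE 64 /* assumed cap; enough for SHA-512 */

/* Hypothetical helper (not part of composefs): ask the kernel for the
 * fs-verity digest of fd. Returns 0 and fills out/out_len on success,
 * or -1 with errno set when no digest is available. */
static int query_fsverity_digest(int fd, uint8_t *out, size_t out_cap,
				 size_t *out_len)
{
	struct fsverity_digest *d =
		calloc(1, sizeof(*d) + EXAMPLE_MAX_DIGEST_SIZE);
	if (d == NULL)
		return -1;

	/* Tell the kernel how much room follows the header */
	d->digest_size = EXAMPLE_MAX_DIGEST_SIZE;

	if (ioctl(fd, FS_IOC_MEASURE_VERITY, d) != 0) {
		int saved = errno; /* e.g. ENODATA: fs-verity not enabled */
		free(d);
		errno = saved;
		return -1;
	}

	if (d->digest_size > out_cap) {
		free(d);
		errno = EOVERFLOW;
		return -1;
	}

	/* The kernel rewrote digest_size to the actual digest length */
	memcpy(out, d->digest, d->digest_size);
	*out_len = d->digest_size;
	free(d);
	return 0;
}
```

A caller would presumably treat a -1 return as "no cached digest" and fall back to computing the digest from the file contents itself, so files without fs-verity would keep working as before.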
Yes, agreed, this is an obvious step.
I think this looks good enough now. I looked through it again and did some local tests; it seems to work well, and the public library APIs are sane.
This resolves #249 and results in a 10x improvement for digest calculation and file copy.
Please refer to the document below for details about the issue and the change. I would like to know your thoughts.
threading.md