-
Notifications
You must be signed in to change notification settings - Fork 35
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for decoding erofs into lcfs_node, and some tooling using it #185
Conversation
538c031
to
b15de57
Compare
I updated the composefs-info escaping to be a bit more strict. The dump format is very simple and readable, yet efficiently streamable, I wonder if that is a good format for streaming data to mkcomposefs. Its maybe not super extensible though... |
b15de57
to
512aff6
Compare
512aff6
to
8fbe163
Compare
Rebased on latest and fixed some minor issues. This is still not ready. |
8fbe163
to
1d2e384
Compare
TODO:
|
Signed-off-by: Alexander Larsson <[email protected]>
Signed-off-by: Alexander Larsson <[email protected]>
trusted.overlay.verity doesn't exist anymore, its just trusted.overlay.metacopy. Signed-off-by: Alexander Larsson <[email protected]>
Signed-off-by: Alexander Larsson <[email protected]>
Signed-off-by: Alexander Larsson <[email protected]>
Right now these are only used from cfs-fuse.c, but they will be reused more later. Signed-off-by: Alexander Larsson <[email protected]>
Signed-off-by: Alexander Larsson <[email protected]>
Signed-off-by: Alexander Larsson <[email protected]>
This allows a whiteout in the composefs mount be used as a whiteout both with a regular and an userxattr overlayfs mount. Signed-off-by: Alexander Larsson <[email protected]>
We want to be able to read back everything that we can set. Signed-off-by: Alexander Larsson <[email protected]>
This is useful to figure out the content of a composefs iamge. Signed-off-by: Alexander Larsson <[email protected]>
This lets you list the content of an image, and dump the objects it references. Signed-off-by: Alexander Larsson <[email protected]>
For all the images we generate from json, try also to regenerated them and ensure we get the same bits. To do the dumping we add the internal composefs-dump tool. Signed-off-by: Alexander Larsson <[email protected]>
This lets you test lcfs_node_set_content() Signed-off-by: Alexander Larsson <[email protected]>
Signed-off-by: Alexander Larsson <[email protected]>
This tesets various special things like device nodes, escaped xattrs whiteouts and inlined files. Signed-off-by: Alexander Larsson <[email protected]>
This may catch any weird stuff we might generate. Signed-off-by: Alexander Larsson <[email protected]>
Now uses more of the new helpers and code. Also supports inlined data. Signed-off-by: Alexander Larsson <[email protected]>
Signed-off-by: Alexander Larsson <[email protected]>
1d2e384
to
fbdf4e4
Compare
I rebased on master, implemented missing features in fuse, generally cleaned up the code and added a few more tests. I think with this in we're ready to freeze the on-disk format. Once this is in I will focus on adding even more tests. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A lot of code here, I gave this a semi-skim, only found one actual minor bug (that can be fixed in a followup even); the rest is just nits/questions.
res = malloc(prefix_len + name_len + 1); | ||
if (res == NULL) { | ||
errno = ENOMEM; | ||
return NULL; | ||
} | ||
memcpy(res, prefix, prefix_len); | ||
memcpy(res + prefix_len, name, name_len); | ||
res[prefix_len + name_len] = 0; | ||
|
||
return res; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is all just return asprintf("%s%.*s", prefix, name_len, name)
right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, but I'm not super found of asprintf like this, because I always have to look up the length modifier in the manpage to get it right.
And also, I always worry about the integer format conversion in varargs calls like this.
Like, name_len is size_t, which could be larger than int in some compilers, and the field width must be int. So, do we need to cast? etc.
return s2; | ||
} | ||
|
||
static inline char *str_join(const char *a, const char *b) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
optional: This is just code motion, so OK, but same question re asprintf
- can be a followup.
@@ -29,6 +29,28 @@ | |||
*/ | |||
#define LCFS_BUILD_INLINE_FILE_SIZE_LIMIT 64 | |||
|
|||
#define OVERLAY_XATTR_USER_PREFIX "user." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Aweseome, I think this helps a lot to have these. Then we can also followup with some links to the kernel sources.
@@ -1116,12 +1116,20 @@ static int add_overlayfs_xattrs(struct lcfs_node_s *node) | |||
"", 0); | |||
if (ret < 0) | |||
return ret; | |||
ret = lcfs_node_set_xattr(node, OVERLAY_XATTR_USERXATTR_WHITEOUT, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hum, interesting...but can you help me understand this? Do we need the userxattrs for unprivileged overlayfs mounts? This seems tricky because it's redundant data in the privileged case, no? Or maybe it isn't much data and doesn't matter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, a traditional whiteout is a chardev(0,0), and having that in the lowerdir will work as a whiteout for both regular and userxattr overlayfs mounts. Unfortunately they don't work in composefs, because the outer overlayfs eats them.
So, we have a new type of whiteout. If we set an escaped whiteout xattr like trusted.overlay.overlay.whiteout
on a file in the erofs, then in the overlayfs mount it will show up as trusted.overlay.whiteout
, and a (new) regular overlayfs mount will treat it as a whiteout. However, to a userxattr overlayfs mount it will not be special in any way, because it looks for user.overlay.whiteout
files. So, composefs needs to set both of these for the resulting file to always work as a whiteout.
In terms of redundancy. Its not ideal, but it is what we need, and I don't think we'll have tons of whiteouts in practice.
|
||
node = lcfs_load_node_from_image(image_data, image_data_size); | ||
if (node == NULL) { | ||
errsv = errno; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have a handy PROTECT_ERRNO
macro btw.
|
||
static void digest_to_string(const uint8_t *csum, char *buf) | ||
{ | ||
static const char hexchars[] = "0123456789abcdef"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far as I can tell, this would literally be the fourth copy of this in tree...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ooops :)
get_objects(node, ht); | ||
|
||
size_t n_objects = hash_get_n_entries(ht); | ||
char **objects = calloc(n_objects, sizeof(char *)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing NULL
check
const char *bin = argv[0]; | ||
int fd; | ||
cleanup_node struct lcfs_node_s *root = NULL; | ||
const char *image_path = NULL; | ||
const char *command; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So not really related to this exactly but...can I arm twist you into using declare-and-initialize style in this codebase? I find it much more readable than "big dump of variables at the top of the function". It plays more nicely with __attribute__(cleanup)
too because there's no footgun around forgetting to init to NULL
etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure
const erofs_inode *cino = cfs_get_erofs_inode(ino); | ||
uint32_t mode; | ||
uint64_t file_size; | ||
uint16_t xattr_icount; | ||
size_t xattr_size; | ||
size_t isize; | ||
uint64_t n_blocks; | ||
uint64_t last_oob_block; | ||
bool tailpacked; | ||
size_t tail_size; | ||
uint32_t raw_blkaddr; | ||
off_t oob_size; | ||
const uint8_t *tail_data; | ||
const uint8_t *oob_data; | ||
struct iovec iov[2]; | ||
int i; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(IMO, this is crying out for declare-and-initialize...)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not to completely belabor this point but...basically all other sane modern languages allow declaring (and initializing) a variable inside a function.
In the case of this particular function, it would probably be half the size if done this way.
I know you work on the Linux kernel which isn't yet using this style, and I know even some others like systemd haven't really adopted it but IMO it really just makes things much more readable. Particularly in this example!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, the kernel is moving toward using attribute(cleanup), and as a result will be moving towards this style too (at least partially)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was just reading the code and I have a mental checklist item for "invoking open without O_CLOEXEC" that triggered here. (See also e.g. containers/composefs#185 (comment) ) It has security-relevant properties for us, xref CVE-2024-21626 for example. This isn't the only missing variant of this in this codebase, just using this targeted PR to test the waters for more PRs. Signed-off-by: Colin Walters <[email protected]>
I was just reading the code and I have a mental checklist item for "invoking open without O_CLOEXEC" that triggered here. (See also e.g. containers/composefs#185 (comment) ) It has security-relevant properties for us, xref CVE-2024-21626 for example. This isn't the only missing variant of this in this codebase, just using this targeted PR to test the waters for more PRs. Signed-off-by: Colin Walters <[email protected]>
I was just reading the code and I have a mental checklist item for "invoking open without O_CLOEXEC" that triggered here. (See also e.g. containers/composefs#185 (comment) ) It has security-relevant properties for us, xref CVE-2024-21626 for example. This isn't the only missing variant of this in this codebase, just using this targeted PR to test the waters for more PRs. Signed-off-by: Colin Walters <[email protected]>
This will help with fixing
#182