Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CNIConfigWriter writes .temp file that throws off containerd #42

Open
dims opened this issue Jul 20, 2024 · 14 comments
Open

CNIConfigWriter writes .temp file that throws off containerd #42

dims opened this issue Jul 20, 2024 · 14 comments

Comments

@dims
Copy link

dims commented Jul 20, 2024

We write a conflist.temp file here:
https://github.com/aojea/kindnet/blob/main/cmd/kindnetd/cni.go#L175-L181

which causes errors in containerd like so:

Jul 20 07:35:36 ip-172-31-92-72.ec2.internal containerd[4169]: time="2024-07-20T07:35:36.181108954Z" level=error msg="failed to reload cni configuration after receiving fs change event(WRITE         \"/etc/cni/net.d/10-kindnet.conflist.temp\")" error="cni config load failed: no network config found in /etc/cni/net.d: cni plugin not initialized: failed to load cni config"
Jul 20 07:35:36 ip-172-31-92-72.ec2.internal containerd[4169]: time="2024-07-20T07:35:36.181177675Z" level=error msg="failed to reload cni configuration after receiving fs change event(WRITE         \"/etc/cni/net.d/10-kindnet.conflist.temp\")" error="cni config load failed: no network config found in /etc/cni/net.d: cni plugin not initialized: failed to load cni config"
Jul 20 07:35:36 ip-172-31-92-72.ec2.internal containerd[4169]: time="2024-07-20T07:35:36.181227840Z" level=error msg="failed to reload cni configuration after receiving fs change event(WRITE         \"/etc/cni/net.d/10-kindnet.conflist.temp\")" error="cni config load failed: no network config found in /etc/cni/net.d: cni plugin not initialized: failed to load cni config"

because containerd/cri code does not filter them out:
https://github.com/containerd/containerd/blob/da3655678d4fd46b6dc6ca9721375da01b90b2f6/internal/cri/server/cni_conf_syncer.go#L100-L105

@aojea
Copy link
Owner

aojea commented Jul 20, 2024

The errors are harmless because the file will be replaced later to make an atomic write.
The problem of using a temporal file is that we hit this error

2024-07-20T13:32:26.755832582Z stderr F I0720 13:32:26.755775 1 main.go:252] Failed to reconcile routes, retrying after error: rename /tmp/kindnet.cni.temp14756740 /etc/cni/net.d/10-kindnet.conflist: invalid cross-device link

@dims
Copy link
Author

dims commented Jul 20, 2024

whew, ack. definitely good to get rid of the "error" if possible

@aojea
Copy link
Owner

aojea commented Jul 20, 2024

is this a new behavior in containerd @MikeZappa87 @mikebrown ?

the go-cni library already skips the not known extensions

kindnet/cmd/kindnetd/cni.go

Lines 175 to 180 in de6a072

// use an extension not recognized by CNI to write the contents initially
// https://github.com/containerd/go-cni/blob/891c2a41e18144b2d7921f971d6c9789a68046b2/opts.go#L170
// then we can rename to atomically make the file appear
f, err := os.Create(c.path + ".temp")
if err != nil {
return err

@MikeZappa87
Copy link

MikeZappa87 commented Jul 20, 2024

This is because of the file watcher, any changes to the specified directory would cause this. We probably should downgrade the error to warn or filter on specific file extensions (.conflist, .json, .conf) since we could have legitimate errors loading. We could embed this logic into go-cni or a function inside the cni conf syncer file. Its rather simple to just add a check prior the Load.

Let me know/open to ideas.

Quick fix?
https://github.com/containerd/containerd/compare/mzappa/filtercniconfigevents?expand=1

@dims
Copy link
Author

dims commented Jul 21, 2024

@MikeZappa87 the quick fix looks good to me!

@MikeZappa87
Copy link

@dims it probably makes sense to update the documentation in containerd to reflect that we only would watch these specific file types

@dims
Copy link
Author

dims commented Jul 23, 2024

/close

@BenTheElder
Copy link

Thanks @MikeZappa87 !

@MikeZappa87
Copy link

I just need to test this PR (containerd/containerd#10497)

@mikebrow
Copy link

mikebrow commented Jul 25, 2024

My read of the go-cni code is we filter for the aforementioned extensions when loading all config from a directory...

However there is no filtering by extension if loading by an explicit filename path. Thus creating a bit of a quandary, in so far as someone using https://github.com/containerd/go-cni/blob/1c1be5e9ea864c9bc1651909ae13e1e555b4098d/opts.go#L135 which calls https://github.com/containernetworking/cni/blob/main/libcni/conf.go#L280 won't care what the file name extensions are.

@MikeZappa87
Copy link

My read of the go-cni code is we filter for the aforementioned extensions when loading all config from a directory...

However there is no filtering by extension if loading by an explicit filename path. Thus creating a bit of a quandary.

In this case the directory watcher is in containerd and not go-cni though. When you say loading by an explicit filename path is that saying something other than .conflist, .conf or .json?

@mikebrow
Copy link

fortunately it does look like we only load from the directory in cri.. and thus can filter the cri sync update process to extensions..

Just pointing out there does not seem to be a hard n fast extensions rule, more a convention.

@mikebrow
Copy link

My read of the go-cni code is we filter for the aforementioned extensions when loading all config from a directory...
However there is no filtering by extension if loading by an explicit filename path. Thus creating a bit of a quandary.

In this case the directory watcher is in containerd and not go-cni though. When you say loading by an explicit filename path is that saying something other than .conflist, .conf or .json?

nod.. for the CRI plugin use cases we seem to be ok to filter to known extensions in the configured directory as we use the With dir approach vs With file path..

@MikeZappa87
Copy link

Does it make sense to have a function in go-cni that will return a slice of 'supported' extensions? That can be used in go-cni and then also from the cri side? This prevents the hardcoding of file extensions in two places. Possibly an over optimization at this point.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants