Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

libcontainer: use OwnedFd as console_socket in ContainerBuilder #2966

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

abel-von
Copy link

We found we will get a residual fd in the process to call libcontainer, after we called exec of a container and exit.
and after some investigation, I found it is because the opened console_socket is not closed after exec build.

Copy link
Contributor

@jprendes jprendes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on this @abel-von .
Two of the tests in tty.rs are failing due to these changes.
Would you mind fixing them?

@@ -233,7 +221,7 @@ mod tests {
let lis = UnixListener::bind(&socket_path);
assert!(lis.is_ok());
let fd = setup_console_socket(testdir.path(), &socket_path, CONSOLE_SOCKET);
let status = setup_console(&fd.unwrap());
let status = setup_console(fd.unwrap().as_raw_fd());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test is failing as setup_console is closing the RawFd it receives.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done, changed to into_raw_fd

csocketfd.as_raw_fd(),
&socket::UnixAddr::new(linked.as_path()).map_err(|err| TTYError::InvalidSocketName {
source: err,
socket_name: socket_name.to_string(),
})?,
) {
Err(Errno::ENOENT) => -1,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The logic of this branch (and checked by the test_setup_console_socket_empty test) is lost here.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I don't quite get why is this error handled independently, Could you mind explain why do we have to return -1 when ENOENT?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This avoids failing in the case where the file is not present. I'm not sure if this is part of the spec requirements, or just copying what runc does, or just happens to be an implementation detail of youki.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, How about returning error when the file is not present? I don't think it will succeed if the init process send the tty fd through a socket of fd -1.
It is difficult to modify it if this function return OwnedFd, I think we can't new an OwnedFd from a rawfd of -1, because it may panic if the drop of OwnedFd is called, and we will call close of an fd of -1.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we get the background of why does it return fd of -1 here? It is a little wierd because I don't find any similar logic in runc.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey so I went around in the git blame and tried to find out why this is specifically -1 , but from what I see, the -1 comes directly from the first commit (!) , and is kept as-is for no specific reason. Checking the code there, I think the intention was maybe to conform to the standard error policy (returning -1 on error) , but not sure. What I can see is that in that commit, this will just push error to a later time, when calling sendmsg with socket_fd of -1. Apart from that atleast I cannot find any reason in that or today's code why we specifically set it to -1 rather than return an error, because the code will error when it tries to use this raw fd for sendmsg. I think the correct way to proceed would be another analysis by someone else to check what I'm saying is right or not, and if it is, then replacing the -1 by the general error case.

@utam0k , correct me if I'm wrong here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe that it came from railcar. -1 doesn't have a special reason. I'm afraid I've confused you... I'll note the person who first committed (that's me) 🫠
https://github.com/oracle/railcar/blob/ef5918e21e7ad9ffd25c1a507df96458ec4e0c24/src/main.rs#L539-L549

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, Can we say that it is ok to return error rather than -1?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@abel-von , yes I think we can just merge that into the error case.

@YJDoc2
Copy link
Collaborator

YJDoc2 commented Oct 30, 2024

Hey @abel-von , a separate question - how does this relate to your other PR (#2961) ? Or are these two independent?

@abel-von
Copy link
Author

Hi @YJDoc2, I think they are independent, we found these issues because we are trying to use libcontainer instead of calling runc binary to manage containers. In our senarios, we have one process in a machine to manage all the containers running on it, so when we call libcontainer to manage containers, we have to make sure no resource residual in this process when a container is removed.

@abel-von
Copy link
Author

abel-von commented Nov 1, 2024

Two of the tests in tty.rs are failing due to these changes.

Fixed the test if we agree that setup_console_socket return error if the file does not exist. PTAL @jprendes

Comment on lines +79 to +86
struct CurrentDirGuard {
path: PathBuf,
}
impl Drop for CurrentDirGuard {
fn drop(&mut self) {
let _ = env::set_current_dir(&self.path);
}
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just comment: I haven't seen this pattern before, seems a good way to manage this kind of thing locally!

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there is a crate called scopeguard, which contains a defer! macro, can handle this more elegantly. I just don't want to introduce the dependency.

@YJDoc2 YJDoc2 added kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. breaking change labels Nov 4, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
breaking change kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants