Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

config-manager: Only use tedge-write after normal copy fails due to permissions and improve tedge-write logging #3069

Merged
merged 3 commits into from
Aug 29, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 2 additions & 0 deletions Cargo.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

3 changes: 2 additions & 1 deletion crates/common/tedge_config/src/sudo.rs
Original file line number Diff line number Diff line change
Expand Up @@ -46,6 +46,7 @@ impl SudoCommandBuilder {
/// prepare a [`Command`](std::process::Command) using sudo. Otherwise,
/// prepares a Command that starts `program` directly.
pub fn command<S: AsRef<OsStr>>(&self, program: S) -> Command {
let program = program.as_ref();
if !self.enabled {
return Command::new(program);
}
Expand All @@ -57,7 +58,7 @@ impl SudoCommandBuilder {
c
}
Err(_) => {
warn!("`sudo.enable` set to `true`, but sudo not found in $PATH");
warn!("`sudo.enable` set to `true`, but sudo not found in $PATH, invoking '{}' directly", program.to_string_lossy());
Command::new(program)
}
}
Expand Down
28 changes: 19 additions & 9 deletions crates/core/tedge_write/src/api.rs
Original file line number Diff line number Diff line change
Expand Up @@ -33,24 +33,34 @@ impl<'a> CopyOptions<'a> {
///
/// Stdin and Stdout are UTF-8.
pub fn copy(self) -> anyhow::Result<()> {
let output = self
.command()?
.output()
.context("Starting tedge-write process failed")?;
let mut command = self.command()?;

let output = command.output();

let program = command.get_program().to_string_lossy();
let output = output.with_context(|| format!("failed to start process '{program}'"))?;

if !output.status.success() {
return Err(anyhow!(
String::from_utf8(output.stderr).expect("output should be utf-8")
));
let stderr = String::from_utf8_lossy(&output.stderr);
let stderr = stderr.trim();
let err = match output.status.code() {
Some(exit_code) => anyhow!(
"process '{program}' returned non-zero exit code ({exit_code}); stderr=\"{stderr}\""
),
None => anyhow!("process '{program}' was terminated; stderr=\"{stderr}\""),
};

return Err(err);
}

Ok(())
}

fn command(&self) -> std::io::Result<Command> {
fn command(&self) -> anyhow::Result<Command> {
let mut command = self.sudo.command(crate::TEDGE_WRITE_PATH);

let from_reader = std::fs::File::open(self.from)?;
let from_reader = std::fs::File::open(self.from)
.with_context(|| format!("could not open file for reading '{}'", self.from))?;
command.stdin(from_reader).arg(self.to);

if let Some(mode) = self.mode {
Expand Down
2 changes: 2 additions & 0 deletions crates/extensions/tedge_config_manager/Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -26,8 +26,10 @@ tedge_file_system_ext = { workspace = true }
tedge_mqtt_ext = { workspace = true }
tedge_uploader_ext = { workspace = true }
tedge_utils = { workspace = true }
tempfile = { workspace = true }
thiserror = { workspace = true }
toml = { workspace = true }
uzers = { workspace = true }

[dev-dependencies]
tedge_actors = { workspace = true, features = ["test-helpers"] }
Expand Down
190 changes: 179 additions & 11 deletions crates/extensions/tedge_config_manager/src/actor.rs
Original file line number Diff line number Diff line change
@@ -1,3 +1,4 @@
use anyhow::Context;
use async_trait::async_trait;
use camino::Utf8Path;
use camino::Utf8PathBuf;
Expand All @@ -6,6 +7,10 @@ use log::error;
use log::info;
use serde_json::json;
use std::collections::HashMap;
use std::io::ErrorKind;
use std::os::unix::fs::fchown;
use std::os::unix::fs::MetadataExt;
use std::os::unix::fs::PermissionsExt;
use tedge_actors::fan_in_message_type;
use tedge_actors::Actor;
use tedge_actors::ChannelError;
Expand All @@ -32,6 +37,7 @@ use tedge_uploader_ext::UploadRequest;
use tedge_uploader_ext::UploadResult;
use tedge_write::CopyOptions;

use crate::FileEntry;
use crate::TedgeWriteStatus;

use super::config::PluginConfig;
Expand Down Expand Up @@ -379,7 +385,9 @@ impl ConfigManagerActor {
let response = match result {
Ok(response) => response,
Err(err) => {
let error_message = format!("config-manager failed downloading a file: {err}",);
let err =
anyhow::Error::from(err).context("config-manager failed downloading a file");
let error_message = format!("{err:#}");
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

{err:#} on anyhow::Error prints all the lower level causes in error 1: error 2: error 3 format.

request.failed(&error_message);
error!("{}", error_message);
self.publish_command_status(ConfigOperation::Update(topic, request))
Expand All @@ -401,6 +409,10 @@ impl ConfigManagerActor {
error!("{}", error_message);
self.publish_command_status(ConfigOperation::Update(topic, request))
.await?;

// TODO: source temporary file should be cleaned up automatically
let _ = std::fs::remove_file(from);
Comment on lines +413 to +414
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has definitely to be done.

But not necessarily here. This file has been created and populated outside this method. It should be automatically cleaned by the caller. Not easy to implement with the current design, though. Can this be done once merged this refactoring


return Ok(());
}
};
Expand All @@ -413,36 +425,55 @@ impl ConfigManagerActor {
self.publish_command_status(ConfigOperation::Update(topic, request))
.await?;

// TODO: source temporary file should be cleaned up automatically
let _ = std::fs::remove_file(from);

Ok(())
}

/// Deploys the new version of the configuration file and returns the path under which it was
/// deployed.
///
/// This function ensures that the configuration file under `dest` is overwritten by a new
/// version currently stored in a temporary directory under `src`. Depending on if
/// `use_tedge_write` is used, either a new `tedge-write` process is spawned, or a file is
/// copied directly.
/// Ensures that the configuration file under `dest` is overwritten atomically by a new version
/// currently stored in a temporary directory.
///
/// If the configuration file doesn't already exist, a new file with target permissions is
/// created. If the configuration file already exists, its content is overwritten, but owner and
/// mode remains unchanged.
///
/// If `use_tedge_write` is enabled, a `tedge-write` process is spawned when privilege elevation
/// is required.
fn deploy_config_file(
&self,
from: &Utf8Path,
config_type: &str,
) -> Result<Utf8PathBuf, ConfigManagementError> {
let file_entry = self.plugin_config.get_file_entry_from_type(config_type)?;

let mode = file_entry.file_permissions.mode;
let user = file_entry.file_permissions.user.as_deref();
let group = file_entry.file_permissions.group.as_deref();

let to = Utf8PathBuf::from(&file_entry.path);

let Err(err) = move_file_atomic_set_permissions_if_doesnt_exist(from, file_entry)
.with_context(|| format!("failed to deploy config file from '{from}' to '{to}'"))
else {
return Ok(to);
};

if let Some(io_error) = err.downcast_ref::<std::io::Error>() {
if io_error.kind() != ErrorKind::PermissionDenied {
return Err(err.into());
}
}

match self.config.use_tedge_write.clone() {
TedgeWriteStatus::Disabled => {
let src_file = std::fs::File::open(from)?;
tedge_utils::fs::atomically_write_file_sync(&to, src_file)?;
return Err(err.into());
}

TedgeWriteStatus::Enabled { sudo } => {
let mode = file_entry.file_permissions.mode;
let user = file_entry.file_permissions.user.as_deref();
let group = file_entry.file_permissions.group.as_deref();

let options = CopyOptions {
from,
to: to.as_path(),
Expand All @@ -451,6 +482,7 @@ impl ConfigManagerActor {
user,
group,
};

options.copy()?;
}
}
Expand Down Expand Up @@ -516,6 +548,142 @@ impl ConfigManagerActor {
}
}

/// Writes a file atomically and optionally sets its permissions.
///
/// Setting permissions (owner and group) of a file is a privileged operation so it needs to be run
/// as root. If the any of the filesystem operations fail due to not having permissions, the
/// function will return an error.
didier-wenzek marked this conversation as resolved.
Show resolved Hide resolved
///
/// If the file already exists, its content will be overwritten but its permissions will remain
/// unchanged.
///
/// For deployment of configuration files, we need to create a file atomically because certain
/// programs might watch configuration file for changes, so if it's not written atomically, then
/// file might be only partially written and a program trying to read it may crash.
///
/// Atomic write of a file consists of creating a temporary file in the same directory, filling it
/// with correct content and permissions, and only then renaming the temporary into the destination
/// filename. Because we're never actually writing into the file, we don't need to write permissions
/// for the destination file, even if it exists. Instead we need only write/execute permissions to
/// the directory file is located in unless the directory has a sticky bit set. Overwriting a file
/// will also change its uid/gid/mode, if writing process euid/egid is different from file's
/// uid/gid. To keep uid/gid the same, after the write we need to do `chown`, and to do it we need
/// sudo.
///
/// # Errors
///
/// - `src` doesn't exist
/// - user or group doesn't exist
/// - we have no write/execute permissions to the directory
fn move_file_atomic_set_permissions_if_doesnt_exist(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
fn move_file_atomic_set_permissions_if_doesnt_exist(
fn move_file_preserving_target_permissions(

Can we also move this API into tedge_utils so that it can be reused? After removing the dependency on FileEntry, of course. It could take the expected permissions instead.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I have this done for a follow-up PR

src: &Utf8Path,
file_entry: &FileEntry,
) -> anyhow::Result<()> {
let dest = Utf8Path::new(file_entry.path.as_str());

let target_permissions = config_file_target_permissions(file_entry, dest)
.context("failed to compute target permissions of the file")?;

let mut src_file = std::fs::File::open(src)
.with_context(|| format!("failed to open temporary source file '{src}'"))?;

// TODO: create tests to ensure writes we expect are atomic
let mut tempfile = tempfile::Builder::new()
.permissions(std::fs::Permissions::from_mode(0o600))
.tempfile_in(dest.parent().context("invalid path")?)
.with_context(|| format!("could not create temporary file at '{dest}'"))?;

std::io::copy(&mut src_file, &mut tempfile).context("failed to copy")?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The function name implies move but we're doing a copy here. Can we attempt a file move at least when the source and target files are under the same parent directory?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, move is probably a little misleading for now, but this function will be used by both config-manager directly and by tedge-write, which reads from stdin, so the function will be renamed to write_... in a follow-up PR


tempfile
.as_file()
.set_permissions(std::fs::Permissions::from_mode(target_permissions.mode))
.context("failed to set mode on the destination file")?;

fchown(
tempfile.as_file(),
Some(target_permissions.uid),
Some(target_permissions.gid),
)
.context("failed to change ownership of the destination file")?;
Comment on lines +598 to +608
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could have been done with tedge_utils::file::PermissionEntry::apply() as well, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAIK PermissionEntry::apply() uses a chown, but since we have a file descriptor open and want all these operations finished before the final rename, I decided to use fchown, which should complete before the file descriptor is closed.

There is also one problem with PermissionEntry::apply() when used in this context:

Since we're doing an atomic write as either tedge or root, the temporary file will be owned by tedge or root but the target configuration file will likely be owned by some other user, so most likely we will always have to change file owner. We want the atomic write to look like a normal write so it has to preserve original ownership no matter what it is.

But it is technically possible for a file to have uid/gid of a user that does not exist on the system. Perhaps the file was created by a program coming from a package that set up its own user, and the package was since uninstalled and the user was removed. Still in this case we still want to preserve original permissions, so we have to use original uid/gid directly. But PermissionEntry::apply() uses a name, so if we wanted to use it here we'd need to look up the name of the owner of the file, which can fail.

So not using PermissionEntry::apply() avoids an extra failure mode.


tempfile.as_file().sync_all()?;

tempfile
.persist(dest)
.context("failed to persist temporary file at destination")?;

Ok(())
}

/// Computes target permissions for deployment of the config file.
///
/// - if file exists preserve current permissions
/// - if it doesn't exist apply permissions from `permissions` if they are defined
/// - set to root:root with default umask otherwise
///
/// # Errors
/// - if desired user/group doesn't exist on the system
/// - no permission to read destination file
fn config_file_target_permissions(
file_entry: &FileEntry,
dest: &Utf8Path,
) -> anyhow::Result<Permissions> {
let current_file_permissions = match std::fs::metadata(dest) {
Err(err) => match err.kind() {
ErrorKind::PermissionDenied => return Err(err).context("no permissions"),
ErrorKind::NotFound => None,
_ => return Err(err).context("unexpected IO error"),
},
Ok(p) => Some(p),
};

let entry_uid = if let Some(ref u) = file_entry.file_permissions.user {
let uid = uzers::get_user_by_name(u)
.with_context(|| format!("no such user: '{u}'"))?
.uid();
Some(uid)
} else {
None
};

let entry_gid = if let Some(ref g) = file_entry.file_permissions.group {
let gid = uzers::get_group_by_name(g)
.with_context(|| format!("no such group: '{g}'"))?
.gid();
Some(gid)
} else {
None
};
let entry_mode = file_entry.file_permissions.mode;

let uid = current_file_permissions
.as_ref()
.map(|p| p.uid())
.or(entry_uid)
.unwrap_or(0);

let gid = current_file_permissions
.as_ref()
.map(|p| p.gid())
.or(entry_gid)
.unwrap_or(0);

let mode = current_file_permissions
.as_ref()
.map(|p| p.mode())
.or(entry_mode)
.unwrap_or(0o644);

Ok(Permissions { uid, gid, mode })
}

struct Permissions {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not reuse tedge_utils::file::PermissionEntry?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#3069 (comment)

We want to use uid/gid directly instead of names, since doing so avoids an extra failure mode.

uid: u32,
gid: u32,
mode: u32,
}

#[derive(Debug, PartialEq, Eq, Clone)]
pub enum ConfigOperation {
Snapshot(Topic, ConfigSnapshotCmdPayload),
Expand Down
2 changes: 1 addition & 1 deletion crates/extensions/tedge_config_manager/src/error.rs
Original file line number Diff line number Diff line change
Expand Up @@ -35,7 +35,7 @@ pub enum ConfigManagementError {
#[error(transparent)]
FromAtomFileError(#[from] tedge_utils::fs::AtomFileError),

#[error(transparent)]
#[error("{0:#}")]
Other(#[from] anyhow::Error),
}

Expand Down
Loading