Improve error message on bad working directory #2317

artm · 2024-08-24T11:40:14Z

Handles two cases:

working directory doesn't exist.
working directory cannot be changed into due to on of the following:
- The process lacks permissions to view the contents.
- The path points at a non-directory file.

Addresses: #2295

casey

Nice, see comments!

src/error.rs

src/recipe.rs

tests/working_directory.rs

casey · 2024-08-24T20:47:54Z

And good idea using the read_dir function to test if it's a real directory.

casey · 2024-08-25T22:09:08Z

Actually, I just had a thought: As written, this will mask the original I/O error. First we'll get error 1, then we'll check if it's related to the working directory and possibly get error 2, and if we got error 2, display it, but not display error 1. I think we should probably never suppress error 1, since the two errors may be distinct, for example if the shell binary is not found and the working directory doesn't exist.

Some ideas:

Include a working_directory_error: bool flag on Error::Io which is set if we have a problem with the working directory, and if true, add something to the error message, like "this might be an error with the working directory",
Include a working_directory_io_error: Option<io::Error> on Error::Io, which is set if we get an error reading the working directory, and if set, we add error message to the output, something like also got error reading working directory: …

I'm not sure what's best. I think we definitely don't want to suppress the original error message, since that's important, but I'm not sure what level of additional verbosity is correct.

artm · 2024-08-26T04:35:57Z

Is just in general striving to display all errors it can detect ~~on start up~~?

Because my thinking was: even if we mask e.g. missing shell error, the missing working directory error would be tackled first, then the user would get missing shell error. Two of theese at the same time is a rare condition, is displaying both errors at the same time worth the additional code complexity?

If the general rule however is "display as many errors as can be detected", then I think another possible approach is to have a generic "composite error" with e.g. vector of simple errors inside, instead of trying to capture a particular combination of conditions like "io errors regarding shell and working directory".

This discussion also reminds me of a situation I have encountered with other software, where the IO error on command execution would be erroneously blamed on shell, while it was actually related to some file acceessed by the command. We probably shouldn't assume the error is either with shell or working directory, it could be with some other file.

artm · 2024-08-26T05:07:49Z

I think there are in general three possible sources of io error and we only receive one of them

io error accessing working dir
io error accessing shell
io error executing the shell command

The first two can happen at the same time, the third can only be encountered in the absence of the other two.

Whenever we encounter the io error, we want to be as helpful as possible and investigate if 1 or 2 (or both) are the case and display as exact a message or two as possible. Since 1 and 2 may happen at the same time they can be captured by a singel Error variant as you suggested. 3 should probably be a separate variant since it will never happen simultaneously with 1-adn/or-2.

Perhaps call them CommandStartupIo and CommandIo?

artm · 2024-08-26T17:04:24Z

I was confused, IO error in a spawned command won't be returned as rust std::io::Error, I was misremembering something :)

I'll think more about it and implement something along the lines of your suggestions, @casey

artm · 2024-08-31T06:51:55Z

@casey:

I think we should probably never suppress error 1, since the two errors may be distinct, for example if the shell binary is not found and the working directory doesn't exist.

I think the latter combination could never happen because the implementation of Command does't check if command executable can be executed, it first asks the OS to change current directory, then asks it to execute the command. If current directory couldn't be changed, the attempting to execute would never happen, and the io error that we get couldn't be because of the bad shell binary.

There are however more system calls around changing directory and executing the command which could potentially return an IO error that we're getting.

This makes me think: we can be sure if working directory is bad, because we check it explicitly, but we can't be sure shell executable is bad. But the latter is a reasonable probable cause of the original io error and it would be helpful to suggest the user to check the shell.

So I'm considering your suggestion:

@casey:

Include a working_directory_io_error: Option<io::Error> on Error::Io, which is set if we get an error reading the working directory, and if set, we add error message to the output, something like also got error reading working directory: …

The error message should contain:

explanation of the context (Recipe couldn't be run:)
original io error's message
a suggestion that it could be related to bad/missing shell (perhaps as a quiestion: Is a valid shell executable?)
if working_directory_io_error is Some:
- "Probable cause: working directory can't be changed to < working_directory >: <working_directory_io_error's message>"

casey · 2024-08-31T19:29:08Z

I think the latter combination could never happen because the implementation of Command does't check if command executable can be executed, it first asks the OS to change current directory, then asks it to execute the command. If current directory couldn't be changed, the attempting to execute would never happen, and the io error that we get couldn't be because of the bad shell binary.

Oh interesting! Can you link to the implementation? Is it the same on all platforms?

The error message should contain:

explanation of the context (Recipe couldn't be run:)

original io error's message

a suggestion that it could be related to bad/missing shell (perhaps as a quiestion: Is a valid shell executable?)

if working_directory_io_error is Some:

"Probable cause: working directory can't be changed to < working_directory >: <working_directory_io_error's message>"

This seems reasonable to me!

artm · 2024-09-01T05:13:53Z

Oh interesting! Can you link to the implementation? Is it the same on all platforms?

it isn't the same, I have only followed the unix implementation and assumed the overall shape would be similar on other platforms, but see below.

It works like that: the process::Command is a façade to the platform specific implementation. The process::Command.status() calls implementation specific spawn(), which on unix forks the process then calls do_exec() in the new process, which eventually calls chdir and later calls execvp.

Out of curiosity I had a look at the windows implementation of spawn() now. As far as I can make out it passes the desired working directory to a system call CreateProcessW, so I can't be sure my earlier analysis holds on windows :)

artm · 2024-09-01T05:25:29Z

And here is the windows documentation for CreateProcessW and it is unclear to me when it deals with the working directory / executable path, but the details are fascinating :) It sounds more complicated than I've assumed.

So we simply should't try to presume that we understand the root cause of the io::Error returned from Command.status(), but give the user enough context details / hints.

artm · 2024-09-01T06:57:03Z

I have made another attempt at clear error messages like outlined above.
I'm not quite satisfied yet as now the os error message is repeated whenever the root cause is a bad/missing working directory, e.g.:

error: Failed to run recipe `default`:
  Failed to run shell `bash`:
    No such file or directory (os error 2)
Failed to set working directory to `/tmp/user/1000/just/just-test-tempdirvDaBh5/missing`:
  No such file or directory (os error 2)

error: Failed to run recipe `default`:
  Failed to run shell `bash`:
    Permission denied (os error 13)
Failed to set working directory to `/tmp/user/1000/just/just-test-tempdirPg6zS8/unusable`:
  Permission denied (os error 13)

error: Failed to run recipe `default`:
  Failed to run shell `bash`:
    Not a directory (os error 20)
Failed to set working directory to `/tmp/user/1000/just/just-test-tempdirXhmAY6/unusable`:
  Not a directory (os error 20)

that's because Command.status() and read_dir now encounter the same io error.

artm · 2024-09-01T07:05:07Z

We could do the following: when there is working_directory_io_error, check if its message differs from the original io_error and only print it in that case. I know that on unix it will never be printed, but this way we avoid assuming it is always the case.

I'll also work on better wording / formatting.

artm · 2024-09-01T07:41:08Z

I now have the error messages look like:

error: Failed to run recipe `default`:
  Failed to run shell `bash`:
    No such file or directory (os error 2)
  Failed to set working directory to `/tmp/user/1000/just/just-test-tempdirwpKZH6/missing`

error: Failed to run recipe `default`:
  Failed to run shell `bash`:
    Permission denied (os error 13)
  Failed to set working directory to `/tmp/user/1000/just/just-test-tempdiriR6HHU/unusable`

error: Failed to run recipe `default`:
  Failed to run shell `bash`:
    Not a directory (os error 20)
  Failed to set working directory to `/tmp/user/1000/just/just-test-tempdirQPyrIm/unusable`

the second io error message is hidden because it is the same as the first. If they were different it would look like:

error: Failed to run recipe `default`:
  Failed to run shell `bash`:
    <original io error message>
  Failed to set working directory to `/tmp/user/1000/just/just-test-tempdirQPyrIm/unusable`:
    Not a directory (os error 20)

I think now the user would have enogh context to analyse various causes of the error: the OS error messages, the working directory and the shell that was attempted to be called.

I would like to add another group of tests where both shell and the working directory are unusable and temporarily add a debug output to see in CI what the error messages look like on macos and windows in different cases.

artm · 2024-09-01T10:28:54Z

I'm satisfied with the error messages on linux now, but am curious what the OS specific ones look like on windows. Could I temporarily break the tests so we could see the actual error messages in CI, @casey ?

casey · 2024-09-02T19:38:12Z

@artm Yah, of course, feel free to push broken tests to see what they look like. You might need to ping me so I can start them, since they don't run automatically for new contributors.

I just had a thought, I think there are three cases:

No error when accessing the working directory. In this case, we don't need to change the error message, since the error isn't related to the working directory.
Error when accessing the working directory, and the error is different from the original error the user received. In this case, there is a problem with the working directory, but there's some other error which the user is getting. I think in this case too, we don't need to change the error message, since there's another error that the user has to fix first, and then once they fix that, they'll get the working directory error message which they'll need to fix.
Error when accessing the working directory, and the error is the same as the original error. I think only in this case should we change the error message, and since the two errors are the same, we don't need to print the error message twice, just note that there's likely an error with the working directory, and print the path to the working directory.

What do you think?

artm · 2024-09-03T05:32:29Z

Whatever the case we want to help the user analyse the problem, so we want to include as much context as is relevant in the error message: recipe, shell, working directory. Your cases sound reasonable, what is the relevant context in each case:

working directory is usable, but perhaps the shell is not, or some other cause of the io error. Every bit of the context could be relevant: for example the working directory could be used to look for shell in some cases. The error message in my current implementation would be:

error: Failed to run recipe `default`:
  Failed to run shell `bash`:
    <original io error message>

it misses the working directory in the context. We know the w.d. is usable, but it could still be relevant to understanding the error in a rare case where the shell path is relative, in which case it is relative to the w.d. Also if I understand it correctly, on windows currect directory is alwyas used to resolve the non-absolute executable path. Should the working directory be added to the error message?

working directory access error is different from the original error. This is the theoretical case where an IO error happens somewhere before chdir / CreateProcessW. Currently my code would display both errors with complete context. I will make it that only the original error is displayed, with the context.

Currently the error message would be:

error: Failed to run recipe `default`:
  Failed to run shell `bash`:
    <original io error message>
  Failed to set working directory to `/tmp/user/1000/just/just-test-tempdirQPyrIm/unusable`:
    <w.d. access io error message>

should become

error: Failed to run recipe `default`:
  Failed to run shell `bash`:
    <original io error message>

plus w.d. in the context if we agree it could be relevant.

workng directory access error is the same as the original error, the original error is most probably caused by the bad working directory. The current code displays the original error with context, plus additinal message about unusable working directory.

current error message would be:

error: Failed to run recipe `default`:
  Failed to run shell `bash`:
    <original io error message, same as w.d. access io error message>
  Failed to set working directory to `/tmp/user/1000/just/just-test-tempdiriR6HHU/unusable`

should we keep it as is? or should w.d. path be moved to the context (second line) and the last line be formulated differently?

casey · 2024-09-06T22:19:48Z

… Should the working directory be added to the error message?

I think probably not, just to keep the error message short readable, and since the working directory is likely not the issue.

workng directory access error is the same as the original error, the original error is most probably caused by the bad working directory. The current code displays the original error with context, plus additinal message about unusable working directory.

current error message would be:
error: Failed to run recipe `default`:
  Failed to run shell `bash`:
    <original io error message, same as w.d. access io error message>
  Failed to set working directory to `/tmp/user/1000/just/just-test-tempdiriR6HHU/unusable`
should we keep it as is? or should w.d. path be moved to the context (second line) and the last line be formulated differently?

I think what you have here is probably good for both 2 and 3. We always display the original error message, but in both cases add Failed to access working directory PATH.

artm · 2024-09-11T04:42:40Z

@casey If I'm not missing anything, the current implementation is already doing what we want, could you review and let the CI run?

Handles two cases: - working directory doesn't exist. - working directory cannot be changed into due to on of the following: - The process lacks permissions to view the contents. - The path points at a non-directory file. Addresses: casey#2295

- include the failed recipe, shell and working directory - include the original io error message from Command.status() - include the io error message from read_dir() check if it differs from the original On Unix the two error messages don't differ in the cases that we test, but we don't want to assume that is always the case. Additionally test error messages when both workdir and shell are unusable.

casey

This error message feels pretty bit noisy to me. It includes the recipe, the shell, and I/O error, and may also include an additional I/O error. Overall, I think it's a lot of information to include in an error message, so I'm rethinking this approach.

How about a single clear line which includes the original I/O error, followed by lines which suggest looking at the shell or the working directory?

If there is no error when reading the working directory, we would print this:

error: Recipe `{recipe}` could not run because of an I/O error when launching the shell: {io_error}
  This may be due to an issue with the shell: `{shell}`

And if there is an error when reading the working directory, we would print this

error: Recipe `{recipe}` could not be run because of an I/O error when launching the shell: {io_error}
  This may be due to an issue with the shell: `{shell}`
  Or with the working directory: `{working_directory}`

I think including multiple errors is probably too much. Sorry for waffling!

casey requested changes Aug 24, 2024

View reviewed changes

src/error.rs Outdated Show resolved Hide resolved

src/error.rs Outdated Show resolved Hide resolved

src/error.rs Outdated Show resolved Hide resolved

src/recipe.rs Outdated Show resolved Hide resolved

tests/working_directory.rs Outdated Show resolved Hide resolved

artm requested a review from casey August 25, 2024 04:13

artm force-pushed the missing-workdir-error-message branch 2 times, most recently from 1d9a50d to bb95c38 Compare September 1, 2024 10:25

artm added 6 commits September 11, 2024 04:43

Allow calling chmod in tests

c3d0c8b

Implement the PR suggestions

98b6e4b

Fix a lint: unnecessary borrow

fd14aec

Only provide Test.chmod on unix, where it works

064a505

artm force-pushed the missing-workdir-error-message branch from 4e4a6aa to 064a505 Compare September 11, 2024 04:44

casey requested changes Sep 14, 2024

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Improve error message on bad working directory #2317

Improve error message on bad working directory #2317

artm commented Aug 24, 2024

casey left a comment

casey commented Aug 24, 2024

casey commented Aug 25, 2024

artm commented Aug 26, 2024 •

edited

Loading

artm commented Aug 26, 2024

artm commented Aug 26, 2024

artm commented Aug 31, 2024 •

edited

Loading

casey commented Aug 31, 2024

artm commented Sep 1, 2024 •

edited

Loading

artm commented Sep 1, 2024

artm commented Sep 1, 2024

artm commented Sep 1, 2024

artm commented Sep 1, 2024

artm commented Sep 1, 2024

casey commented Sep 2, 2024

artm commented Sep 3, 2024 •

edited

Loading

casey commented Sep 6, 2024 •

edited

Loading

artm commented Sep 11, 2024

casey left a comment

Improve error message on bad working directory #2317

Are you sure you want to change the base?

Improve error message on bad working directory #2317

Conversation

artm commented Aug 24, 2024

casey left a comment

Choose a reason for hiding this comment

casey commented Aug 24, 2024

casey commented Aug 25, 2024

artm commented Aug 26, 2024 • edited Loading

artm commented Aug 26, 2024

artm commented Aug 26, 2024

artm commented Aug 31, 2024 • edited Loading

casey commented Aug 31, 2024

artm commented Sep 1, 2024 • edited Loading

artm commented Sep 1, 2024

artm commented Sep 1, 2024

artm commented Sep 1, 2024

artm commented Sep 1, 2024

artm commented Sep 1, 2024

casey commented Sep 2, 2024

artm commented Sep 3, 2024 • edited Loading

casey commented Sep 6, 2024 • edited Loading

artm commented Sep 11, 2024

casey left a comment

Choose a reason for hiding this comment

artm commented Aug 26, 2024 •

edited

Loading

artm commented Aug 31, 2024 •

edited

Loading

artm commented Sep 1, 2024 •

edited

Loading

artm commented Sep 3, 2024 •

edited

Loading

casey commented Sep 6, 2024 •

edited

Loading