Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve error message on bad working directory #2317

Open
wants to merge 6 commits into
base: master
Choose a base branch
from

Conversation

artm
Copy link

@artm artm commented Aug 24, 2024

Handles two cases:

  • working directory doesn't exist.
  • working directory cannot be changed into due to on of the following:
    • The process lacks permissions to view the contents.
    • The path points at a non-directory file.

Addresses: #2295

Copy link
Owner

@casey casey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, see comments!

src/error.rs Outdated Show resolved Hide resolved
src/error.rs Outdated Show resolved Hide resolved
src/error.rs Outdated Show resolved Hide resolved
src/recipe.rs Outdated Show resolved Hide resolved
tests/working_directory.rs Outdated Show resolved Hide resolved
@casey
Copy link
Owner

casey commented Aug 24, 2024

And good idea using the read_dir function to test if it's a real directory.

@artm artm requested a review from casey August 25, 2024 04:13
@casey
Copy link
Owner

casey commented Aug 25, 2024

Actually, I just had a thought: As written, this will mask the original I/O error. First we'll get error 1, then we'll check if it's related to the working directory and possibly get error 2, and if we got error 2, display it, but not display error 1. I think we should probably never suppress error 1, since the two errors may be distinct, for example if the shell binary is not found and the working directory doesn't exist.

Some ideas:

  1. Include a working_directory_error: bool flag on Error::Io which is set if we have a problem with the working directory, and if true, add something to the error message, like "this might be an error with the working directory",
  2. Include a working_directory_io_error: Option<io::Error> on Error::Io, which is set if we get an error reading the working directory, and if set, we add error message to the output, something like also got error reading working directory: …

I'm not sure what's best. I think we definitely don't want to suppress the original error message, since that's important, but I'm not sure what level of additional verbosity is correct.

@artm
Copy link
Author

artm commented Aug 26, 2024

Is just in general striving to display all errors it can detect on start up?

Because my thinking was: even if we mask e.g. missing shell error, the missing working directory error would be tackled first, then the user would get missing shell error. Two of theese at the same time is a rare condition, is displaying both errors at the same time worth the additional code complexity?

If the general rule however is "display as many errors as can be detected", then I think another possible approach is to have a generic "composite error" with e.g. vector of simple errors inside, instead of trying to capture a particular combination of conditions like "io errors regarding shell and working directory".

This discussion also reminds me of a situation I have encountered with other software, where the IO error on command execution would be erroneously blamed on shell, while it was actually related to some file acceessed by the command. We probably shouldn't assume the error is either with shell or working directory, it could be with some other file.

@artm
Copy link
Author

artm commented Aug 26, 2024

I think there are in general three possible sources of io error and we only receive one of them

  1. io error accessing working dir
  2. io error accessing shell
  3. io error executing the shell command

The first two can happen at the same time, the third can only be encountered in the absence of the other two.

Whenever we encounter the io error, we want to be as helpful as possible and investigate if 1 or 2 (or both) are the case and display as exact a message or two as possible. Since 1 and 2 may happen at the same time they can be captured by a singel Error variant as you suggested. 3 should probably be a separate variant since it will never happen simultaneously with 1-adn/or-2.

Perhaps call them CommandStartupIo and CommandIo?

@artm
Copy link
Author

artm commented Aug 26, 2024

I was confused, IO error in a spawned command won't be returned as rust std::io::Error, I was misremembering something :)

I'll think more about it and implement something along the lines of your suggestions, @casey

@artm
Copy link
Author

artm commented Aug 31, 2024

@casey:

I think we should probably never suppress error 1, since the two errors may be distinct, for example if the shell binary is not found and the working directory doesn't exist.

I think the latter combination could never happen because the implementation of Command does't check if command executable can be executed, it first asks the OS to change current directory, then asks it to execute the command. If current directory couldn't be changed, the attempting to execute would never happen, and the io error that we get couldn't be because of the bad shell binary.

There are however more system calls around changing directory and executing the command which could potentially return an IO error that we're getting.

This makes me think: we can be sure if working directory is bad, because we check it explicitly, but we can't be sure shell executable is bad. But the latter is a reasonable probable cause of the original io error and it would be helpful to suggest the user to check the shell.

So I'm considering your suggestion:

@casey:

  1. Include a working_directory_io_error: Option<io::Error> on Error::Io, which is set if we get an error reading the working directory, and if set, we add error message to the output, something like also got error reading working directory: …

The error message should contain:

  • explanation of the context (Recipe couldn't be run:)
  • original io error's message
  • a suggestion that it could be related to bad/missing shell (perhaps as a quiestion: Is a valid shell executable?)
  • if working_directory_io_error is Some:
    • "Probable cause: working directory can't be changed to < working_directory >: <working_directory_io_error's message>"

@casey
Copy link
Owner

casey commented Aug 31, 2024

I think the latter combination could never happen because the implementation of Command does't check if command executable can be executed, it first asks the OS to change current directory, then asks it to execute the command. If current directory couldn't be changed, the attempting to execute would never happen, and the io error that we get couldn't be because of the bad shell binary.

Oh interesting! Can you link to the implementation? Is it the same on all platforms?

The error message should contain:

  • explanation of the context (Recipe couldn't be run:)

  • original io error's message

  • a suggestion that it could be related to bad/missing shell (perhaps as a quiestion: Is a valid shell executable?)

  • if working_directory_io_error is Some:

    • "Probable cause: working directory can't be changed to < working_directory >: <working_directory_io_error's message>"

This seems reasonable to me!

@artm
Copy link
Author

artm commented Sep 1, 2024

Oh interesting! Can you link to the implementation? Is it the same on all platforms?

it isn't the same, I have only followed the unix implementation and assumed the overall shape would be similar on other platforms, but see below.

It works like that: the process::Command is a façade to the platform specific implementation. The process::Command.status() calls implementation specific spawn(), which on unix forks the process then calls do_exec() in the new process, which eventually calls chdir and later calls execvp.

Out of curiosity I had a look at the windows implementation of spawn() now. As far as I can make out it passes the desired working directory to a system call CreateProcessW, so I can't be sure my earlier analysis holds on windows :)

@artm
Copy link
Author

artm commented Sep 1, 2024

And here is the windows documentation for CreateProcessW and it is unclear to me when it deals with the working directory / executable path, but the details are fascinating :) It sounds more complicated than I've assumed.

So we simply should't try to presume that we understand the root cause of the io::Error returned from Command.status(), but give the user enough context details / hints.

@artm
Copy link
Author

artm commented Sep 1, 2024

I have made another attempt at clear error messages like outlined above.
I'm not quite satisfied yet as now the os error message is repeated whenever the root cause is a bad/missing working directory, e.g.:

error: Failed to run recipe `default`:
  Failed to run shell `bash`:
    No such file or directory (os error 2)
Failed to set working directory to `/tmp/user/1000/just/just-test-tempdirvDaBh5/missing`:
  No such file or directory (os error 2)

error: Failed to run recipe `default`:
  Failed to run shell `bash`:
    Permission denied (os error 13)
Failed to set working directory to `/tmp/user/1000/just/just-test-tempdirPg6zS8/unusable`:
  Permission denied (os error 13)

error: Failed to run recipe `default`:
  Failed to run shell `bash`:
    Not a directory (os error 20)
Failed to set working directory to `/tmp/user/1000/just/just-test-tempdirXhmAY6/unusable`:
  Not a directory (os error 20)

that's because Command.status() and read_dir now encounter the same io error.

@artm
Copy link
Author

artm commented Sep 1, 2024

We could do the following: when there is working_directory_io_error, check if its message differs from the original io_error and only print it in that case. I know that on unix it will never be printed, but this way we avoid assuming it is always the case.

I'll also work on better wording / formatting.

@artm
Copy link
Author

artm commented Sep 1, 2024

I now have the error messages look like:

error: Failed to run recipe `default`:
  Failed to run shell `bash`:
    No such file or directory (os error 2)
  Failed to set working directory to `/tmp/user/1000/just/just-test-tempdirwpKZH6/missing`

error: Failed to run recipe `default`:
  Failed to run shell `bash`:
    Permission denied (os error 13)
  Failed to set working directory to `/tmp/user/1000/just/just-test-tempdiriR6HHU/unusable`

error: Failed to run recipe `default`:
  Failed to run shell `bash`:
    Not a directory (os error 20)
  Failed to set working directory to `/tmp/user/1000/just/just-test-tempdirQPyrIm/unusable`

the second io error message is hidden because it is the same as the first. If they were different it would look like:

error: Failed to run recipe `default`:
  Failed to run shell `bash`:
    <original io error message>
  Failed to set working directory to `/tmp/user/1000/just/just-test-tempdirQPyrIm/unusable`:
    Not a directory (os error 20)

I think now the user would have enogh context to analyse various causes of the error: the OS error messages, the working directory and the shell that was attempted to be called.

I would like to add another group of tests where both shell and the working directory are unusable and temporarily add a debug output to see in CI what the error messages look like on macos and windows in different cases.

@artm artm force-pushed the missing-workdir-error-message branch 2 times, most recently from 1d9a50d to bb95c38 Compare September 1, 2024 10:25
@artm
Copy link
Author

artm commented Sep 1, 2024

I'm satisfied with the error messages on linux now, but am curious what the OS specific ones look like on windows. Could I temporarily break the tests so we could see the actual error messages in CI, @casey ?

@casey
Copy link
Owner

casey commented Sep 2, 2024

@artm Yah, of course, feel free to push broken tests to see what they look like. You might need to ping me so I can start them, since they don't run automatically for new contributors.

I just had a thought, I think there are three cases:

  1. No error when accessing the working directory. In this case, we don't need to change the error message, since the error isn't related to the working directory.
  2. Error when accessing the working directory, and the error is different from the original error the user received. In this case, there is a problem with the working directory, but there's some other error which the user is getting. I think in this case too, we don't need to change the error message, since there's another error that the user has to fix first, and then once they fix that, they'll get the working directory error message which they'll need to fix.
  3. Error when accessing the working directory, and the error is the same as the original error. I think only in this case should we change the error message, and since the two errors are the same, we don't need to print the error message twice, just note that there's likely an error with the working directory, and print the path to the working directory.

What do you think?

@artm
Copy link
Author

artm commented Sep 3, 2024

Whatever the case we want to help the user analyse the problem, so we want to include as much context as is relevant in the error message: recipe, shell, working directory. Your cases sound reasonable, what is the relevant context in each case:

  1. working directory is usable, but perhaps the shell is not, or some other cause of the io error. Every bit of the context could be relevant: for example the working directory could be used to look for shell in some cases. The error message in my current implementation would be:
error: Failed to run recipe `default`:
  Failed to run shell `bash`:
    <original io error message>

it misses the working directory in the context. We know the w.d. is usable, but it could still be relevant to understanding the error in a rare case where the shell path is relative, in which case it is relative to the w.d. Also if I understand it correctly, on windows currect directory is alwyas used to resolve the non-absolute executable path. Should the working directory be added to the error message?

  1. working directory access error is different from the original error. This is the theoretical case where an IO error happens somewhere before chdir / CreateProcessW. Currently my code would display both errors with complete context. I will make it that only the original error is displayed, with the context.

Currently the error message would be:

error: Failed to run recipe `default`:
  Failed to run shell `bash`:
    <original io error message>
  Failed to set working directory to `/tmp/user/1000/just/just-test-tempdirQPyrIm/unusable`:
    <w.d. access io error message>

should become

error: Failed to run recipe `default`:
  Failed to run shell `bash`:
    <original io error message>

plus w.d. in the context if we agree it could be relevant.

  1. workng directory access error is the same as the original error, the original error is most probably caused by the bad working directory. The current code displays the original error with context, plus additinal message about unusable working directory.

current error message would be:

error: Failed to run recipe `default`:
  Failed to run shell `bash`:
    <original io error message, same as w.d. access io error message>
  Failed to set working directory to `/tmp/user/1000/just/just-test-tempdiriR6HHU/unusable`

should we keep it as is? or should w.d. path be moved to the context (second line) and the last line be formulated differently?

@casey
Copy link
Owner

casey commented Sep 6, 2024

  1. … Should the working directory be added to the error message?

I think probably not, just to keep the error message short readable, and since the working directory is likely not the issue.

  1. workng directory access error is the same as the original error, the original error is most probably caused by the bad working directory. The current code displays the original error with context, plus additinal message about unusable working directory.

current error message would be:

error: Failed to run recipe `default`:
  Failed to run shell `bash`:
    <original io error message, same as w.d. access io error message>
  Failed to set working directory to `/tmp/user/1000/just/just-test-tempdiriR6HHU/unusable`

should we keep it as is? or should w.d. path be moved to the context (second line) and the last line be formulated differently?

I think what you have here is probably good for both 2 and 3. We always display the original error message, but in both cases add Failed to access working directory PATH.

@artm
Copy link
Author

artm commented Sep 11, 2024

@casey If I'm not missing anything, the current implementation is already doing what we want, could you review and let the CI run?

Handles two cases:

- working directory doesn't exist.
- working directory cannot be changed into due to on of the following:
    - The process lacks permissions to view the contents.
    - The path points at a non-directory file.

Addresses: casey#2295
- include the failed recipe, shell and working directory
- include the original io error message from Command.status()
- include the io error message from read_dir() check if it differs from
  the original

On Unix the two error messages don't differ in the cases that we test,
but we don't want to assume that is always the case.

Additionally test error messages when both workdir and shell are
unusable.
@artm artm force-pushed the missing-workdir-error-message branch from 4e4a6aa to 064a505 Compare September 11, 2024 04:44
Copy link
Owner

@casey casey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This error message feels pretty bit noisy to me. It includes the recipe, the shell, and I/O error, and may also include an additional I/O error. Overall, I think it's a lot of information to include in an error message, so I'm rethinking this approach.

How about a single clear line which includes the original I/O error, followed by lines which suggest looking at the shell or the working directory?

If there is no error when reading the working directory, we would print this:

error: Recipe `{recipe}` could not run because of an I/O error when launching the shell: {io_error}
  This may be due to an issue with the shell: `{shell}`

And if there is an error when reading the working directory, we would print this

error: Recipe `{recipe}` could not be run because of an I/O error when launching the shell: {io_error}
  This may be due to an issue with the shell: `{shell}`
  Or with the working directory: `{working_directory}`

I think including multiple errors is probably too much. Sorry for waffling!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants