sourced files can't be always resolved #63

Open
jansorg opened this issue Dec 31, 2024 · 4 comments
Comments

@jansorg
Collaborator

jansorg commented Dec 31, 2024

Follow up to #50
I think that I have a better understanding now of what's going on.

For this script:

cd ./some-dir
. ../lib.bash
cd ~
function_from_lib.bash

The trap handler invoked when function_from_lib.bash is executed has this data:

  • ${BASH_SOURCE[1]} == ../lib.bash
  • $PWD == $HOME

To manage breakpoints, we have to resolve ../lib.bash to its source file.

But at the moment we only attempt to guess its directory by looking in the directories ${_Dbg_init_cwd}, $_Dbg_cdir, and $(pwd).
But files can be sourced from any directory, depending on the working directory at the time source is called.

To reliably find the source file of ../lib.bash we need to know the working directory when source ../lib.bash is called, i.e. we need to know that it's /path/to/some-dir.

With debug output added to the TRAP handler, this is what's available when cd some-dir and . ../lib.bash is called:

TRAP: commmand='cd 'some-dir'', PWD=/Work/source/bashdb/tmp/sources, BASH_SOURCE=../lib/hook.sh sources/main.bash ../bashdb
TRAP: commmand='. ../lib.bash', PWD=/Work/source/bashdb/tmp/sources/some-dir, BASH_SOURCE=../lib/hook.sh sources/main.bash ../bashdb

Possible approach: in the trap handler, it may work to detect . and source commands in $_Dbg_bash_command and store the absolute path of ../lib.bash, resolved against the current $PWD. This would involve parsing the Bash command, and no solution will handle every edge case. But handling the basic form source ./file/path ignored-args... should be possible, I think.
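
A rough sketch of that idea, assuming only the simplest forms are handled: a literal, optionally quoted path that contains a slash. The names record_source_command and sourced_abs_path are made up for illustration and are not bashdb code, and the sketch reads $BASH_COMMAND directly rather than $_Dbg_bash_command. Anything that would need expansion ($var, ~, globs) is deliberately skipped.

```shell
#!/usr/bin/env bash
# Hypothetical sketch, not bashdb code (needs bash 4+ for associative arrays).
declare -A sourced_abs_path   # path as written in the command -> absolute path

record_source_command() {
    local cmd=$1 cwd=$2 rest path dir
    local unsafe='[^A-Za-z0-9_./ -]'   # anything outside this set may need expansion

    case $cmd in
        ". "*|"source "*) ;;           # only plain `.` and `source` commands
        *) return 1 ;;
    esac

    rest=${cmd#* }                     # drop the command name
    case $rest in
        \"*\"*) path=${rest#\"}; path=${path%%\"*} ;;   # "double quoted"
        \'*\'*) path=${rest#\'}; path=${path%%\'*} ;;   # 'single quoted'
        *)      path=${rest%%[[:space:]]*} ;;           # bare word, extra args ignored
    esac

    [[ -n $path && ! $path =~ $unsafe ]] || return 1
    [[ $path == */* ]] || return 1     # no slash: `source` would search $PATH first

    # Resolve against the working directory at the time of the source call.
    if [[ $path == /* ]]; then dir=${path%/*}; else dir=$cwd/$path; dir=${dir%/*}; fi
    dir=$(cd -- "${dir:-/}" 2>/dev/null && pwd -P) || return 1
    sourced_abs_path[$path]=${dir%/}/${path##*/}
}

set -o functrace
trap 'record_source_command "$BASH_COMMAND" "$PWD" || :' DEBUG
```

With this in place, running cd ./some-dir followed by . ../lib.bash should leave the absolute path in ${sourced_abs_path[../lib.bash]}, keyed by the same string that later shows up in BASH_SOURCE.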

I'm not an expert on trap handlers, and I may not be aware of other possible solutions.

@rocky Do you think that this may work? Do you perhaps know a better approach to solve this?

@rocky
Collaborator

rocky commented Jan 2, 2025

I don't understand.

The POSIX shell debuggers like bashdb call the debugger's callback hook (lib/hook.sh) before running the first line of a source'd file. At that point, the current working directory together with what is found in BASH_SOURCE should always be enough to resolve the file the next command comes from to an absolute path, just as the bash interpreter itself has to resolve the path name in order to read lines from it.

Given this, the code in lib/hook.sh tries to resolve path names and saves the association between the relative name and the absolute name.

You might be able to come up with some sneaky kind of program where the same relative path name is sourced more than once and refers to different files because the cwd changed in between:

cd /tmp
source foo.sh
cd /home/joe
source foo.sh  # different foo.sh from above
# here we've already associated foo.sh with /tmp/foo.sh and that might take precedence over /home/joe/foo.sh

For this, one would need more elaborate mechanisms. But until we run into this case, I am happy to postpone working on it.
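
For reference, the collision can be reproduced with a small script. This is only an illustration: the directory layout and the function names from_a and from_b are invented, and ./foo.sh is used instead of foo.sh so that source does not search $PATH.

```shell
#!/usr/bin/env bash
# Reproduction sketch; directory and function names are made up.
demo_root=$(mktemp -d)
mkdir -p "$demo_root/a" "$demo_root/b"
echo 'from_a() { echo "from_a: ${BASH_SOURCE[0]}"; }' > "$demo_root/a/foo.sh"
echo 'from_b() { echo "from_b: ${BASH_SOURCE[0]}"; }' > "$demo_root/b/foo.sh"

cd "$demo_root/a" && source ./foo.sh
cd "$demo_root/b" && source ./foo.sh   # a different file, same relative name

from_a   # reports ./foo.sh
from_b   # reports ./foo.sh as well
```

Both functions report the same BASH_SOURCE string, so a table keyed only on the relative name cannot tell the two files apart.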

@jansorg
Collaborator Author

jansorg commented Jan 2, 2025

@rocky Thanks! I'll try to clarify what I meant.

As far as I understand:
if BASH_SOURCE is a relative path, then it's the value passed to source. This path is NOT relative to the current working directory.

Here's a self-contained example:

#!/usr/bin/env bash

mkdir -p "libs"
echo "myFunc() { echo \"myFunc(): PWD='\$PWD' BASH_SOURCE='\${BASH_SOURCE[0]}'\"; }" > "libs/include.sh"

mkdir -p "some/other/dir"
cd "some/other/dir"

set -o functrace
trap 'echo [DEBUG] $BASH_COMMAND PWD=$PWD BASH_SOURCE=${BASH_SOURCE[0]}' DEBUG
. "../../../libs/include.sh"

cd ../../..
myFunc

If I run this with Bash 5.2, then it prints this:

[jansorg@Island tmp]$ bash bash_test.bash 
[DEBUG] . "../../../libs/include.sh" PWD=/tmp/some/other/dir BASH_SOURCE=bash_test.bash
[DEBUG] cd ../../.. PWD=/tmp/some/other/dir BASH_SOURCE=bash_test.bash
[DEBUG] myFunc PWD=/tmp BASH_SOURCE=bash_test.bash
[DEBUG] myFunc PWD=/tmp BASH_SOURCE=../../../libs/include.sh
[DEBUG] echo "myFunc(): PWD='$PWD' BASH_SOURCE='${BASH_SOURCE[0]}'" PWD=/tmp BASH_SOURCE=../../../libs/include.sh
myFunc(): PWD='/tmp' BASH_SOURCE='../../../libs/include.sh'

The relevant line is [DEBUG] myFunc PWD=/tmp BASH_SOURCE=../../../libs/include.sh.
BASH_SOURCE is the path passed to source. If it was relative to the current working directory, then it would have been ./libs/include.sh.

The only solution I can think of is to record and resolve sourced paths in this trap handler call:
[DEBUG] . "../../../libs/include.sh" PWD=/tmp/some/other/dir BASH_SOURCE=bash_test.bash

It would be much better if Bash itself fixed this by always passing absolute paths in $BASH_SOURCE.

Of course, it's possible that I'm missing something.

@rocky
Collaborator

rocky commented Jan 3, 2025

> The relevant line is [DEBUG] myFunc PWD=/tmp BASH_SOURCE=../../../libs/include.sh.
> BASH_SOURCE is the path passed to source. If it was relative to the current working directory, then it would have been ./libs/include.sh.

As you note, what is recorded in BASH_SOURCE is the $-expanded string (if the string needs $ expanding) that is found in the source command.

In this particular situation, sourcing "include.sh" does not trigger any callback events. In contrast to, say, Python, where there is a stop before defining a function, in bash there is no stop before lines that define functions.

If "include.sh" were instead:

myFunc() { echo "myFunc(): PWD='$PWD' BASH_SOURCE='${BASH_SOURCE[0]}'"; }
x=1

The debugger as written would pick up the association because there is a stop before x=1.
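
A minimal sketch of that behavior, assuming a DEBUG trap that resolves ${BASH_SOURCE[0]} against $PWD the first time it sees a new name. This is not the actual hook code in bashdb; remember_source and abs_path_of are hypothetical names, and the directory layout mirrors the self-contained example earlier in this thread.

```shell
#!/usr/bin/env bash
# Sketch, not bashdb's actual hook code. Names below are hypothetical.
declare -A abs_path_of      # path as it appears in BASH_SOURCE -> absolute path

remember_source() {
    local src=$1 cwd=$2 dir
    # Only the first stop for a given name records an association.
    [[ -n $src && -z ${abs_path_of[$src]+set} ]] || return 0
    [[ $src == /* ]] || src=$cwd/$src
    dir=$(cd -- "$(dirname -- "$src")" 2>/dev/null && pwd -P) || return 0
    [[ -f $dir/${src##*/} ]] && abs_path_of[$1]=$dir/${src##*/}
    return 0
}

set -o functrace
trap 'remember_source "${BASH_SOURCE[0]}" "$PWD"' DEBUG

demo_root=$(mktemp -d)
mkdir -p "$demo_root/libs" "$demo_root/some/other/dir"
printf '%s\n' 'myFunc() { echo "in myFunc"; }' 'x=1' > "$demo_root/libs/include.sh"

cd "$demo_root/some/other/dir"
. ../../../libs/include.sh   # the stop before x=1 happens while PWD is still some/other/dir
cd "$demo_root"
myFunc                       # already associated, so the changed PWD does no harm
trap - DEBUG

echo "${abs_path_of[../../../libs/include.sh]}"
```

Without the x=1 line, the first stop attributed to include.sh would be inside myFunc, after the cd, and resolving against $PWD at that point would look in the wrong directory.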

> The only solution I can think of is to record and resolve sourced paths in this trap handler call:
> [DEBUG] . "../../../libs/include.sh" PWD=/tmp/some/other/dir BASH_SOURCE=bash_test.bash

In POSIX shells, and bash in particular, parsing a command-line string is tricky because of the 2- or 3-phase substitution process. And these languages do not provide a "parse" function to parse into an AST as you find in Python and similar languages.

In short, parsing the command string cannot be done easily or reliably to determine if it is a "source" command and, if so, what the source file name is.
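
A small illustration of the difficulty, using invented variable names: the DEBUG trap sees the command text before expansion, so both the file name and the command name itself can be hidden behind variables.

```shell
#!/usr/bin/env bash
# Illustration only: record what a DEBUG trap sees for two source commands.
seen=()
demo_root=$(mktemp -d)
echo ': loaded' > "$demo_root/lib.bash"
lib=$demo_root/lib.bash

trap 'seen+=("$BASH_COMMAND")' DEBUG
. "$lib"          # the trap sees the unexpanded text, not the file name
cmd=source
$cmd "$lib"       # here even the command name is hidden behind a variable
trap - DEBUG

printf '%s\n' "${seen[@]}"
```

To recover the file name from these strings, the handler would have to redo the shell's own expansion steps, which is the part that cannot be done reliably.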

> It would be much better if Bash itself fixed this by always passing absolute paths in $BASH_SOURCE.

Yes, this is what I wrote in the third paragraph in #50 (comment)

Python made this same mistake in recording path names but has fixed it recently. I think Ruby made the same mistake too, but I think it also fixed this early on. The motivation there was the same as here.

@jansorg
Collaborator Author

jansorg commented Jan 3, 2025

Thank you for your feedback. You’re right, the correct solution is to fix this in Bash and you mentioned that before.
I was hoping to be able to fix the case above for the current versions and Bash 4.x, too.

I’ll post on the Bash mailing list.
