-
Notifications
You must be signed in to change notification settings - Fork 68
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
filtering variable sets #52
Comments
This would certainly slow down execution, though I can clearly see the benefit. Because there's no real scoping, we'd have to run all user-defined functions and evaluate all variables in a subshell and deal with cleaning/preserving some environment variables. Besides the penalty from the subshell, we'd also have the filtering loop-within-a-loop causing more slowdowns. That's really a huge concern of mine, especially since I know there are places that generate hundreds or thousands of output files from templates and data, so doubling the processing time for those users would be bad. This sounds like a change with a high impact. I'm worried that you are taking data, functions, and templates from untrusted sources and running them. So, I suppose the task list would look something like this:
I don't know why I would support name mapping. That should be done when feeding data into |
Such is the approach I originally considered, but a more lightweight alternative is simply filtering at the stage in which the template processor resolves the template value, by inserting a query to an intermediary mapping table.
Why? Would use of associative arrays help avoid the scenario you envision requiring a nested loop?
The process could be optional, with the original performance being preserved for the case of not invoking the new behavior. The specific scenarios you describe, however, are unfamiliar to my experience. You seem to describe a case in which a Bash script delivers acceptable performance under conditions of high demand, but even so the performance would degrade unacceptably due to use of subshells.
A template operation applies a specific template to a specific data set to produce output. It should have no effects other than producing the output. Trust is not the important issue in my view because the effects of a template operation are limited to the contents of the output. The issue for me is robustness and predictability. Suppose a data set is given as { a=foo, b=bar, c=baz }. The output should not include the user name of the user running the template process, even if the template refers to a variable called USER. The template processor should consider the variable unset, because it is not a member of the data set. It should consider similarly unset any other variables that might be included for the other reasons in the environment (e.g. PYTHONPATH, XAUTHORITY, etc.). Trust of sources is not a concern that I understand as important to the discussion. Keeping the output predictable based on the template and data set is the central issue I am raising. |
It would be helpful to see at least a simple implementation of the essential functionality, possibly to be augmented later. A rough outline of how it might work is a follows. A new parameter would be added to the main processor. The form of the parameter value would be a list of words, each including a variable name followed optionally by a colon and a mapped name. For example, suppose the following two shell variables are defined:
The processor may be passed a value for the parameter as The modified view of the variable set may be achieved simply by adding an additional intermediary step when resolving the variable values during template processing, requiring minimal overall modification to the code. |
I wonder whether you have had a chance to consider further. Would you be open to reviewing a PR, assuming I could offer one, at least for a first design iteration? |
I apologize if I don't fully understand the issue here, but if the concern is that it might pollute the environment of the current shell, why not spin up an isolated shell for execution instead? env -i "$BASH" -c './mo ...args' This wouldn't require changing |
Currently, I believe the goal is to eliminate any variables that don't match. For security, we would also want to mask away any values from internal functions. The user should be able to exclude sensitive information like When implementing this, I want to make sure performance doesn't drastically suffer. There's a minimum of subshells created and content is returned indirectly for speed. When I mentioned loops within a loop, I'm specifically talking about how A last hurdle is keeping a copy of all variables when the program starts, then referencing that list when resolving the values. Unfortunately, this would also impact all user-defined functions because they will have to find a way to get access to these values as well. I am uncertain how this last goal could be achieved. Sorry for not responding over a year ago. I thought I had. I would like to see an approach detailed before a pull request, if possible, and still make it able to run on Bash 3.2 for Apple support. |
Forgive me, but I think that's the whole premise behind This is a bit of me being the pot calling the kettle black (given my latest, monstrous PR 😅 ), but I feel like we may be over-complicating things a bit. See
The easiest way to deal with this is using At most #!/usr/bin/env bash
# If the environment isn't clean, make it so!
[ ! -z "$HOME"] && exec env -i "$BASH" -c "$0 $@" Or something similar. Reinventing the wheel by implementing an entire subshell functionality seems like a heavy endeavor, rather than just starting off the parsing inside a sanitized environment to begin with using If neither of you get to it first, I can see what I can do to demonstrate the viability of using |
How would you propose removing a template's ability to access Running |
Fair point 😂 I might play around with something. Avoiding loops would be ideal; to your point, it would certainly degrade performance. But, ultimately, it would be good to only use vars that are defined after execution. |
@fidian @brainchild0 I've implemented a cursory solution that appears to work pretty well. Mind the test, of course. It's a little hairy. The limitation of this solution is that it only works if you use Thoughts? |
One of the current limitations is that templates are always evaluated against the full variable set of shell variables. Because these variables are set from a variety of sources, unpredictable and even unsafe results may emerge from invoking templates on them. A secondary concern is the difference between variable names used in some specific template versus those chosen for use the surrounding shell logic.
It would be advantageous to support an invocation case in which the template variables are filtered representation of the shell variables. Filtering may take in general two useful forms. The first is selection of which variables to include in template evaluation, with other variables excluded. The second is a name mapping, that is, support for an optional alternative name for any variable selected for inclusion during template processing.
An interface supporting such functionality might be an added parameter allowing specification of a white space-separated list of variables for inclusion, each one optionally followed by a colon and then an alternative name.
The text was updated successfully, but these errors were encountered: