Revamping wrapper scripts in CURI #455

Open
patrickbkr opened this issue Dec 3, 2024 · 10 comments
Labels
rakudo Big changes to Rakudo

Comments

@patrickbkr
Member

patrickbkr commented Dec 3, 2024

(I know I'm breaking the rules by presenting a solution in the first post. But I'm mostly seeking feedback on my proposal; I don't want to kick off a brainstorming round.)

The bat wrappers we generate on Windows don't cut it and never will. To give one example: It's currently impossible to run zef install 'ABC:ver<0.6.13>:auth<zef:colomon>' on Windows. The < and > characters don't survive a trip through CMD-land.

Thus I believe I need to pick up rakudo/rakudo#3716 (comment) again. Here is an updated copy of that post.

The following is my (potentially incorrect) understanding of how things work. Please correct me where I'm wrong.

A CURI (CompUnit::Repository::Installation) is a store represented as a folder for installed modules. It contains everything a module needs, including the actual something.raku script. All contents (including that script) are stored in a way opaque to the user. Running such an installed script requires calling the following Raku method:

CompUnit::RepositoryRegistry.run-script('some_script');

Wrappers that basically only execute the above call are installed in the curi_store/bin subfolder. For those wrapper scripts to work, their directory must be in PATH or they need to be called with their absolute path. (Sidenote: On Windows there are no shebang lines, so additional .bat files are generated there.)

There are three standard CURIs every Rakudo installation brings with it: core, site and vendor installed into $prefix/share/perl6/. In principle the site and vendor CURIs could be shared between multiple Rakudo installations, but that doesn't make much sense, as every installation brings its own set with it.
There can be additional CURIs in arbitrary places. Which CURIs are used is configurable.

Relocatable Rakudo installations have a well defined directory layout. Things (rakudo executables, CURIs, libs, ...) are always in the same place relative to each other.

There are different situations we need to consider:

  • Standard CURIs, part of a non-relocatable installation: Typical case when Rakudo is installed by the distro. The CURIs belong to a specific Rakudo installation. -> Directly use the full path of the rakudo executable in the shebang line.
  • Standard CURIs, part of a relocatable installation: Different approaches are possible:
    • Use env in the shebang line and hope that the PATH points to the Rakudo this CURI belongs to. (That's not guaranteed.)
    • Use absolute paths and provide a tool to update the shebang lines. One then needs to call that script whenever the installation is moved around. (That's what Strawberry Perl does.)
    • Let the wrapper be a shell script or native executable and dynamically determine the rakudo path based on the location of the wrapper. Something like $wrapper_path/../../../bin/rakudo.
  • Custom CURIs: Such CURIs don't belong to a specific rakudo installation, so it's difficult to classify them as relocatable or non-relocatable. The only options I see are:
    • Rely on env and just use the Rakudo that happens to be in the PATH.
    • Provide a tool to set / change the shebang lines. Then it's up to the user to configure the wrappers to his liking.
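
The "determine the rakudo path based on the location of the wrapper" option can be sketched in POSIX sh as follows. This is only an illustration: the function name rakudo_for_wrapper is made up, and it assumes the layout $prefix/share/perl6/<repo>/bin/<wrapper> with the executable at $prefix/bin/rakudo, so the number of ".." steps would differ for other layouts.

```sh
#!/bin/sh
# Hypothetical helper: print the rakudo path that belongs to the wrapper at "$1".
# Assumed layout (not confirmed by Rakudo itself):
#   $prefix/share/perl6/<repo>/bin/<wrapper>   and   $prefix/bin/rakudo
rakudo_for_wrapper() {
    # Directory that contains the wrapper, with symlinks in the directory part resolved.
    wrapper_dir=$(CDPATH= cd -- "$(dirname -- "$1")" && pwd -P) || return 1
    # Under the assumed layout, four levels above <repo>/bin is $prefix.
    prefix=$(CDPATH= cd -- "$wrapper_dir/../../../.." && pwd -P) || return 1
    printf '%s\n' "$prefix/bin/rakudo"
}

# A wrapper could then end with something like:
#   exec "$(rakudo_for_wrapper "$0")" "$(dirname -- "$0")/my_script.raku" "$@"
```

Note that this resolves symlinked directories but not a symlink to the wrapper file itself; a real implementation would have to decide how to treat that case.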

In every case we want to keep our ability to call a script by passing it to some rakudo: /some/rakudo /path/to/curi/share/perl6/site/bin/my_script.raku.

I tend towards the following:

  • We always have two files per script in bin/. A *.raku file that is not executable and an executable sh/exe wrapper script/program.
  • When building a non-relocatable Rakudo we default to using absolute paths in the standard CURI wrappers.
  • When building a relocatable Rakudo, we default to using relative paths in the standard CURI wrappers.
  • Each CURI somehow remembers (in some configuration file) how its wrapper scripts should look. The CURI implementation in Rakudo will respect this configuration and generate wrappers accordingly. The goal here is to never mix different kinds of wrappers in a specific CURI.
  • Non-standard CURIs default to using the Rakudo found in PATH.
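
To make the three wrapper flavours concrete, here is a rough sketch of a generator for the *nix side. Everything in it is an assumption for illustration: the function name make_wrapper, the mode names absolute / relative / path, the four-level relative offset, and the idea that the wrapper simply passes the sibling *.raku file to rakudo. It also assumes the absolute rakudo path contains no single quotes.

```sh
#!/bin/sh
# Hypothetical generator: print a POSIX sh wrapper for script "$2" in mode "$1".
# "$3" is the absolute rakudo path and is only used in "absolute" mode.
make_wrapper() {
    mode=$1 script=$2 rakudo=${3-}
    case $mode in
        absolute) interp="'$rakudo'" ;;                          # fixed at build/install time
        relative) interp='"$dir/../../../../bin/rakudo"' ;;      # resolved next to the CURI
        path)     interp='rakudo' ;;                             # whatever is first in PATH
        *) echo "unknown wrapper mode: $mode" >&2; return 1 ;;
    esac
    # The escaped \$ sequences end up as literal $ in the generated wrapper.
    cat <<EOF
#!/bin/sh
dir=\$(CDPATH= cd -- "\$(dirname -- "\$0")" && pwd -P) || exit 1
exec $interp "\$dir/$script.raku" "\$@"
EOF
}
```

For example, make_wrapper path zef prints a three-line wrapper whose last line hands zef.raku and all arguments to the rakudo found in PATH. A per-CURI configuration file would then only need to store the mode (and, for absolute, the path).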

Before I start implementing things, I'd like to get the outline right.
There are two key points that I feel I want to have a second opinion on (because they are unconventional).

  1. Native executables

I want to generate native executables for each script in a CURI on Windows. I believe I can do that by integrating https://sr.ht/~patrickb/Devel-ExecRunnerGenerator/ into the Rakudo core. It's basically a native executable that can be tuned by attaching a config to the end of the binary file to do what we want it to do. My idea is to compile that executable as a part of the Rakudo build process. The executable is 16.5KiB.
Potential problems I see: Dynamically creating executables on Windows might be an anti-virus / Microsoft security nightmare. I do not know if this will be a problem or if it can be worked around.
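
The general idea of "attaching a config to the end of the binary" can be illustrated with plain shell tools. The trailer format below (config bytes followed by a 10-character, space-padded length field) is invented for this sketch and is not necessarily what Devel-ExecRunnerGenerator does; the function names are hypothetical as well.

```sh
#!/bin/sh
# Hypothetical trailer format, for illustration only:
#   <runner bytes><config bytes><config length, right-aligned in 10 characters>

# attach_config RUNNER CONFIG OUTPUT
attach_config() {
    len=$(($(wc -c < "$2")))                      # config size in bytes
    { cat "$1" "$2" && printf '%10d' "$len"; } > "$3"
}

# read_config FILE  -> prints the attached config on stdout
read_config() {
    len=$(tail -c 10 "$1" | tr -d ' ')            # read the length field back
    # Take config + length field from the end, then keep only the config bytes.
    tail -c "$((len + 10))" "$1" | dd bs=1 count="$len" 2>/dev/null
}
```

A real runner would do the equivalent of read_config on its own executable at startup and then launch rakudo with the arguments found there.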

  2. Two files per script

I want to always put two files in the bin/ folder. A *.raku script, not marked executable, and a sh script (*nix) / .exe (Windows). That's the only way I managed to come up with that gives us the flexibility we need to support all of our use cases.
Potential problems I see: People hate having twice as many files in bin/ or having non-executable files in bin/ or having non-raku files in bin/.

@patrickbkr patrickbkr added the rakudo Big changes to Rakudo label Dec 3, 2024
@patrickbkr patrickbkr changed the title Wrapper scripts in CURI Revamping wrapper scripts in CURI Dec 3, 2024
@ugexe
Contributor

ugexe commented Dec 4, 2024

The bat wrappers we generate on Windows don't cut it and never will. To give one example: It's currently impossible to run zef install 'ABC:ver<0.6.13>:auth<zef:colomon>' on Windows. The < and > characters don't survive a trip through CMD-land.

I agree that is unfortunate and fixing it would be beneficial enough to offset the drawbacks listed.

I want to always put two files in the bin/ folder. A *.raku script, not marked executable, and a sh script (*nix) / .exe (Windows).

I'm not clear what purpose the sh script serves. I assume it solves some of the issues outlined earlier but I'm not clear which ones.

Somewhat related is that it is currently possible for someone to have both bin/foo and bin/foo.raku in their distribution. Would we need to disallow that?

@patrickbkr
Member Author

patrickbkr commented Dec 4, 2024

I want to always put two files in the bin/ folder. A *.raku script, not marked executable, and a sh script (*nix) / .exe (Windows).

I'm not clear what purpose the sh script serves. I assume it solves some of the issues outlined earlier but I'm not clear which ones.

On Windows they are mandatory, because Windows does not understand shebang lines. On everything else shebang lines are an option, but they are very limited in what they can do. AFAIK only absolute paths work reliably. Searching for the interpreter in $PATH is possible with #!/usr/bin/env raku (given there is a /usr/bin/env in place), so it is only relative paths that really need a wrapper program.
I think we should have relative paths in core/site/vendor in relocatable rakudo builds and use the rakudo found in $PATH in non core/site/vendor CURIs.

Somewhat related is that it is currently possible for someone to have both bin/foo and bin/foo.raku in their distribution. Would we need to disallow that?

Ouch. This sucks. Do you know whether there are any use cases other than rolling your own wrappers? If not, I'd vote to disallow it.

@ugexe
Contributor

ugexe commented Dec 5, 2024

Ouch. This sucks. Do you know whether there are any use cases other than rolling your own wrappers? If not, I'd vote to disallow it.

Potentially to provide not-commonly used public scripts without squatting on certain names more appropriate for more common usage code. Or for making it explicit what the tool is for. For example macOS comes with scandeps.pl which is only for scanning deps for perl stuff (although admittedly there is no conflicting scandeps, just mentioning it might make sense to some authors to include the extension)

@patrickbkr
Member Author

Potentially to provide not-commonly used public scripts without squatting on certain names more appropriate for more common usage code. Or for making it explicit what the tool is for. For example macOS comes with scandeps.pl which is only for scanning deps for perl stuff (although admittedly there is no conflicting scandeps, just mentioning it might make sense to some authors to include the extension)

That's an interesting point. I think there are conflicting interests here, which we'll have to trade off.

  • We want to be able to pass the scripts to a raku and have them run.
  • We want absolute, relative and $PATH-based raku search logic.
  • We want to give authors full control over what files end up in bin/.

In addition I think we don't want to overengineer.

All ideas I can come up with that could potentially satisfy all requirements seem fragile or too magical to me. They are:

  • Make Rakudo detect autogenerated runner scripts, determine which original script it belongs to and run it. This would allow us to exclusively put wrapper programs into bin/ named identically to the original scripts.
  • Try to create a piece of code that is both valid Raku and POSIX shell. Then use that to have the shell wrapper and Raku calling code all in the same file. (On Windows this is impossible.)

More thoughts:

  • To some extent we have already given up on the idea to give authors full control over what files end up in bin/ since we persist the scripts content in our hashed-filename store and put an autogenerated Raku caller program in bin/.
  • On Windows one is already limited in how to name files by Windows itself, because it relies on the filename extension to determine how to run a file.

My current tendency is to give clear limitations and guidelines on how script files are to be named. I.e. in the bin/ folder in a distribution only files with a .raku extension are allowed and they have to contain valid Raku code. Then people know what to expect and we can provide a pretty seamless cross platform experience. To me this feels OK to do, because Raku is a high-level, cross-platform language that massively abstracts over the details of the underlying operating system to provide a good developer experience. Limiting script file names in distributions seems like a change in line with this.

If there are any other ideas how we could solve this / how a good tradeoff could look, please tell!

@ugexe
Contributor

ugexe commented Dec 6, 2024

in the bin/ folder in a distribution only files with a .raku extension are allowed

People will still want to be able to run their code without it being installed, ideally with the same program name as if it were installed. For example to install zef you do raku -I. bin/zef install . not raku -I. bin/zef.raku install .

@patrickbkr
Member Author

People will still want to be able to run their code without it being installed, ideally with the same program name as if it were installed. For example to install zef you do raku -I. bin/zef install . not raku -I. bin/zef.raku install .

Do you have an idea how to achieve that?

I personally don't find this specific inconvenience too bad, because:

  • It's not some/path/to/curi/bin/zef.raku install . but raku -I. some/path/to/curi/bin/zef.raku install .. So it's not possible to call it directly anyway. Having the .raku extension there is actually pretty consistent; it is a Raku script file after all.
  • Tab autocomplete will pick the right file for you in this case.

One idea that comes to mind: Forbid files ending in .raku in the module's bin/ folder. Then in the CURI generate a .raku file that contains the Raku code to call the script, and an executable file named identically to the original script (with .exe added on Windows). But I find this less consistent, because the file to pass to raku -I. is script in the module and script.raku in the CURI.


One more thought: There is this duality that on the one hand we want to provide the convenience of generating sensible wrappers for the module authors' scripts and on the other hand keep the notion that the files in bin/ end up in the installed bin/ folder unmodified. I myself have often put a shebang line and use lib $*PROGRAM.parent.parent.add('lib'); at the top of my scripts. If we follow this thought to its conclusion, we'd need two bin folders: bin_managed/ and bin_plain/. Files in bin_managed/ get all the wrapper script treatment, while files put in bin_plain/ are copied over to the target bin/ folder unmodified (or we name that folder bin/ and don't copy those files at all). But:

  • This is definitely overengineered.
  • It's complex: people will keep on mixing up bin_managed/ and bin_plain/ or don't realize there even is such a distinction.
  • It will probably be rarely used.
  • The typical script with a shebang line and use lib ... is not a good enough solution, because it won't work on Windows.

@ugexe
Contributor

ugexe commented Dec 6, 2024

I personally don't find this specific inconvenience too bad

A huge problem is that those specific instructions (zef install) are in use and documented all over the place. For instance, all existing rakubrew installations would be broken if bin/zef didn't exist in zef's repository.

@patrickbkr
Member Author

patrickbkr commented Dec 6, 2024 via email

@ugexe
Contributor

ugexe commented Dec 6, 2024

Do you have any ideas what to do? I am a bit stumped. It seems that whatever we do, some part of it always sucks.

The only thing that comes to mind is that maybe #393 could help solve it somehow.

The best idea I can come up with is to add some detection logic that checks if a script is valid Raku. If so, assume it had a .raku extension. And then deprecation-periodify our way out for new modules.

Could be hard. Many of my bin scripts are literally just use Foo::CLI; and nothing more. Maybe this problem specifically could be solved by some variant of #393 since we could explicitly mark which bin files are raku in META6.json. Alternatively it is currently safe to assume any script in bin/ is already raku, as the way our current bin wrappers are implemented they don't work with non-raku code (since it invokes the scripts via require /path/to/installed/bin/script).

I just now realize, that we can't just dictate a new module format. We have a large set of existing modules that we can't magically migrate to the new format. So we'll have to support the current format indefinitely, right?

If you're talking about the module format of installed modules, then that can be changed with the upgrade-repository thing I linked earlier. If you're talking about the module format of individual modules being distributed and installed via zef, then yeah, that becomes a bit harder. We nagged people to e.g. update META6.info to META6.json, but zef had already supported both options for a long time, so any version of zef that was already installed likely already supported the new name for multiple years. On the other hand, we've tried to call Build.pm deprecated for years, yet that will still have to exist indefinitely.

@patrickbkr
Member Author

Alternatively it is currently safe to assume any script in bin/ is already raku, as the way our current bin wrappers are implemented they don't work with non-raku code (since it invokes the scripts via require /path/to/installed/bin/script).

I've started on a proof of concept implementation. That implementation is currently indifferent to whether the script has the .raku extension or not, it will always create a script[.exe]? executable and a script.raku Raku file.
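
As a rough illustration of that naming scheme (not the actual proof-of-concept code), the mapping from a distribution's script name to the two generated files could look like this. The function name generated_names is hypothetical, and stripping a trailing .raku before deriving the names is an assumption on my side.

```sh
#!/bin/sh
# Hypothetical helper: print the two file names generated in the CURI's bin/
# for a distribution script "$1" on platform "$2" (windows or anything else).
generated_names() {
    base=${1%.raku}        # assumption: "foo" and "foo.raku" are treated alike
    case $2 in
        windows) printf '%s\n' "$base.exe" "$base.raku" ;;
        *)       printf '%s\n' "$base"     "$base.raku" ;;
    esac
}
```

Under this assumption, generated_names zef posix yields zef and zef.raku, and a distribution shipping both bin/foo and bin/foo.raku would map both scripts to the same pair of names, which is exactly the collision raised earlier in the thread.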

If you're talking about the module format of individual modules being distributed and installed via zef, then yeah that becomes a bit harder.

Yes, that's what I was referring to.

We nagged people to e.g. update META6.info to META6.json, but zef had already supported both options for a long time, so any version of zef that was already installed likely already supported the new name for multiple years. On the other hand, we've tried to call Build.pm deprecated for years, yet that will still have to exist indefinitely.

I guess the best we can do is to encourage people to name their scripts with a .raku extension, but do the right thing when they don't. The important point is that the files in bin/ are actually Raku files. And that's currently guaranteed. Phew.
