Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add sanitization option to host outputs #191

Open
wants to merge 7 commits into
base: develop
Choose a base branch
from
Open

Conversation

gpetretto
Copy link
Contributor

An issue was raised concerning the execution of commands through the Host objects. If some text is added to the output of a remote host when executing a command (e.g. the disk quota) the qtk parsers fail to extract the output of the commands.
This PR adds an option to sanitize the output by prepending and appending and echo of a string to the command execution, both in the stdout and stderr. The outputs are then filtered to only considere the text enclosed inside the occurrences of string.

At the moment there is only an integration test checking that with the sanitization option enabled the standard execution still works. @ml-evs is there any easy way of adding a test that will actually check that if there are additional strings returned the command will still work fine? Otherwise I will just either mock the return of the function or test the sanitize method of BaseHost individually.

@codecov-commenter
Copy link

codecov-commenter commented Sep 25, 2024

Codecov Report

Attention: Patch coverage is 78.26087% with 10 lines in your changes missing coverage. Please review.

Project coverage is 48.01%. Comparing base (29eb2dc) to head (6f1a24a).

Files with missing lines Patch % Lines
src/jobflow_remote/remote/host/base.py 71.87% 6 Missing and 3 partials ⚠️
src/jobflow_remote/jobs/data.py 0.00% 1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           develop     #191      +/-   ##
===========================================
+ Coverage    47.17%   48.01%   +0.83%     
===========================================
  Files           44       44              
  Lines         5314     5353      +39     
  Branches      1164     1171       +7     
===========================================
+ Hits          2507     2570      +63     
+ Misses        2549     2517      -32     
- Partials       258      266       +8     
Files with missing lines Coverage Δ
src/jobflow_remote/config/base.py 82.72% <100.00%> (+0.07%) ⬆️
src/jobflow_remote/remote/host/local.py 62.16% <100.00%> (+2.16%) ⬆️
src/jobflow_remote/remote/host/remote.py 60.71% <100.00%> (+0.81%) ⬆️
src/jobflow_remote/jobs/data.py 67.55% <0.00%> (ø)
src/jobflow_remote/remote/host/base.py 71.42% <71.87%> (+2.57%) ⬆️

... and 2 files with indirect coverage changes

@ml-evs
Copy link
Member

ml-evs commented Sep 25, 2024

At the moment there is only an integration test checking that with the sanitization option enabled the standard execution still works. @ml-evs is there any easy way of adding a test that will actually check that if there are additional strings returned the command will still work fine? Otherwise I will just either mock the return of the function or test the sanitize method of BaseHost individually.

Probably both approaches would make sense? It would be straightforward to add some additional quota-like output on the integration tests (i.e. properly replicating the setup on clusters where it's done at the global bashrc level) but for debugging purposes a mocked unit test would be nice too -- the integration test with bashrc will have to wait until I've finished the other PR (which is close).

Copy link
Member

@davidwaroquiers davidwaroquiers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All good to me!
Just one thing about what we discussed, to probably add a specific check test in jf project check that tells the user what to try (i.e. setting sanitize_command to True) in case there is such spurious echos.

tests/integration/test_slurm.py Show resolved Hide resolved
@gpetretto
Copy link
Contributor Author

I changed the way the prepend and append strings are generated, in order to be more brodly compatible with different shells. For future reference, we may also consider executing the commands in the local worker through the bash -c command.

I will wait for confirmation that this still works fine in a real case before mergin.

Concerning the test, if in the meanwhile the PR with the update of the integration tests is merged we can consider adding specific tests for this feature.

@davidwaroquiers
Copy link
Member

I changed the way the prepend and append strings are generated, in order to be more brodly compatible with different shells. For future reference, we may also consider executing the commands in the local worker through the bash -c command.

I will wait for confirmation that this still works fine in a real case before mergin.

Concerning the test, if in the meanwhile the PR with the update of the integration tests is merged we can consider adding specific tests for this feature.

Ok great, I resolved the comment on a possible specific integration test for this (I'm not even sure it is really needed, but let's see anyway after it has been confirmed and the other PR merged).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants