Add sanitization option to host outputs #191

gpetretto · 2024-09-25T08:15:20Z

An issue was raised concerning the execution of commands through the Host objects. If some text is added to the output of a remote host when executing a command (e.g. the disk quota) the qtk parsers fail to extract the output of the commands.
This PR adds an option to sanitize the output by prepending and appending and echo of a string to the command execution, both in the stdout and stderr. The outputs are then filtered to only considere the text enclosed inside the occurrences of string.

At the moment there is only an integration test checking that with the sanitization option enabled the standard execution still works. @ml-evs is there any easy way of adding a test that will actually check that if there are additional strings returned the command will still work fine? Otherwise I will just either mock the return of the function or test the sanitize method of BaseHost individually.

codecov-commenter · 2024-09-25T08:24:48Z

Codecov Report

Attention: Patch coverage is 78.26087% with 10 lines in your changes missing coverage. Please review.

Project coverage is 48.01%. Comparing base (29eb2dc) to head (6f1a24a).

Files with missing lines	Patch %	Lines
src/jobflow_remote/remote/host/base.py	71.87%	6 Missing and 3 partials ⚠️
src/jobflow_remote/jobs/data.py	0.00%	1 Missing ⚠️

Additional details and impacted files

@@             Coverage Diff             @@
##           develop     #191      +/-   ##
===========================================
+ Coverage    47.17%   48.01%   +0.83%     
===========================================
  Files           44       44              
  Lines         5314     5353      +39     
  Branches      1164     1171       +7     
===========================================
+ Hits          2507     2570      +63     
+ Misses        2549     2517      -32     
- Partials       258      266       +8

Files with missing lines	Coverage Δ
src/jobflow_remote/config/base.py	`82.72% <100.00%> (+0.07%)`	⬆️
src/jobflow_remote/remote/host/local.py	`62.16% <100.00%> (+2.16%)`	⬆️
src/jobflow_remote/remote/host/remote.py	`60.71% <100.00%> (+0.81%)`	⬆️
src/jobflow_remote/jobs/data.py	`67.55% <0.00%> (ø)`
src/jobflow_remote/remote/host/base.py	`71.42% <71.87%> (+2.57%)`	⬆️

... and 2 files with indirect coverage changes

ml-evs · 2024-09-25T09:19:25Z

At the moment there is only an integration test checking that with the sanitization option enabled the standard execution still works. @ml-evs is there any easy way of adding a test that will actually check that if there are additional strings returned the command will still work fine? Otherwise I will just either mock the return of the function or test the sanitize method of BaseHost individually.

Probably both approaches would make sense? It would be straightforward to add some additional quota-like output on the integration tests (i.e. properly replicating the setup on clusters where it's done at the global bashrc level) but for debugging purposes a mocked unit test would be nice too -- the integration test with bashrc will have to wait until I've finished the other PR (which is close).

davidwaroquiers

All good to me!
Just one thing about what we discussed, to probably add a specific check test in jf project check that tells the user what to try (i.e. setting sanitize_command to True) in case there is such spurious echos.

tests/integration/test_slurm.py

gpetretto · 2024-09-30T11:20:24Z

I changed the way the prepend and append strings are generated, in order to be more brodly compatible with different shells. For future reference, we may also consider executing the commands in the local worker through the bash -c command.

I will wait for confirmation that this still works fine in a real case before mergin.

Concerning the test, if in the meanwhile the PR with the update of the integration tests is merged we can consider adding specific tests for this feature.

davidwaroquiers · 2024-09-30T11:30:22Z

I changed the way the prepend and append strings are generated, in order to be more brodly compatible with different shells. For future reference, we may also consider executing the commands in the local worker through the bash -c command.

I will wait for confirmation that this still works fine in a real case before mergin.

Concerning the test, if in the meanwhile the PR with the update of the integration tests is merged we can consider adding specific tests for this feature.

Ok great, I resolved the comment on a possible specific integration test for this (I'm not even sure it is really needed, but let's see anyway after it has been confirmed and the other PR merged).

gpetretto force-pushed the gp/sanitize branch from a588834 to 991a17d Compare September 27, 2024 08:20

davidwaroquiers approved these changes Sep 27, 2024

View reviewed changes

tests/integration/test_slurm.py Show resolved Hide resolved

gpetretto added 5 commits September 30, 2024 10:16

add sanitization option to host outputs

de81d3f

add mock tests

5fcda60

fix sanitization

586e7cf

fix test

a72ddd3

update sanitization to fit more cases

ef7e3a5

gpetretto force-pushed the gp/sanitize branch from 991a17d to ef7e3a5 Compare September 30, 2024 09:49

fix tests

3e9f4f8

fix estimated run time

6f1a24a

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add sanitization option to host outputs #191

Add sanitization option to host outputs #191

gpetretto commented Sep 25, 2024

codecov-commenter commented Sep 25, 2024 •

edited

Loading

ml-evs commented Sep 25, 2024

davidwaroquiers left a comment

gpetretto commented Sep 30, 2024

davidwaroquiers commented Sep 30, 2024

Add sanitization option to host outputs #191

Are you sure you want to change the base?

Add sanitization option to host outputs #191

Conversation

gpetretto commented Sep 25, 2024

codecov-commenter commented Sep 25, 2024 • edited Loading

Codecov Report

ml-evs commented Sep 25, 2024

davidwaroquiers left a comment

Choose a reason for hiding this comment

gpetretto commented Sep 30, 2024

davidwaroquiers commented Sep 30, 2024

codecov-commenter commented Sep 25, 2024 •

edited

Loading