app: Update Forall command to allow multiple concurrent processes #755
base: main
Conversation
For comparison, timing counting the lines of code using
Force-pushed from e120eb3 to 9592b45.
Force-pushed from 9592b45 to 06fd709.
This looks really cool, thanks for designing and testing this! I'm afraid we still have the issue of concurrent outputs though, see below.
self.banner(f'running "{args.subcommand}" in {project.name_and_path}:')
proc = await asyncio.create_subprocess_shell(args.subcommand,
                                             cwd=cwd, env=env, shell=True)
return await proc.wait()
I think you need to capture standard outputs from different processes here so they don't interleave concurrently and randomly and become really hard to read. This could even turn into a terminal disaster if they use --color ANSI codes. I think we already touched on this question in #713 and before.
You probably tested this with relatively "quiet" and easy to read output and a reasonable number of threads... can you try again after cranking it all up? This code must be prepared to handle not just the "common" cases but all cases.
You could use the usual (out, err) = proc.communicate() (https://docs.python.org/3/library/asyncio-subprocess.html). This avoids concurrent terminal output from different subprocesses, but it assumes process outputs are already line-based. So it's probably better to play it safe and use some readline() variation instead. I found a couple of examples that seem relevant:
https://kevinmccarthy.org/2016/07/25/streaming-subprocess-stdin-and-stdout-with-asyncio-in-python/
https://stackoverflow.com/questions/2804543/read-subprocess-stdout-line-by-line
The real icing on the cake would be an option that prefixes each line of the outputs with the project name! BTW: eventually, that parallelization and output capture code should be generic enough to be reused by all commands, not just forall! And especially west update, where it is ... awaited (pun intended) the most (#713 etc.).
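For reference, here is a minimal sketch (not the PR's code) of what a readline()-based capture with a project-name prefix could look like: each subprocess's output is read line by line and printed with the project name in front, and an asyncio.Semaphore caps how many run at once. The project names and demo commands are made up, and a POSIX shell is assumed for them.

```python
import asyncio

async def run_prefixed(name, command):
    # Capture stdout and merge stderr into it instead of letting the
    # subprocess write straight to the terminal.
    proc = await asyncio.create_subprocess_shell(
        command,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.STDOUT)
    while True:
        line = await proc.stdout.readline()
        if not line:
            break
        # Coroutines only switch at await points, so each print() emits a
        # whole line and output from different projects never mixes mid-line.
        print(f'{name}: {line.decode(errors="replace")}', end='')
    return await proc.wait()

async def main():
    semaphore = asyncio.Semaphore(2)  # cap concurrency, like -j 2

    async def limited(name, command):
        async with semaphore:
            return await run_prefixed(name, command)

    await asyncio.gather(limited('net-tools', 'echo one; sleep 0.1; echo two'),
                         limited('Kconfiglib', 'echo one; sleep 0.1; echo two'))

asyncio.run(main())
```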
Yes, I didn't want to dive right in here, but it would be nice to get this right from the get-go.
I was dabbling with the idea of having the output behave like ninja, where the output line is replaced with the banner and maybe some counter indicating the progress. And when the subprocess is done, print its output as is.
Not sure if this would require something like curses; I think it could be more lightweight.
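For what it's worth, a rough sketch of that idea without curses (everything here is illustrative and assumes an ANSI-capable terminal, not west's actual banner code): buffer each subprocess's output with communicate(), keep a single progress line that is rewritten in place with a carriage return plus the erase-line escape, and dump the captured output once the process finishes.

```python
import asyncio
import sys

async def run_buffered(name, command, progress):
    # Capture everything; nothing is printed while the process runs.
    proc = await asyncio.create_subprocess_shell(
        command,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.STDOUT)
    out, _ = await proc.communicate()
    progress['done'] += 1
    # "\r" returns to the start of the status line and "\x1b[K" erases it,
    # then the finished project's banner and captured output are printed.
    sys.stdout.write(f'\r\x1b[K=== {name} ===\n')
    sys.stdout.write(out.decode(errors='replace'))
    # Redraw the progress counter on a fresh status line.
    sys.stdout.write(f"[{progress['done']}/{progress['total']}] running...")
    sys.stdout.flush()
    return proc.returncode

async def main(commands):
    progress = {'done': 0, 'total': len(commands)}
    sys.stdout.write(f"[0/{progress['total']}] running...")
    sys.stdout.flush()
    await asyncio.gather(*(run_buffered(n, c, progress) for n, c in commands))
    sys.stdout.write('\r\x1b[Kall done\n')

asyncio.run(main([('net-tools', 'echo hello'),
                  ('Kconfiglib', 'echo world')]))
```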
I was dabbling with the idea of having the output behave like ninja where the output line is replaced with the banner and maybe some counter indicating the progress. And when the subprocess is done, print its output as is
That would be awesome, but I think just 1) making sure the output is readable and 2) having all commands use the same output "framework" would already be a major milestone and a great stepping stone towards something better. And it would give a lot of users what they have been waiting for: concurrency at last.
Important process output questions to address.
src/west/app/project.py
continue

async def run_for_project(self, project, args, sem):
    async with sem:
I think sem is really too short. Also, it's used only twice so that does not save much.
Suggested change:
-    async with sem:
+    async with semaphore:
cmd('update net-tools Kconfiglib')

# print order is no longer guaranteed when there are multiple projects
Then the banners don't make sense anymore and should not be printed when -j > 1.
It's more complicated...
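On the "print order is no longer guaranteed" point, the test probably has to become order-insensitive. A hypothetical sketch, not the repo's actual test code: cmd() stands in for the helper the test suite already uses, and the echoed text is made up.

```python
def test_forall_multiple_jobs():
    # With -j > 1 the per-project lines may appear in any order, so compare
    # against a set of lines instead of the exact output sequence.
    out = cmd("forall -j 2 -c 'echo hi from $WEST_PROJECT_NAME'")
    lines = set(out.splitlines())
    assert 'hi from net-tools' in lines
    assert 'hi from Kconfiglib' in lines
```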
@@ -1670,16 +1671,15 @@ def do_add_parser(self, parser_adder):
        parser.add_argument('projects', metavar='PROJECT', nargs='*',
                            help='''projects (by name or path) to operate on;
                            defaults to active cloned projects''')
        parser.add_argument('-j', '--jobs', nargs='?', const=-1,
Suggested change:
-        parser.add_argument('-j', '--jobs', nargs='?', const=-1,
+        # Default to 1 when -j is not given because there is no way to know
+        # whether the user's commands can safely be run at the same time.
+        parser.add_argument('-j', '--jobs', nargs='?', const=-1,
Eventually, west grep, west update and others could default to cpu_count() if everything goes well, but I think forall should always default to 1.
(Such a comment also helps a bit with the peculiar default+const argparse idiom.)
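For anyone unfamiliar with that idiom, here is a small standalone sketch of how the default/const pair is presumably meant to behave (the -1 sentinel resolving to cpu_count() is an assumption about the PR, not something argparse does by itself):

```python
import argparse
import os

parser = argparse.ArgumentParser()
# No -j at all -> default=1; a bare -j -> const=-1 (sentinel for "all CPUs");
# -j N -> N. Defaulting to 1 keeps user commands serialized unless asked.
parser.add_argument('-j', '--jobs', nargs='?', const=-1, default=1, type=int)

for argv in ([], ['-j'], ['-j', '4']):
    jobs = parser.parse_args(argv).jobs
    if jobs == -1:              # bare -j: one job per CPU
        jobs = os.cpu_count() or 1
    print(argv, '->', jobs)     # [] -> 1, ['-j'] -> cpu_count(), ['-j','4'] -> 4
```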
Allow passing a custom end to banner methods. This is useful for example to print a carriage return. Signed-off-by: Pieter De Gendt <[email protected]>
Demonstrate asynchronous behavior for the Forall command and add an argument to select the number of jobs. Signed-off-by: Pieter De Gendt <[email protected]>
Add test cases for running the forall command with multiple processes. Signed-off-by: Pieter De Gendt <[email protected]>
You probably tested this with relatively "quiet" and easy to read output and a reasonable number of threads... can you try again after cranking it all up? This code must be prepared to handle not just the "common" cases but all cases.
OK, I just had a test idea and this PR was pretty easy to "break" after all. The following one-liner prints readable output with -j 1 and totally jumbled output with -j > 1:
west forall -j 1 -c 'i=8; while test $((i--)) -ge 0;
do printf " $WEST_PROJECT_NAME"; sleep 0.1; done; printf "\n"'
It would be great to "upgrade" a test like this and make it part of the actual test suite. I don't know how we could make it portable to Windows... by converting it to Python maybe? Or maybe we don't need to? I think it would be OK to have some tests skipped on some platforms; there would still be value in that.
Don't get me wrong: we DO need some tests that exercise the terminal on Windows. Windows terminal issues were the reason for the 7223431 revert and that whole saga. But maybe not all tests need to run on all platforms.
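One possible Python port of that stress one-liner (just a guess at a shape, not an agreed-on design; the file name and the way west would invoke it are assumptions):

```python
import os
import sys
import time

# Emit the project name in small chunks with pauses, so mid-line
# interleaving shows up immediately when more than one job runs at a time.
name = os.environ.get('WEST_PROJECT_NAME', '?')
for _ in range(8):
    sys.stdout.write(f' {name}')
    sys.stdout.flush()
    time.sleep(0.1)
sys.stdout.write('\n')
```

Saved as, say, stress.py, it could then be run on any platform with something like west forall -j 4 -c "python stress.py", or folded into the pytest suite directly.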
Force-pushed from 2634eb5 to 1ef4af1.
The updated version is better, as all printing is done from a single thread. I tried to overwrite the "running" line, but it doesn't clear it.
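Just a guess at what is going on there (it may well be something else entirely, and this is a standalone demo rather than west's banner code): a bare carriage return only moves the cursor, it does not erase the old text, so a shorter replacement leaves the tail of the previous line visible.

```python
import sys
import time

sys.stdout.write('running "git status" in net-tools:')
sys.stdout.flush()
time.sleep(0.5)
# "\r" alone would leave leftover characters from the longer line above;
# "\x1b[K" erases to end of line on ANSI-capable terminals (padding with
# spaces is the lowest-tech alternative).
sys.stdout.write('\r\x1b[Kdone: net-tools\n')
```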
Which shell do you use? Doesn't work properly with ...
The default/standard: bash. Should be compatible with all Bourne shells: ash, ksh, etc. Strange.
In ... What does not work? I just checked and ... A must-have for any shell script: https://www.shellcheck.net/
You just made me realize something important...
"A shell" is vague. On Linux this is apparently the default shell. Why not ..., but this must be more explicit. Can you please update this help text? Also, is this a login shell or not? It matters: https://superuser.com/questions/183870/difference-between-bashrc-and-bash-profile/
Similar problem with Windows: will ...?
In the longer term, there should be a new ...
So the reason this is "vague" is because 1) it's a mess and 2) accordingly, the Python documentation is vague too - intentionally! (python/cpython#114539) So I think we only need a warning that defers to the Python documentation here. But we do need that warning. Something like: "it's a wildly non-portable and insecure mess, check the Python documentation and don't use this in automation". Something like that.
Demonstrate asynchronous behavior for the Forall command and add an argument to select the number of jobs. The same idea can be applied to the Update command.