Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optional Flag to help Debugging #527

Merged
merged 3 commits into from
Jan 22, 2024
Merged

Conversation

jmmshn
Copy link
Contributor

@jmmshn jmmshn commented Jan 11, 2024

Optional Flag to help Debugging

Currently, whenever any run_locally call fails.
The stack trace is produced at the end and we only have access to the error via a logger even

if you set RAISE_IMMEDIATELY: true, the stack trace will be provided where the excution actually failed.
This allows you to move around better within a debugger.

@utf
Copy link
Member

utf commented Jan 11, 2024

Thanks @jmmshn, this is a useful addition.

I think this flag could be better as an argument to run_locally rather than a general setting, since it is specifically for that function and doesn't impact other managers. Although if there is a particular reason to have it is a general setting, let me know.

@jmmshn
Copy link
Contributor Author

jmmshn commented Jan 11, 2024

The flag is help with situations where the excution is already serialized like in a fireworks wf.
Finding and modifying the exact run_locally command is not always trivial.
So I think the trade off is:

  • argument: easier access to the exact call you want
  • flag: don't have to find and change that exact call, which can be difficult/annoying, this is especially true when debugging things already on fw

In my usages cases the second problem seems more annoying since I'm pretty much dealing with a single run_locally call at any given time so being about to target is not super important.

@utf
Copy link
Member

utf commented Jan 11, 2024

Hi @jmmshn, I'm not sure I follow. What do you mean by "finding the exact run_locally command"? Running jobflow with fireworks never calls run_locally it calls the Job.run() function. This change would only impact workflows that are run via a script with run_locally at the end, hence there should only be one call to that function per workflow.

I guess one potential benefit of the setting vs argument, is that you could change the setting once a job is already submitted and in the queue (but not running).

@jmmshn
Copy link
Contributor Author

jmmshn commented Jan 11, 2024

Hi @jmmshn, I'm not sure I follow. What do you mean by "finding the exact run_locally command"? Running jobflow with fireworks never calls run_locally it calls the Job.run() function. This change would only impact workflows that are run via a script with run_locally at the end, hence there should only be one call to that function per workflow.

OK, I missed that, you are correct, this will not influence the FW stuff only run_locally for now. I thought the FW manager calls the run_locally command but that's not true.

I guess one potential benefit of the setting vs argument, is that you could change the setting once a job is already submitted and in the queue (but not running).

Yeah, I think w/e manager we have in the future there might be situations where you add many jobs to a queue and one fails, it will be useful to just rerun the failed job with the flag turned on in some interactive environment and start exploring the stack trace. So even if this only affects the run_locally command for now this can be added to other ways of executing the wf's in the future depending on who you want them to deal with exceptions. I'm OK with either one so I'll leave the decision up to you.

Copy link

codecov bot commented Jan 19, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (8ae6ec9) 99.86% compared to head (a76586a) 99.86%.
Report is 3 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #527   +/-   ##
=======================================
  Coverage   99.86%   99.86%           
=======================================
  Files          20       20           
  Lines        1512     1514    +2     
  Branches      416      417    +1     
=======================================
+ Hits         1510     1512    +2     
  Misses          2        2           
Files Coverage Δ
src/jobflow/managers/local.py 100.00% <100.00%> (ø)

@jmmshn
Copy link
Contributor Author

jmmshn commented Jan 19, 2024

OK, I thought about this some more. I think for harder debugging stuff people are gonna be touching the code a lot anyway so it makes more sense to just have it be an argument.

@utf
Copy link
Member

utf commented Jan 22, 2024

Thanks @jmmshn

@utf utf merged commit 9bd99a8 into materialsproject:main Jan 22, 2024
9 checks passed
@utf utf added the enhancement New feature or request label Jan 25, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants