Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cuesubmit jobs from config file #1284

Open
wants to merge 89 commits into
base: master
Choose a base branch
from

Conversation

KernAttila
Copy link
Contributor

@KernAttila KernAttila commented Mar 29, 2023

Link the Issue(s) this Pull Request is related to.
#1275

Summarize your change.

  • We are now able to declare jobs from the configuration file (see cuesubmit/cuesubmit_config.example.yaml)
  • We can nest configuration files and use env variables.
  • The flag order is respected
  • Widgets are constructed according to each flag type:
  • The yaml syntax allows solo flags, pycue tokens, mandatory flags, no flags arguments and browsable fields.
  • Invalid text fields are red from the start
  • Added some unittests, would love some feedback on those.
  • Each job type can have limits and services preset.

Note

Question

  • Should we deprecate (or remove) the hardcoded commands ?
  • Are you ok with the commands syntax in the example yaml file ?

Screenshots
update_Cuesubmit

@KernAttila KernAttila force-pushed the cuesubmit-jobs-from-config-file branch from 9e85fe0 to 6cfd785 Compare May 29, 2023 09:16
@DiegoTavares
Copy link
Collaborator

This branch is failing on CLA for some reason. @KernAttila I know it has been a while, but can you try following the CLA process again?

@DiegoTavares
Copy link
Collaborator

It looks like some errors emerged after merging master into this branch. Unit tests are failing with:

AttributeError: module 'cuesubmit.Constants' has no attribute 'RENDER_CMDS'

@KernAttila
Copy link
Contributor Author

@DiegoTavares no worries, I'm currently fixing those. It's a bit hard to come back to this code after so long ^^ but we'll get there !
I will let you know when things are ready.

@KernAttila
Copy link
Contributor Author

I tested a lot of scenarios, everything seems up to expectation with this feature.
There's an additional option in the config file: being able to have services and limits presets for each job type.
There is also more tooltips in the UI to help understand each widget.

IMPORTANT NOTE: I added the "Override Cores" feature in this branch as well, it goes with #1296 (rqd) and #1313 (cuebot)
I could move it to another branch if you prefer.

Comment on lines 322 to 330
cores = None
if layer.get_arg("overrideCores"):
logger.info("%s is set to override service cores.", layer.get_name())
if layer.is_arg_set("cores") and layer.is_arg_set("threads"):
logger.warning("%s has both cores and threads. Use cores.", layer.get_name())
sub_element(spec_layer, "cores", "%0.1f" % (layer.get_arg("cores")))
elif layer.get_arg("threads"):
sub_element(spec_layer, "cores", "%0.1f" % (layer.get_arg("threads")))
cores = "%0.1f" % float(layer.get_arg("cores") or layer.get_arg("threads", 0))
sub_element(spec_layer, "cores", cores)
else:
logger.warning("%s will use service cores.", layer.get_name())
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking back at this code, it introduces a mandatory argument "overrideCores" that will disrupt existing implementations relying only on pycue.
Another approach could be to test if "cores" is not None, meaning we explicitly override the job's cores value over the service one, thus making current codebases retro-compatible with this feature.
If "cores" is None, then we let the selected service handle that value.
What do you think @DiegoTavares @bcipriano @lithorus ?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, this change would break our pipeline today. I'd rather have it implemented as you suggested to make it retro-compatible.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change is done and tested.
overrideCores stays in cuesubmit.

@DiegoTavares
Copy link
Collaborator

I tested a lot of scenarios, everything seems up to expectation with this feature. There's an additional option in the config file: being able to have services and limits presets for each job type. There is also more tooltips in the UI to help understand each widget.

IMPORTANT NOTE: I added the "Override Cores" feature in this branch as well, it goes with #1296 (rqd) and #1313 (cuebot) I could move it to another branch if you prefer.

I don't see a problem in keeping #1296 and #1313 on this PR, as long as they get approved and merged beforehand

Copy link
Collaborator

@DiegoTavares DiegoTavares left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aside from the suggested change, this MR is ready to be merged.

Comment on lines 322 to 330
cores = None
if layer.get_arg("overrideCores"):
logger.info("%s is set to override service cores.", layer.get_name())
if layer.is_arg_set("cores") and layer.is_arg_set("threads"):
logger.warning("%s has both cores and threads. Use cores.", layer.get_name())
sub_element(spec_layer, "cores", "%0.1f" % (layer.get_arg("cores")))
elif layer.get_arg("threads"):
sub_element(spec_layer, "cores", "%0.1f" % (layer.get_arg("threads")))
cores = "%0.1f" % float(layer.get_arg("cores") or layer.get_arg("threads", 0))
sub_element(spec_layer, "cores", cores)
else:
logger.warning("%s will use service cores.", layer.get_name())
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, this change would break our pipeline today. I'd rather have it implemented as you suggested to make it retro-compatible.

DiegoTavares pushed a commit that referenced this pull request Sep 25, 2024
**Link the Issue(s) this Pull Request is related to.**
Fixes #1297

**Summarize your change.**
As in many render engines, we should be able to set a negative core
requirement.
  minCores=8 > reserve 8 cores
  minCores=0 > reserve all cores
  minCores=-2 > reserve all cores minus 2
  
This PR addresses this feature by handling negative core requests.
Cuebot will try to match this number against the number of cores on each
host.
The frame will be booked only if all cores are available in this
scenario.
If the host is busy (even slightly), the frame is **not** booked, to
avoid filling the remaining cores.

**Testing**
I would need some guidance to create proper tests for cuebot.

**Screenshot**

![negative_cores](https://github.com/AcademySoftwareFoundation/OpenCue/assets/5556461/d9c4400c-824a-40cc-9ba9-2f76a3fd8ceb)
Update:
There is now a "ALL" text for zero cores, or "ALL (-2)" for negative
cores reservation.

![core_reservation](https://github.com/user-attachments/assets/88802b15-3ccd-4cb5-90b7-58e532523ae6)
(cuesubmit feature in another PR #1284)

---------

Signed-off-by: Kern Attila GERMAIN <[email protected]>
…g if "cores" is not None. If it has an explicit value (like in existing implementations), this means we want to override the value, which is the default behavior.

Otherwise, we let the server assign the value of selected service.
… the args are set, because checking their value would be False for 0 (all cores)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants