Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add ability to sanitize job name in PBSPro and Grid Engine adapter #172

Closed
ericfranz opened this issue Jan 2, 2020 · 14 comments
Closed
Assignees
Milestone

Comments

@ericfranz
Copy link
Contributor

In some deployments of Grid Engine, special characters such as slash are not allowed in job names. This causes problems when submitting interactive apps, where we use slashes in the job name.

See OSC/ondemand#341 and in particular OSC/ondemand#341 (comment).

Suggested solution OSC/ondemand#341 (comment)

@MorganRodgers
Copy link
Contributor

I disagree that ood_core is the correct place for this. Changes to ood_core would change every job name submitted through OnDemand (assuming the feature flag is set) which would change behavior between using the raw tools (qsub) and OnDemand. I do not think that we want to create that confusion.

Batch Connect applications on the other hand have always had their job names produced programmatically, so there is no danger of creating user-confusion. I do agree that we should still add this as a feature flag so that sites who do not need this change do not experience it.

@ericfranz
Copy link
Contributor Author

We don't currently support the user setting the name of the job. In Job Composer, the "Name" is not sent as an argument. If the user sets the job name in the job script, such as https://github.com/OSC/osc-ood-config/blob/c2571a8b2c1d73c7817d6fa9b97916e6946c9ec1/ondemand.osc.edu/apps/myjobs/templates/Basic_AMBER_Job_Owens/basic_amber.sh#L1, that would be used by the Job Composer, bypassing the suggested change to ood_core.

In the future if we start making it easy for users to change the actual job name through the interface, is your concern they use illegal characters and then we allow that to work because we sanitize that string, where as if they use that directly using qsub it wouldn't work the same way?

@ericfranz
Copy link
Contributor Author

FWIW @viktoriaas indicated desire for this replacement to occur everywhere in OnDemand.

@MorganRodgers
Copy link
Contributor

MorganRodgers commented Jan 3, 2020

In the future if we start making it easy for users to change the actual job name through the interface, is your concern they use illegal characters and then we allow that to work because we sanitize that string, where as if they use that directly using qsub it wouldn't work the same way?

Yes, but also the proposed mutations to the job name are overkill. Imagining a future where OnDemand has a more capable Job Composer I can imagine users who start a job with a name of MyFancyJob-01 will be confused when their job name is converted by OnDemand to myfancyjob_01.

@MorganRodgers
Copy link
Contributor

FWIW @viktoriaas indicated desire for this replacement to occur everywhere in OnDemand.

Is there anywhere else in OnDemand where invalid job names may be constructed?

@johrstrom
Copy link
Contributor

Instead why not just stop using slashes in names? Yes it's a burden on us to be compliant with multiple clusters, but it gives uses a relatively straightforward experience and admins don't run into gotcha's with feature flags that are actually necessary.

@ericfranz
Copy link
Contributor Author

Instead why not just stop using slashes in names?

It is a change in behavior/interface. The job name is the only mechanism we currently have to flag a job as coming from a specific app in OnDemand, besides inspecting the submit host and script contents itself. SciApps may be using it now for metrics, we have used it in the past, other sites might be using it though we don't know about that.

It is the simplest solution though...

@MorganRodgers MorganRodgers changed the title Add ability to sanitize job name in Grid Engine adapter Add ability to sanitize job name in PBSPro and Grid Engine adapter Jan 3, 2020
@MorganRodgers
Copy link
Contributor

After IRL discussion it was decided to go ahead with this as an option to give sites a lever to throw if they notice that they are having problems.

@MorganRodgers
Copy link
Contributor

The Grid Engine job name spec circa 2007 from http://gridscheduler.sourceforge.net/htmlman/htmlman1/sge_types.html?pathrev=V62u5_TAG:

  name
     The name may be any arbitrary alphanumeric ASCII string, but
     may  not contain  "\n", "\t", "\r", "/", ":", "@", "\", "*",
     or "?".

@MorganRodgers
Copy link
Contributor

MorganRodgers commented Jan 3, 2020

PBSPro's job name spec from https://www.pbsworks.com/pdfs/PBSUserGuide18.2.pdf

The job name can be up to 236 characters in length, and must consist of printable, non-whitespace characters.
The first character must be alphabetic, numeric, hyphen, underscore, or plus sign.

@MorganRodgers
Copy link
Contributor

Regex for legal GE job names:
^((?![\n\t\r\/:@\\*])[[:ascii:]])*$

PBSPro:
^[a-zA-Z0-9\-_+]((?!\s).){0,235}$ or ^[a-zA-Z0-9\-_+]((?!\s)[[:print:]]){0,235}$

using negative look ahead to match all ASCII except for a character blacklist.

@ericfranz ericfranz added this to the OOD1.7 milestone Jan 6, 2020
@ericfranz
Copy link
Contributor Author

An Adapter#sanitize_job_name(job_name) method could be added whose default implementation just returns the job name in question. Minimally, that could be used in places where we submit jobs to first sanitize the job name.

@viktoriaas
Copy link

@ericfranz
I am okay with any kind of sanitizing the job name as long as it works. In my opinion, the best would be if we allowed users to create their own job names and OnDemand would just sanitize them as @MorganRodgers proposed.
It is always possible to leave a small comment next to the text box stating, that the string should e.g. "Be at least 7 characters long, starting with small letter, with no '-', '', '/', ... ". I think that users are capable of creating such string and it gives them freedom to mark their jobs. Maybe they are submitting jobs that are somehow connected and they would like to name them as "proteinCalculations1", "proteinCalculations2", "proteinCalculations3" and onDemand just doesn't provide this functionality and names them with some system generated names. It is hard to orientate in these names and also hard to recognize which parts of experiments passed, which failed and so on.
I think that when fixing the problem, it should be fixed in a way that is more efficient although maybe requires more effort. In the end it might be much more better.

@ericfranz
Copy link
Contributor Author

We opt for a simpler solution for now in #183. For sites that have a problem with this they can set OOD_JOB_NAME_ILLEGAL_CHARS env var to / or a set of characters and OnDemand will sanitize those when creating jobs for interactive apps.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants