Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

jf flow info and other cli updates #23

Merged
merged 3 commits into from
Aug 28, 2023
Merged

jf flow info and other cli updates #23

merged 3 commits into from
Aug 28, 2023

Conversation

gpetretto
Copy link
Contributor

  • new CLI command: jf flow info ID. Since there are few flow related data, aside from those belonging to the single jobs, the output is mainly a table with the jobs belonging to the flow.
  • option to search jobs based on flow_id, name and metadata
  • option to search flow based on name

Copy link
Member

@davidwaroquiers davidwaroquiers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, thanks!
Just one or two small things. Also one possible bug but not directly related to the flow info itself, but I did not try (yet).

@@ -53,7 +54,7 @@ def get_job_data(
load_output: bool = False,
):
fw, remote_run = self.rlpad.get_fw_remote_run_from_id(
job_id=job_id, fw_id=db_id
job_id=job_id, fw_id=db_id, job_index=job_index
)
job = fw.tasks[0].get("job")
state = JobState.from_states(fw.state, remote_run.state if remote_run else None)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't it be JobState.COMPLETED below in the load_output condition ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good point. I will fix it

@@ -96,6 +102,9 @@ def _build_query_fw(
or_list.append({FW_UUID_PATH: job_id, FW_INDEX_PATH: job_index})
query["$or"] = or_list

if flow_ids:
query[f"{FW_JOB_PATH}.hosts"] = {"$in": flow_ids}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know currently there is no way to look at inner flows, but correct me if I'm wrong, this would allow to query the jobs of an inner Flow right ? (well of course Fireworks will disappear so if it can be done in the future it would/should be with the new queue)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it will allow to search also based on inner flows. But it will not really disappear. This is the hosts key of the Job object, so it should still be retained even in the future implementation.

@@ -115,6 +124,15 @@ def _build_query_fw(
if locked:
query[f"{REMOTE_DOC_PATH}.{MongoLock.LOCK_KEY}"] = {"$exists": True}

if name:
query["name"] = {"$regex": name}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this will allow something like --name SiO2 to look for all jobs (or flows in the other method) which contains SiO2 right ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

duh I wanted to write --name *SiO2* but the stars make it italic

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually it works even with --name SiO2. I was expecting not, but I tested and it works.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should make it "strict" by default?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How to switch from one case to the other? Based on the content? I would avoid adding one more option

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitely not à new option. Maybe we can always add the start (^) and end ($) regex characters? What do you think?

Copy link
Contributor Author

@gpetretto gpetretto Aug 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mean so that if not passing .*SiO2.* it will match exactly? The downside is that it is always doing a regex search, which seems to be worse for performance. So the advantage would only be to be able to avoid some cases where you want to search for "relax 1 SiO2" (quotations required with multiple words) and not having some jobs named tight relax 1 SiO2 in the results as well.
I expect that most of the time one would probably prefer to write SiO2 or relax and live with the fact of getting some more results, rather than using .*SiO2.* but being able to narrow down the job a lot.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For reference, fnmatch.translate(name) is used to switch between exact match and matches like *SiO2*

flow_db_id_arg = Annotated[
str,
typer.Argument(
help="The ID of the flow. Can the db id (i.e. an integer) or a string (i.e. the uuid)",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
help="The ID of the flow. Can the db id (i.e. an integer) or a string (i.e. the uuid)",
help="The ID of the flow. Can be the db id (i.e. an integer) or a string (i.e. the uuid)",

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, it can be the id of the flow or the id of one of the jobs of the flow (if the below job_flow_id_flag is set to True). Should this be specified in the help message here, that both options are possible, depending on the other flag ? There we might consider thinking also in the future how/what this "could" work or do. If you pass a flow id (be it the "main" or an inner flow), you get that specific flow. But if we pass a job id, which flow should it return ? The main one ? The one just above ? Could be chosen ? Could be chosen for "intermediates" ? How ? Last one with "intermediates" seems much more complex and I don't think it would be very useful actually but main or the "just above" one could be interesting. Anyway this is for the future, I just wanted to point it out. Nothing to be done.

@gpetretto gpetretto merged commit c48634b into develop Aug 28, 2023
0 of 6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants