Add Flow.metadata attribute used in Flow.update_metadata method #679

Open
wants to merge 6 commits into
base: main

Conversation

Member

@janosh janosh commented Sep 18, 2024

Flow.update_metadata now closely mirrors Job.update_metadata and has reached feature parity with it:

def update_metadata(
    self,
    update: dict[str, Any],
    name_filter: str = None,
    function_filter: Callable = None,
    dict_mod: bool = False,
    dynamic: bool = True,
):
    """
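For readers unfamiliar with the two filter arguments, here is a minimal sketch of how they might gate an update. `ToyJob` is a hypothetical stand-in, not the jobflow class, and the matching rules (substring match for `name_filter`, identity comparison for `function_filter`) are assumptions made for illustration. `dict_mod` and `dynamic` are left out.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class ToyJob:
    """Hypothetical stand-in for a job; not the jobflow class."""

    name: str
    function: Callable
    metadata: dict[str, Any] = field(default_factory=dict)

    def update_metadata(
        self,
        update: dict[str, Any],
        name_filter: str | None = None,
        function_filter: Callable | None = None,
    ) -> None:
        # Assumption: name_filter is a substring match on the job name.
        if name_filter is not None and name_filter not in self.name:
            return
        # Assumption: function_filter must be the function the job wraps.
        if function_filter is not None and function_filter is not self.function:
            return
        self.metadata.update(update)


def relax():
    """Placeholder job function."""


def static():
    """Placeholder job function."""


relax_job = ToyJob("relax 1", relax)
static_job = ToyJob("static", static)
for toy_job in (relax_job, static_job):
    # Only the job whose name contains "relax" receives the update.
    toy_job.update_metadata({"tag": "x"}, name_filter="relax")
```

In this sketch a filter left as `None` never blocks the update, so calling `update_metadata` with no filters touches every job.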


codecov bot commented Sep 18, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 99.24%. Comparing base (a740c6c) to head (367836b).
Report is 14 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #679   +/-   ##
=======================================
  Coverage   99.23%   99.24%           
=======================================
  Files          21       21           
  Lines        1573     1583   +10     
  Branches      427      431    +4     
=======================================
+ Hits         1561     1571   +10     
  Misses         10       10           
  Partials        2        2           
Files with missing lines     Coverage Δ
src/jobflow/core/flow.py     100.00% <100.00%> (ø)
src/jobflow/core/job.py      99.15% <100.00%> (-0.01%) ⬇️

@janosh janosh requested a review from utf September 18, 2024 21:52
@janosh janosh changed the title Add Flow.metadata attribute now used in Flow.update_metadata method Add Flow.metadata attribute used in Flow.update_metadata method Sep 18, 2024
Member Author

janosh commented Sep 27, 2024

@utf i'm sure you're strapped for time but even partial feedback here would be much appreciated!

Member

@utf utf left a comment

Thanks @janosh. My only concern is about breaking the API of update_metadata between Flows and Jobs. If we add the target option to Job.update_metadata that should make this function easier to use and the implementation cleaner.

        self.metadata.update(update)

    if target in ["jobs", "both"]:
        for job in self:
Member

Here job could actually be a nested flow. So you should still iterate over these if target = "flow".
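To illustrate the point, a minimal sketch with hypothetical `ToyJob` and `ToyFlow` classes (not the jobflow ones), assuming that iterating over a flow yields its direct children, which may themselves be flows:

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any, Iterator


@dataclass
class ToyJob:
    """Hypothetical stand-in for a job."""

    name: str
    metadata: dict[str, Any] = field(default_factory=dict)


@dataclass
class ToyFlow:
    """Hypothetical stand-in for a flow that may contain jobs and flows."""

    name: str
    jobs: list[ToyJob | ToyFlow] = field(default_factory=list)
    metadata: dict[str, Any] = field(default_factory=dict)

    def __iter__(self) -> Iterator[ToyJob | ToyFlow]:
        # Iteration yields direct children only; a child can be a nested flow.
        return iter(self.jobs)


inner = ToyFlow("inner", [ToyJob("b")])
outer = ToyFlow("outer", [ToyJob("a"), inner])

# The loop variable is not always a job, despite the name `job` in the diff.
children = [type(child).__name__ for child in outer]
```

Here `children` contains both a `ToyJob` and a `ToyFlow`, so a loop that assumes every element is a job would treat `inner` incorrectly.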

Member Author

perhaps the naming could be improved. target = "flow" isn't meant to update the metadata of all nested flows (to the exclusion of jobs) but rather only the flow itself, not any of its jobs or nested flows.

how about we rename from target: Literal["flow", "jobs", "both"] = "both" to target: Literal["self", "nested", "both"] = "both"? unless you think it's important to have more control, i.e. be able to only update nested Flow or Job metadata. there might be use cases for that. in which case maybe we want target: Literal["self", "nested", "jobs", "flows", "both"] = "both"?

Member Author

to clarify this could be the doc string:

        target
            Specifies where to apply the metadata update. Options are:

            - "self": Update only the Flow's own metadata
            - "nested": Update only the metadata of Jobs/Flows within the Flow
            - "jobs": Update only the metadata of Jobs within the Flow
            - "flows": Update only the metadata of Flows within the Flow
            - "all": Update both the Flow's metadata and nested Job+Flow metadata (default)
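The options in this proposed docstring could be dispatched roughly as follows. This is only a sketch: `ToyJob`, `ToyFlow` and the `_walk` helper are hypothetical, and it assumes that "nested", "jobs" and "flows" apply recursively at any depth, which the docstring does not state.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any, Iterator, Literal

Target = Literal["self", "nested", "jobs", "flows", "all"]


@dataclass
class ToyJob:
    """Hypothetical stand-in for a job."""

    name: str
    metadata: dict[str, Any] = field(default_factory=dict)


@dataclass
class ToyFlow:
    """Hypothetical stand-in for a flow."""

    name: str
    jobs: list[ToyJob | ToyFlow] = field(default_factory=list)
    metadata: dict[str, Any] = field(default_factory=dict)

    def _walk(self) -> Iterator[ToyJob | ToyFlow]:
        # Yield every descendant, descending into nested flows (assumption).
        for child in self.jobs:
            yield child
            if isinstance(child, ToyFlow):
                yield from child._walk()

    def update_metadata(self, update: dict[str, Any], target: Target = "all") -> None:
        if target in ("self", "all"):
            self.metadata.update(update)
        if target == "self":
            return
        touch_jobs = target in ("nested", "jobs", "all")
        touch_flows = target in ("nested", "flows", "all")
        for child in self._walk():
            is_flow = isinstance(child, ToyFlow)
            if (is_flow and touch_flows) or (not is_flow and touch_jobs):
                child.metadata.update(update)


job_a, job_b = ToyJob("a"), ToyJob("b")
inner = ToyFlow("inner", [job_b])
outer = ToyFlow("outer", [job_a, inner])
outer.update_metadata({"k": 1}, target="jobs")
```

With `target="jobs"`, only `job_a` and `job_b` are updated; `outer` and `inner` keep empty metadata.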

Member

I think this is overcomplicating it. If you want to select specific flows or jobs then you can pass in a name or class filter. So no need for the extra options. The main thing is the API should be consistent between jobs/flows.

Member Author

makes sense. then the easiest thing would be to get rid of target altogether and use the "all" behavior?

Member

That's an interesting point. I think the benefit of target is that it would be a shortcut for class=Flow vs class=Job. I would be happy either having the target option or specifying the shortcut in the docstring.

Member Author

different issue but while we're on the topic of API design, what's your take on adding a callback_filter: Callable[[Flow | Job], bool]? users would pass in a function which takes the Flow or Job instance on which you invoke update_metadata (or update_config, ...) and returns True if updates should be applied. perhaps more prone to user error but also very versatile. usage example:

Flow().update_metadata(
    {"material_id": 42},
    callback_filter=lambda flow: SomeMaker in map(type, flow)
    and flow.name == "flow name"
)
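A rough sketch of how such a `callback_filter` might be implemented, using hypothetical `ToyJob` and `ToyFlow` classes rather than the jobflow ones. It assumes the callback is evaluated separately on the flow itself and on every nested job or flow; the proposal above does not settle that detail.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any, Callable, Optional

# Hypothetical alias: the callback receives a job or flow and returns a bool.
CallbackFilter = Optional[Callable[[Any], bool]]


@dataclass
class ToyJob:
    """Hypothetical stand-in for a job."""

    name: str
    metadata: dict[str, Any] = field(default_factory=dict)

    def update_metadata(
        self, update: dict[str, Any], callback_filter: CallbackFilter = None
    ) -> None:
        # Apply the update only if no callback is given or it returns True.
        if callback_filter is None or callback_filter(self):
            self.metadata.update(update)


@dataclass
class ToyFlow:
    """Hypothetical stand-in for a flow."""

    name: str
    jobs: list[ToyJob | ToyFlow] = field(default_factory=list)
    metadata: dict[str, Any] = field(default_factory=dict)

    def update_metadata(
        self, update: dict[str, Any], callback_filter: CallbackFilter = None
    ) -> None:
        if callback_filter is None or callback_filter(self):
            self.metadata.update(update)
        # Assumption: the same callback is passed down to every child.
        for child in self.jobs:
            child.update_metadata(update, callback_filter)


relax = ToyJob("relax 1")
static = ToyJob("static")
flow = ToyFlow("flow name", [relax, static])
flow.update_metadata(
    {"material_id": 42},
    callback_filter=lambda obj: isinstance(obj, ToyJob) and obj.name.startswith("relax"),
)
```

An `isinstance` check inside the callback, as in the call above, would also cover the Flow-versus-Job selection discussed earlier in this thread without a separate `target` argument.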

Member Author

@janosh janosh Oct 2, 2024

> I think the benefit of target is that it would be a shortcut for class=Flow vs class=Job. I would be happy either having the target option or specifying the shortcut in the docstring.

i don't follow. by class=Job|Flow, did you mean function_filter=Job|Flow? the job variant function_filter=Job doesn't seem to work the way you're suggesting so maybe you meant something else?

EDIT: i guess you meant class_filter on Maker.update_kwargs but that isn't implemented for Flow/Job.update_metadata

Member

Yes, I was thinking of class_filter. I think the callback_filter you proposed sounds very flexible and would be useful!

Labels
enhancement New feature or request

2 participants