WIP: Flow decorator #547

davidwaroquiers · 2024-02-19T14:10:30Z

Summary

This is a WIP PR trying to develop the flow decorator (see idea #416, also linked to #450). Seems like we could have another option to do what we want. Currently in a (very) proof-of-concept phase (previous attempt with abstract syntax trees failed).

Currently works for "simple" flows (i.e. flows of jobs), and only if the script creates only one flow. I think it should be generalizable, but before spending more time, I'd like to get some feedback from @utf, @Andrew-S-Rosen, @janosh, and maybe others. Basically, it works by adding a reference to the created jobs and flows into a "global" variable (I used the singleton from monty, not a standard python global variable, not exactly sure if it is needed and/or if it helps and/or what it adds in this specific case).

Using what @Andrew-S-Rosen did for having something more user friendly (see #450).

Currently breaks @janosh magic methods.

Another addition should be that the flow output is stored in the database somehow ? (some things to be made sure here though, we don't want to duplicate outputs both in jobs and flows).

Example in test_flows.py:test_flow_decorator

TODO (if any)

If this is a work-in-progress, write something about what else needs to be done.

Generalize to complex flows
Flow outputs in the results database ?
Tests

Checklist

Work-in-progress pull requests are encouraged, but please put [WIP] in the pull request
title.

Before a pull request can be merged, the following items must be checked:

Code is in the standard Python style.
The easiest way to handle this is to run the following in the correct sequence on
your local machine. Start with running black on your new code. This will
automatically reformat your code to PEP8 conventions and removes most issues. Then run
pycodestyle, followed by flake8.
Docstrings have been added in theNumpy docstring format.
Run pydocstyle on your code.
Type annotations are highly encouraged. Run mypy to
type check your code.
Tests have been added for any new functionality or bug fixes.
All linting and tests pass.

Note that the CI system will run all the above checks. But it will be much more
efficient if you already fix most errors prior to submitting the PR. It is highly
recommended that you use the pre-commit hook provided in the repository. Simply
pip install pre-commit and then pre-commit install and a check will be run
prior to allowing commits.

codecov · 2024-02-19T14:26:52Z

Codecov Report

Attention: 3 lines in your changes are missing coverage. Please review.

Comparison is base (eda2a65) 99.42% compared to head (b246ce6) 97.48%.
Report is 4 commits behind head on main.

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #547      +/-   ##
==========================================
- Coverage   99.42%   97.48%   -1.94%     
==========================================
  Files          21       21              
  Lines        1564     1592      +28     
  Branches      425      429       +4     
==========================================
- Hits         1555     1552       -3     
- Misses          9       36      +27     
- Partials        0        4       +4

Files	Coverage Δ
src/jobflow/__init__.py	`100.00% <100.00%> (ø)`
src/jobflow/core/job.py	`97.78% <95.23%> (-2.22%)`	⬇️
src/jobflow/core/flow.py	`89.77% <84.61%> (-10.23%)`	⬇️

Andrew-S-Rosen · 2024-02-20T03:38:33Z

Thank you for kicking this off (again), @davidwaroquiers! As you already know, I'm a huge proponent of this idea in general.

I'll let @utf take the lead on any logistical comments, but from me, I'll just say this is exciting to see. Thanks for baking in my WIP PR in #450 as well!

In general, the UX here is pretty much exactly what I would expect to see.

janosh · 2024-02-20T08:54:07Z

the API this enables looks very appealing! if all edge cases that might negatively affect UX can be addressed, this looks like a big win.

Currently breaks @janosh magic methods.

weird... do you know why?

utf · 2024-02-20T09:39:37Z

Hi @davidwaroquiers, thanks for this PR. This is an interesting idea and implementation.

As I mentioned in #450, I have some concerns with the indexing features. While these can definitely be addressed, I think it would be better to remove those features from this PR so we can focus only on the @flow decorator. I've copied my concerns below.

Regarding the specific implementation, it looks promising. I would like to spend a little more time thinking it over and testing it locally.

Incompatibility with output attributes

One potential confusion is that you aren't able to do something like:
job2 = capitalize(job1.hello)
We could add that functionality in, but it then raises an issue with job1.output e.g., if your output document has an attribute called output this will return the OutputReference and not OutputReference.output.

A potential fix would be to rename all of the Job attributes and functions to attribute_, E.g., job.output_, job.graph_, job.update_maker_kwargs_ etc. This would be a breaking change but not too onerous to fix.

Incompatibility with jobs without a dictionary output

Another issue is with jobs that output a python primitive directly. E.g., your capitalize function. Consider these jobs.
@job
def capitalize(s):
    return s.upper()

@job
def decapitalize(s):
    return s.lower()
This change might think users than can do:
job1 = capitalize("hello")
job2 = decapitalize(job1)
flow = Flow([job1, job2])
But that will fail since jobs can only accept OutputReferences (or other python objects) as inputs.

Andrew-S-Rosen · 2024-03-07T06:23:53Z

@davidwaroquiers: is there anything here that you would like me to prototype or add onto to help keep the ball rolling on this? I can't make any guarantees of course but am happy to try and help where I can.

davidwaroquiers · 2024-03-21T11:26:47Z

@davidwaroquiers: is there anything here that you would like me to prototype or add onto to help keep the ball rolling on this? I can't make any guarantees of course but am happy to try and help where I can.

Not sure I can catch up on this at this stage. It's just that I had some time and I had thought about it so I decided to put it in a WIP PR for discussion. Also, before going on with this implementation, we should sit down and see what we want to have, what's "nice to have" and what's not on the table. While I agree with @utf that we should separate concerns, probably we should still first discuss the user experience. Basically "write down" how we would like to write our workflows, i.e. all cases we are thinking about, and see which are already handled, which would need the flow decorator and which would need the indexing features. I'm open to have a meeting on this if it could make things move.

Andrew-S-Rosen · 2024-08-14T13:48:48Z

Note to self:

Ignoring the changes to __getitem__ here, which are for a separate discussion, the changes implemented by @davidwaroquiers "work" for the simple case of one @flow in a file, but the fact that Steps is a global variable really throws a wrench in things.

For instance:

from jobflow import flow, job, run_locally


@job
def add(a, b):
    return a + b


@job
def multiply(a, b):
    return a * b


@flow
def add_multiply(a, b, c):
    addition = add(a, b)
    return multiply(addition.output, c)


@flow
def double_add(a, b):
    first = add(a, b)
    return add(first.output, first.output)


f = add_multiply(1, 2, 3)
outputs = run_locally(f)
results = [v[1].output for _, v in outputs.items()]
assert results == [3, 9]


f = double_add(1, 2)
outputs = run_locally(f)
results = [v[1].output for _, v in outputs.items()]
assert results == [3, 6]

The call to double_add will fail with ValueError: Job add (6e639178-2deb-4f2c-b28f-e8e47efd2fac) already belongs to another flow, which we can easily understand when investigating Steps. A similar issue would occur if you tried to run the same @flow twice in an interactive Jupyter notebook.

In short, Steps just continues to get populated over and over without getting reset anywhere. Just to be clear, resetting Steps in run_locally(), for instance, will not solve the problem.

It's not immediately clear to me how to handle this when using a global variable, and the previous ast-based approach (while more flexible) seems like it might be too fragile. It's not immediately clear to me how to proceed with a convenient @flow decorator, but it is worth continued thought.

davidwaroquiers added 4 commits February 19, 2024 10:46

Added flow decorator to jobflow.

a5110a6

Merge remote-tracking branch 'upstream/main' into flow_decorator

cca7ede

Testing alternative for @flow decorator.

ee966b5

Disabled tests for magic methods for now.

4511d5b

davidwaroquiers marked this pull request as draft February 19, 2024 14:10

Disabled some tests for now.

b246ce6

janosh added ux User experience api Application programming interface needs discussion Needs discussion labels Feb 20, 2024

Andrew-S-Rosen mentioned this pull request Feb 22, 2024

Allow for the use of @flows with Jobflow Quantum-Accelerators/quacc#1061

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

WIP: Flow decorator #547

WIP: Flow decorator #547

davidwaroquiers commented Feb 19, 2024

codecov bot commented Feb 19, 2024 •

edited

Loading

Andrew-S-Rosen commented Feb 20, 2024

janosh commented Feb 20, 2024

utf commented Feb 20, 2024

Incompatibility with output attributes

Incompatibility with jobs without a dictionary output

Andrew-S-Rosen commented Mar 7, 2024

davidwaroquiers commented Mar 21, 2024

Andrew-S-Rosen commented Aug 14, 2024 •

edited

Loading

WIP: Flow decorator #547

Are you sure you want to change the base?

WIP: Flow decorator #547

Conversation

davidwaroquiers commented Feb 19, 2024

Summary

TODO (if any)

Checklist

codecov bot commented Feb 19, 2024 • edited Loading

Codecov Report

Andrew-S-Rosen commented Feb 20, 2024

janosh commented Feb 20, 2024

utf commented Feb 20, 2024

Incompatibility with output attributes

Incompatibility with jobs without a dictionary output

Andrew-S-Rosen commented Mar 7, 2024

davidwaroquiers commented Mar 21, 2024

Andrew-S-Rosen commented Aug 14, 2024 • edited Loading

codecov bot commented Feb 19, 2024 •

edited

Loading

Andrew-S-Rosen commented Aug 14, 2024 •

edited

Loading