DM-38041: rewrite pre-exec-init logic to work without QGs and respect storage class differences #444

TallJimbo · 2024-09-10T16:21:32Z

Checklist

ran Jenkins
added a release note for user-visible changes to doc/changes

codecov · 2024-09-10T16:24:29Z

Codecov Report

Attention: Patch coverage is 96.39831% with 17 lines in your changes missing coverage. Please review.

Project coverage is 83.52%. Comparing base (3efb6ca) to head (9a3e471).
Report is 17 commits behind head on main.

Files with missing lines	Patch %	Lines
...n/lsst/pipe/base/pipeline_graph/_pipeline_graph.py	88.33%	8 Missing and 6 partials ⚠️
python/lsst/pipe/base/graph/graph.py	96.51%	2 Missing and 1 partial ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #444      +/-   ##
==========================================
+ Coverage   82.30%   83.52%   +1.22%     
==========================================
  Files          96       97       +1     
  Lines       10971    11426     +455     
  Branches     2092     2192     +100     
==========================================
+ Hits         9030     9544     +514     
+ Misses       1590     1528      -62     
- Partials      351      354       +3


andy-slac

Looks great, just a few minor questions. I only looked superficially at the unit tests.

andy-slac · 2024-09-11T17:33:38Z

python/lsst/pipe/base/pipeline_graph/_tasks.py

+            Whether this task can run without any datasets for the given
+            connection.
+        """
+        return getattr(self.get_connections(), "minimum", 1) == 0


Hmm, I'm confused here, should it be getattr(getattr(self.get_connections(), connection_name), "minimum", 1)? Or maybe even a bit more type-safe with isinstance(..., BaseInput)?

👍, I've switched to a version that uses isinstance(..., BaseInput) as that's a lot more readable than nested getattr calls, too.
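
For illustration, the difference between the original expression and the fix can be sketched with simplified stand-in classes. `BaseInput`, `Output`, `Connections`, and both `is_optional` functions below are hypothetical stand-ins, not the real `lsst.pipe.base` definitions; only the `minimum` attribute and the `isinstance(..., BaseInput)` check come from the discussion above.

```python
from dataclasses import dataclass


@dataclass
class BaseInput:
    """Hypothetical stand-in for an input connection with a minimum count."""

    name: str
    minimum: int = 1


@dataclass
class Output:
    """Hypothetical stand-in for an output connection (no ``minimum``)."""

    name: str


class Connections:
    """Hypothetical stand-in holding named connections as attributes."""

    def __init__(self) -> None:
        self.calexp = BaseInput("calexp")
        self.background = BaseInput("background", minimum=0)
        self.catalog = Output("catalog")


def is_optional_buggy(connections: Connections, connection_name: str) -> bool:
    # Looks up "minimum" on the connections object itself, which has no such
    # attribute here, so the default of 1 is always used.
    return getattr(connections, "minimum", 1) == 0


def is_optional(connections: Connections, connection_name: str) -> bool:
    # Look up the named connection first, then check its type and minimum.
    connection = getattr(connections, connection_name)
    return isinstance(connection, BaseInput) and connection.minimum == 0


connections = Connections()
print(is_optional_buggy(connections, "background"))
print(is_optional(connections, "background"))
```

In this sketch the first version never reports a connection as optional, because it ignores `connection_name` entirely, while the second also returns `False` for outputs instead of relying on a `getattr` default.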

andy-slac · 2024-09-11T17:43:59Z

python/lsst/pipe/base/pipeline_graph/_pipeline_graph.py

+        Previously recorded package versions.  Updated in place to include
+        any new packages that weren't present before.


Maybe add that it is only updated if existing versions are consistent (no exception raised)?
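
A minimal sketch of the behavior this suggested wording would describe, using plain dicts as a stand-in for the recorded package versions. The function name, the dict representation, and the exception type are assumptions for illustration, not the actual `PipelineGraph` code.

```python
def update_if_consistent(recorded: dict[str, str], current: dict[str, str]) -> None:
    """Add new packages to ``recorded`` in place, but only when every package
    present in both mappings has the same version.
    """
    conflicts = {
        name: (recorded[name], version)
        for name, version in current.items()
        if name in recorded and recorded[name] != version
    }
    if conflicts:
        # Raise before touching ``recorded`` so it is left unchanged.
        raise ValueError(f"Inconsistent package versions: {conflicts}")
    for name, version in current.items():
        # Only packages that were not recorded before are added.
        recorded.setdefault(name, version)
```

Doing the consistency check before any mutation is what makes the "only updated if no exception is raised" statement true in this sketch.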

andy-slac · 2024-09-11T17:52:32Z

python/lsst/pipe/base/pipeline_graph/_pipeline_graph.py

+        if (ref := butler.find_dataset(self.packages_dataset_type)) is not None:
+            packages = butler.get(ref)


I hope we never have a situation with multiple concurrent clients trying to execute this code, it is full of races 🙂

💯 I think we've been pretty clear about pre-exec-init not being something you can parallelize.

andy-slac · 2024-09-11T17:58:51Z

python/lsst/pipe/base/pipeline_graph/_pipeline_graph.py

+        # We do writes at the end to minimize the mess we leave behind when we
+        # raise an exception.


Would it be better to delay all writes, including packages and init outputs, in case one of these three methods raises?

I wondered about that too, but I think the extra complexity involved just isn't worth it, since this kind of mess hasn't really been a problem in practice that I'm aware of.
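
The "checks first, writes at the end" pattern from the code comment can be sketched for a single step as follows. This is an illustrative stand-in that uses a dict in place of a butler repository; `write_after_checks` and its arguments are hypothetical names, and the sketch does not attempt the cross-method deferral that was considered and declined above.

```python
def write_after_checks(store: dict[str, object], planned: dict[str, object]) -> None:
    """Write ``planned`` datasets into ``store`` only after all checks pass."""
    to_write: dict[str, object] = {}
    # Phase 1: checks only; nothing is written yet.
    for name, value in planned.items():
        if name in store:
            if store[name] != value:
                raise ValueError(f"Existing dataset {name!r} is inconsistent.")
        else:
            to_write[name] = value
    # Phase 2: writes, reached only if every check succeeded.
    store.update(to_write)
```

If any check fails, the exception propagates before the first write, so the store is left exactly as it was found.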


andy-slac · 2024-09-11T18:46:55Z

tests/test_init_output_run.py

+    @contextmanager
+    def make_butler(self) -> Iterator[Butler]:
+        """Wrap a temporary local butler repository in a context manager."""
+        with tempfile.TemporaryDirectory() as root:


Maybe worth adding ignore_cleanup_errors=True; I think we saw random cleanup failures that resulted in exceptions, which you probably want to ignore in tests.

Thanks for the reminder. I actually intentionally went with tempfile.TemporaryDirectory over lsst.utils.tests.temporaryDirectory because I knew the standard library could now do that (and the only reason we invented our own was that the standard library couldn't before), and then I forgot to actually use that flag 🤦‍♂️.


This does not use TaskFactory (which is in ctrl_mpexec anyway) because it takes a lot more care with storage class conversions and components, and it saves all writes until after it's done all checks, in order to reduce the chance that we leave a repository messy when we encounter an error.


timj mentioned this pull request Sep 10, 2024

DM-38041: delegate to new pre-exec-init implementations in pipe_base lsst/ctrl_mpexec#308

Merged

2 tasks

TallJimbo force-pushed the tickets/DM-38041 branch 2 times, most recently from d787880 to 048018f · September 10, 2024 20:15

andy-slac approved these changes Sep 11, 2024


TallJimbo added 7 commits September 12, 2024 10:40

Add method to test whether a connection is optional.

0f99ac1

Move dataset type registration logic from PreExecInit to PipelineGraph.

eba7aac

Use task labels rather than TaskDefs as keys in QG in init maps.

5ed0199

Add methods to get init refs from QG without TaskDef.

8cc5a38

The old ones that take TaskDef will be deprecated on DM-40442.

Add convenience method to build QBB for pre-exec-init to QG.

13d6283

This code was moved here from ctrl.mpexec.CmdLineFwk.preExecInitQBB with minimal changes.

Rework QG-based init-output writing as a QG method.

ce13fc2

This reimplements functionality previously in ctrl_mpexec's PreExecInit by delegating to the new PipelineGraph.instantiate_tasks instead. PreExecInit.saveInitOutputs now delegates to this method.

TallJimbo force-pushed the tickets/DM-38041 branch from 0634922 to 17ca159 · September 12, 2024 14:40

TallJimbo added 9 commits September 12, 2024 10:42

Implement QG-free version of pre-exec-init's config writing.

db9c870

Rework QG-based config writing as a QG method.

6e0feb7

Implement QG-free version of pre-exec-init's packages writing.

ff8a123

Rework QG-based package versions writing as a QG method.

ad90609

Add convenience methods for initializing output runs.

87dcde6

Minor clarification in code comment.

23e361d

Fix docstring typo.

ee18f67

Add tests for init_output_run and related methods.

9f0f3b6

Add changelog entry.

9a3e471

TallJimbo force-pushed the tickets/DM-38041 branch from 7609498 to 9a3e471 · September 12, 2024 14:42

TallJimbo merged commit 8292ef4 into main Sep 12, 2024
12 of 13 checks passed

TallJimbo deleted the tickets/DM-38041 branch September 12, 2024 17:26
