esheldon fork Tickets/dm 40513 #24

Draft
wants to merge 19 commits into
base: tickets/DM-40513
from

Conversation

esheldon

No description provided.

TallJimbo and others added 7 commits August 25, 2023 12:13
MultipleCellCoadd isn't iterable, but its .cells attribute is a
mapping.
The Input name is set correctly to cellCoadd

temporarily commented object schema "cT.InitOutput" because this
doesn't work yet

temporarily set required_bands to r, need to learn how to
set with config
Currently the real cell coadds don't have everything we need
@esheldon esheldon marked this pull request as draft August 31, 2023 03:04
The metadetect code expects MultiBandExposure
@esheldon
Author

esheldon commented Aug 31, 2023

Short term todo items:

  • figure out how to configure required bands (config entry exists but is not working)
  • configure simulate mode vs data mode
  • add metadetect config options (might be for a later PR)
  • Use real mfrac when it becomes available in cell coadd
  • Use real noise image when it becomes available in cell coadd
  • Unit testing

Member

@arunkannawadi arunkannawadi left a comment

Jim and I did a quick first pass and will do a more careful one separately. We will refrain from commenting on column names in this PR.

nullable=False,
metadata={
"doc": "admom wmom T (<x^2> + <y^2>) measurement for PSF.",
"unit": "",
Member

We need to add actual units here and wherever applicable. So this would be "pixels", I guess?

},
),
pa.field(
"wmom_band_flux_1",
Member

What does the _1 here signify?

Author

_1 means flux in the first filter

We will be measuring the flux in N filters and the filters will be configurable.

We can translate these to filter names

We should translate to filter names.

Author

We should be able to do this using the required_bands config option list
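
A minimal sketch of the translation discussed in this thread, assuming the flux columns follow the `wmom_band_flux_<n>` pattern and that the n-th entry of `required_bands` corresponds to the `_n` suffix. The helper name `band_flux_column_names` is hypothetical and not part of this PR.

```python
def band_flux_column_names(required_bands, prefix="wmom_band_flux"):
    """Map index-suffixed flux column names to band-suffixed ones.

    Assumes the N-th entry of required_bands (1-based) corresponds to
    the "_N" suffix, as described in the review thread.
    """
    return {
        f"{prefix}_{index}": f"{prefix}_{band}"
        for index, band in enumerate(required_bands, start=1)
    }


# Example with three configured bands.
renames = band_flux_column_names(["g", "r", "i"])
print(renames)
```

The resulting mapping could be applied when the schema fields are built, so the output columns carry band names instead of positional indices.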

single_cell_tables: list[pa.Table] = []
-for single_cell_coadds in zip(
+for cell_coadds in zip(
Member

This should stay as single_cell_coadds to make it evident what class it is. In the metadetect/DESC code base, it should be okay to simplify it to cell_coadds if you prefer.

Author

the type is documented

@@ -117,14 +124,20 @@ class MetadetectionShearConfig(
):
"""Configuration definition for MetadetectionShearTask."""

from lsst.meas.base import SkyMapIdGeneratorConfig

required_bands = ListField[str](
"Bands expected to be present. Cells with one or more of these bands "
"missing will be skipped. Bands other than those listed here will "
Member

I don't think this docstring is consistent with what happens. If g is specified but the coadd data doesn't exist, I think a NoWorkFound gets raised and nothing gets processed. I think that's the correct behavior and the docstring needs to reflect that.

Author

Jim wrote that doc string, maybe he can comment

Member

I wrote what I expect to happen. Happy to have it changed to what will happen.

@TallJimbo
Member

TallJimbo commented Sep 8, 2023

I've created DM-40698 to look into why the command-line config overrides are not being applied; I'm considering that a middleware bug until I can prove otherwise.

('cell_y', 'u1'),
('shear_type', 'U2'),
('mask_frac', 'f4'),
('primary', bool),

I don't see these items in the schema above?

ohhh nvm

Author

Note also that the code that creates the final output on disk seems to ignore fields that are not in the schema.
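
Since fields outside the schema appear to be dropped silently, a quick check like the following could flag them before writing. This is only a sketch: the `schema_names` set is a hypothetical stand-in (the real schema is built from `pa.field` entries), and `fields_missing_from_schema` is not a function in this PR.

```python
# Hypothetical subset of schema column names, for illustration only.
schema_names = {"wmom_band_flux_1", "shear_type"}

# Structured dtype entries as (name, type) pairs, following the snippet above.
catalog_dtype = [
    ("cell_y", "u1"),
    ("shear_type", "U2"),
    ("mask_frac", "f4"),
    ("primary", bool),
]


def fields_missing_from_schema(dtype, schema_names):
    """Return dtype field names that have no matching schema entry."""
    return [name for name, _ in dtype if name not in schema_names]


missing = fields_missing_from_schema(catalog_dtype, schema_names)
print(missing)
```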

@esheldon
Author

esheldon commented Sep 8, 2023

Jim your link is wrong, should be https://jira.lsstcorp.org/browse/DM-40698

Erin Sheldon added 2 commits September 13, 2023 08:33
We currently run with a config restricting to r for
real data, as other bands don't yet exist for the
testbed

Note we will want to process g separately, this is
still TODO
currently the sims cannot generate the bright objects
because the catalog is not available

Also when using real data we need to pass in the info for
bright objects to be masked
@esheldon
Author

Is there a way to have an input that can be used as a placeholder for the star catalog? It does not have to represent a real catalog.

@TallJimbo
Member

I can add a connection that will load Gaia or Pan-STARRS (when running on real data) or the DC2 refcat. I'm not super familiar with that part of the codebase but it shouldn't take me long to figure out. Should have something later today or maybe Thursday.

@esheldon
Author

Thanks Jim.

I just want to point out the practical issue of memory limitations, and the possible need of loading a subset of that data (either columns or rows/spatially)

@arunkannawadi
Member

Something like this might work (change PrerequisiteInput to Input though)?
https://github.com/lsst/pipe_tasks/blob/cd254cd47fbeeb61f33e2bcc8ec0f0bcccba9f76/python/lsst/pipe/tasks/calibrate.py#L86-L102

    astromRefCat = cT.PrerequisiteInput(
        doc="Reference catalog to use for astrometry",
        name="gaia_dr3_20230707",
        storageClass="SimpleCatalog",
        dimensions=("skypix",),
        deferLoad=True,
        multiple=True,
    )

    photoRefCat = cT.PrerequisiteInput(
        doc="Reference catalog to use for photometric calibration",
        name="ps1_pv3_3pi_20170110",
        storageClass="SimpleCatalog",
        dimensions=("skypix",),
        deferLoad=True,
        multiple=True
    )

@TallJimbo
Member

Yes, that's what we'd want for the connections, and it'd need to remain a prerequisite (I don't think that's a problem; we'll always have these in hand before we start DR processing). The part I don't remember is how to use the objects that do the spatial filtering and concatenation to get just the relevant shards for a particular data ID.

Erin, the spatial filtering is taken care of - these catalogs are sharded by HTM across the sky and the middleware that runs the tasks has information to grab just the shards it needs for each patch. And then we have some in-memory filtering (the stuff I need to look up) to shrink it down further.

I don't know what sort of column-filtering will be possible, at least on read, since these are all stored in FITS binary tables, but they are all preprocessed to have a similar set of columns across all reference catalogs and that set of columns is typically much smaller than the upstream catalog, so there may not be a need.

@TallJimbo
Member

Ok, I've (fast-forward) merged this branch into tickets/DM-40513, rebased that on main, and added some (untested) code to load a reference catalog; last commit has some details on that. I was expecting GitHub to recognize the fast-forward-merge-and-rebase, but it seems it just thinks it's a conflict; maybe if @esheldon rebases his branch on mine it will?

@esheldon
Author

Jim, I don't understand. All I see in the other PR is a linter fail

@TallJimbo
Member

TallJimbo commented Oct 20, 2023

You should be able to find rebased versions of all of your commits on PR #22 now, too, as well as three new ones by me; two are mechanical linter things along with bbe1965.

If you pull those changes to your fork and either do a git rebase or a git reset --hard to synchronize them, and then push, GitHub might auto-close this PR since there won't be any changes on it, but as soon as you add another commit on top of mine we can open it up to keep the discussion here (if we want).

@esheldon
Author

on PR #22 it doesn't mention a conflict. What is it we are trying to fix?

@TallJimbo
Member

I was just talking about the "This branch has conflicts that must be resolved" message on this PR (#24).

@esheldon
Author

Yes, we are using /repo/dc2.

However when I try cal_ref_cat_2_2 I get

  File "/sdf/home/e/esheldon/miniconda3/envs/stack/share/eups/Linux64/pipe_base/gd943e6a5b3+5e7a48a6da/python/lsst/pipe/base/quantum_graph_builder.py", line 555, in _resolve_task_quanta
    quantum_key, self._find_removed(skeleton.iter_inputs_of(quantum_key), helper.inputs)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/sdf/home/e/esheldon/miniconda3/envs/stack/share/eups/Linux64/pipe_base/gd943e6a5b3+5e7a48a6da/python/lsst/pipe/base/quantum_graph_builder.py", line 1103, in _find_removed
    result.remove(DatasetKey(parent_dataset_type_name, kept_ref.dataId.values_tuple()))
KeyError: DatasetKey(parent_dataset_type_name='cal_ref_cat_2_2', data_id_values=(147079,), is_task=False, is_prerequisite=False)

Here is my command line

pipetask run \
    -b /sdf/data/rubin/repo/dc2 \
    -i u/kannawad/DM-39243 \
    -o u/$USER/mdetTest \
    --task lsst.drp.tasks.metadetection_shear.MetadetectionShearTask \
    -d "tract=3828 AND patch=42 AND band='r' AND skymap='DC2_cells_v1'" \
    -C metadetectionShear:metadetection_shear_config.py

@arunkannawadi
Member

Trying to reproduce this now. Can you post the full traceback?

@esheldon
Author

Traceback (most recent call last):
  File "/sdf/home/e/esheldon/miniconda3/envs/stack/share/eups/Linux64/ctrl_mpexec/g17a9813c4d+17770510cb/python/lsst/ctrl/mpexec/cli/cmd/commands.py", line 210, in run
    if (qgraph := script.qgraph(pipelineObj=pipeline, **kwargs, show=show)) is None:
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/sdf/home/e/esheldon/miniconda3/envs/stack/share/eups/Linux64/ctrl_mpexec/g17a9813c4d+17770510cb/python/lsst/ctrl/mpexec/cli/script/qgraph.py", line 221, in qgraph
    qgraph = f.makeGraph(pipelineObj, args)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/sdf/home/e/esheldon/miniconda3/envs/stack/share/eups/Linux64/ctrl_mpexec/g17a9813c4d+17770510cb/python/lsst/ctrl/mpexec/cmdLineFwk.py", line 650, in makeGraph
    qgraph = graphBuilder.makeGraph(
             ^^^^^^^^^^^^^^^^^^^^^^^
  File "/sdf/home/e/esheldon/miniconda3/envs/stack/share/eups/Linux64/pipe_base/gd943e6a5b3+5e7a48a6da/python/lsst/pipe/base/graphBuilder.py", line 174, in makeGraph
    return qgb.build(metadata)
           ^^^^^^^^^^^^^^^^^^^
  File "/sdf/home/e/esheldon/miniconda3/envs/stack/share/eups/Linux64/utils/g384e8880d6+81bc2a20b4/python/lsst/utils/timer.py", line 295, in timeMethod_wrapper
    res = func(self, *args, **keyArgs)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/sdf/home/e/esheldon/miniconda3/envs/stack/share/eups/Linux64/pipe_base/gd943e6a5b3+5e7a48a6da/python/lsst/pipe/base/quantum_graph_builder.py", line 360, in build
    self._resolve_task_quanta(task_node, full_skeleton)
  File "/sdf/home/e/esheldon/miniconda3/envs/stack/share/eups/Linux64/utils/g384e8880d6+81bc2a20b4/python/lsst/utils/timer.py", line 295, in timeMethod_wrapper
    res = func(self, *args, **keyArgs)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/sdf/home/e/esheldon/miniconda3/envs/stack/share/eups/Linux64/pipe_base/gd943e6a5b3+5e7a48a6da/python/lsst/pipe/base/quantum_graph_builder.py", line 555, in _resolve_task_quanta
    quantum_key, self._find_removed(skeleton.iter_inputs_of(quantum_key), helper.inputs)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/sdf/home/e/esheldon/miniconda3/envs/stack/share/eups/Linux64/pipe_base/gd943e6a5b3+5e7a48a6da/python/lsst/pipe/base/quantum_graph_builder.py", line 1103, in _find_removed
    result.remove(DatasetKey(parent_dataset_type_name, kept_ref.dataId.values_tuple()))
KeyError: DatasetKey(parent_dataset_type_name='cal_ref_cat_2_2', data_id_values=(147079,), is_task=False, is_prerequisite=False)

@TallJimbo
Member

Any changes from the currently-pushed branch in your local copy? That is_prerequisite=False is suspicious, but the fact that you got this kind of error at all probably means a bug in some middleware code, even if it's being triggered by some task definition problem. I'll try to reproduce, too.

@esheldon
Author

No changes other than putting in the new name

@esheldon
Author

I'm on stackvana corresponding to 0.2023.42

@arunkannawadi
Member

arunkannawadi commented Oct 30, 2023

I get only an empty quantum graph error, not the one that Erin posted. I used Erin's environment to run this:

(stack) [kannawad@sdfrome002 python]$ pipetask run -b /sdf/data/rubin/repo/dc2 -i u/kannawad/DM-39243 -o u/$USER/mdetTest --task lsst.drp.tasks.metadetection_shear.MetadetectionShearTask -d "tract=3828 AND patch=42 AND band='r' AND skymap='DC2_cells_v1'"
lsst.pipe.base.quantum_graph_builder INFO: Processing pipeline subgraph 1 of 1 with 1 task(s).
lsst.pipe.base.quantum_graph_builder INFO: Iterating over query results to associate quanta with datasets.
lsst.pipe.base.quantum_graph_builder INFO: Initial bipartite graph has 1 quanta, 6 dataset nodes, and 4 edges from 1 query row(s).
lsst.pipe.base.quantum_graph_builder INFO: Dropping task metadetectionShear because no quanta remain (1 had no work to do).
Error: QuantumGraph was empty; CRITICAL logs above should provide details.
(stack) [kannawad@sdfrome002 python]$ which pipetask
/sdf/home/e/esheldon/miniconda3/envs/stack/share/eups/Linux64/ctrl_mpexec/g17a9813c4d+17770510cb/bin/pipetask

@arunkannawadi
Member

arunkannawadi commented Oct 30, 2023

What's in the metadetection_shear_config.py? AFAIK, it's only specifying the required bands, right? That's the only difference between the command I ran and the one Erin ran.

@esheldon
Author

config.required_bands = ['r']

@TallJimbo
Member

I've reproduced Erin's error with Arun's command and a config-file with the required_bands override (that's also where I put the ref_cat name override, but I imagine that's irrelevant). I'm using the shared-stack version of w_2023_42 (so it's not Stackvana; that was always unlikely) with the module-level metadetect import commented out (just because that's not needed for the stuff that's failing to run now).

Debugging now.

@arunkannawadi
Member

arunkannawadi commented Oct 30, 2023

Looks like a middleware bug, Jim. This happens only when required_bands = ["r"] (either via a config override or by setting that as the config default), and not when required_bands = ["g", "r", "i", "z"].

@arunkannawadi
Member

arunkannawadi commented Oct 30, 2023

These lines are the suspect I think (Edit: nope, never mind)

        bands_missing = set(self.config.required_bands)
        adjusted_input_coadds = []
        for ref in original_input_coadds:
            if ref.dataId["band"] in bands_missing:
                adjusted_input_coadds.append(ref)
                bands_missing.remove(ref.dataId["band"])
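
The band-selection logic in the excerpt above can be exercised outside the stack with a small sketch. Everything here is a stand-in: `NoWorkFound` is a local class mimicking `lsst.pipe.base.NoWorkFound`, the refs are plain dicts rather than dataset refs, and the final `raise` is an assumption based on the docstring discussion earlier in this review, not something shown in the excerpt.

```python
class NoWorkFound(Exception):
    """Local stand-in for lsst.pipe.base.NoWorkFound so the sketch runs anywhere."""


def select_required_coadds(input_coadds, required_bands):
    """Keep one coadd per required band; raise if any required band is absent.

    input_coadds is a list of dicts with a "band" key, standing in for
    dataset refs (ref.dataId["band"]) in the excerpt above.
    """
    bands_missing = set(required_bands)
    adjusted = []
    for ref in input_coadds:
        if ref["band"] in bands_missing:
            adjusted.append(ref)
            bands_missing.remove(ref["band"])
    if bands_missing:
        # Assumed behavior: no quantum is run when a required band is missing.
        raise NoWorkFound(f"Required bands missing: {sorted(bands_missing)}")
    return adjusted


# Only r-band coadds exist in the testbed, so requiring just "r" succeeds.
print(select_required_coadds([{"band": "r"}], ["r"]))
```

Under this assumption, requiring bands that are absent from the repo ends with no work to do rather than a partial run.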

@TallJimbo
Member

TallJimbo commented Oct 30, 2023

That's what triggers it here, but the bug is in pipe_base. Fix is now on branch tickets/DM-41486 of pipe_base. That seems compatible with w_2023_42, and it seems to fix the problem (I've successfully built the QG with that combination, at least).

@arunkannawadi
Member

Quantum graph builds, and we get another error here:

ERROR 2023-10-30T09:53:31.891-07:00 lsst.ctrl.mpexec.singleQuantumExecutor (metadetectionShear:{skymap: 'DC2_cells_v1', tract: 3828, patch: 42})(singleQuantumExecutor.py:266) - Execution of task 'metadetectionShear' on quantum {skymap: 'DC2_cells_v1', tract: 3828, patch: 42} failed. Exception TypeError: ReferenceObjectLoader.loadRegion() missing 1 required positional argument: 'filterName'
ERROR 2023-10-30T09:53:32.000-07:00 lsst.ctrl.mpexec.mpGraphExecutor ()(mpGraphExecutor.py:509) - Task <TaskDef(lsst.drp.tasks.metadetection_shear.MetadetectionShearTask, label=metadetectionShear) dataId={skymap: 'DC2_cells_v1', tract: 3828, patch: 42}> failed; processing will continue for remaining tasks.
Traceback (most recent call last):
  File "/sdf/home/e/esheldon/miniconda3/envs/stack/share/eups/Linux64/ctrl_mpexec/g17a9813c4d+17770510cb/python/lsst/ctrl/mpexec/mpGraphExecutor.py", line 479, in _executeQuantaInProcess
    self.quantumExecutor.execute(qnode.taskDef, qnode.quantum)
  File "/sdf/home/e/esheldon/miniconda3/envs/stack/share/eups/Linux64/ctrl_mpexec/g17a9813c4d+17770510cb/python/lsst/ctrl/mpexec/singleQuantumExecutor.py", line 167, in execute
    result = self._execute(taskDef, quantum)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/sdf/home/e/esheldon/miniconda3/envs/stack/share/eups/Linux64/ctrl_mpexec/g17a9813c4d+17770510cb/python/lsst/ctrl/mpexec/singleQuantumExecutor.py", line 264, in _execute
    self.runQuantum(task, quantum, taskDef, limited_butler)
  File "/sdf/home/e/esheldon/miniconda3/envs/stack/share/eups/Linux64/ctrl_mpexec/g17a9813c4d+17770510cb/python/lsst/ctrl/mpexec/singleQuantumExecutor.py", line 466, in runQuantum
    task.runQuantum(butlerQC, inputRefs, outputRefs)
  File "/sdf/data/rubin/user/kannawad/drp_tasks/python/lsst/drp/tasks/metadetection_shear.py", line 562, in runQuantum
    ref_cat = ref_loader.loadRegion(qc.quantum.dataId.region)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: ReferenceObjectLoader.loadRegion() missing 1 required positional argument: 'filterName'

I might be able to tackle this one.

@arunkannawadi
Member

The task runs without any errors now. I have arbitrarily set the filterName when loading the reference catalog as r (i.e., reference flux field). This is a hack for now, and when Jim is back in full swing, we can get back to deciding what it ought to be.

@esheldon
Author

I didn't know Jim was out of commission, sorry to hear that, whatever the reason is.

@@ -559,7 +564,8 @@ def runQuantum(
config=self.config.ref_loader,
log=self.log,
)
-ref_cat = ref_loader.loadRegion(qc.quantum.dataId.region)
+# What should decide the filterName?
+ref_cat = ref_loader.loadRegion(qc.quantum.dataId.region, filterName="r") # THIS IS A HACK.
Member

I know it says it's a hack but can't you get the band from the dataId?

Member

@arunkannawadi arunkannawadi Oct 30, 2023

This is a multi-band task, so qc.quantum.dataId doesn't have a band.

Member

@timj timj Oct 30, 2023

You do know all possible bands in play though because you calculate them in the next line. I have no idea how you guess which of the dataset ref bands is the one that you choose for the refcat loader though.

Member

Exactly. I don't know either. And it might not matter at all for this purpose if the downstream code (which is metadetect here) does not rely on a reference band for fluxes (which I don't think it does)

Author

correct, it does not
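
If a deterministic choice is wanted instead of a hard-coded `"r"`, one possible convention (not something this thread settles on) is to take the first configured band that is actually present among the inputs. The helper `choose_ref_filter` below is hypothetical.

```python
def choose_ref_filter(required_bands, available_bands):
    """Return the first required band that is present in the inputs.

    This is an arbitrary but reproducible convention; the downstream
    metadetect code does not rely on the reference band for fluxes.
    """
    available = set(available_bands)
    for band in required_bands:
        if band in available:
            return band
    raise ValueError("None of the required bands are available")


# With only r-band coadds present, "r" is selected regardless of config order.
print(choose_ref_filter(["g", "r", "i", "z"], ["r"]))
```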

super().setDefaults()
# This is a DC2/cal_ref_cat_2_2 specific hack. This should be ideally specified in a config file
# To be removed in the cleanup before merging to main
self.ref_loader.filterMap = {band: f"lsst_{band}_smeared" for band in self.required_bands}
Member

This is actually not a hack as the comment above says. What I meant is that the mapping here is specific to the reference catalog loaded.
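
To make the refcat-specific nature of the mapping concrete, here is a small sketch that builds the same dictionary outside the config class. The `build_filter_map` helper and its `template` parameter are hypothetical; only the `lsst_{band}_smeared` naming comes from the diff above.

```python
def build_filter_map(required_bands, template="lsst_{band}_smeared"):
    """Map each band to its reference-catalog flux field name.

    The default template matches the DC2 cal_ref_cat_2_2 naming used in
    the diff; another reference catalog would need a different template.
    """
    return {band: template.format(band=band) for band in required_bands}


filter_map = build_filter_map(["g", "r", "i", "z"])
print(filter_map)
```

Keeping the template in one place would make it easier to move the mapping into a repo-specific config file during the cleanup.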

@esheldon
Author

esheldon commented Nov 1, 2023

I got the new pipe_base and the latest changes to this PR. However, I am still seeing the QuantumGraph error. Was any other code updated?

lsst.pipe.base.quantum_graph_builder INFO: Processing pipeline subgraph 1 of 1 with 1 task(s).
lsst.pipe.base.quantum_graph_builder INFO: Iterating over query results to associate quanta with datasets.
lsst.pipe.base.quantum_graph_builder INFO: Initial bipartite graph has 1 quanta, 6 dataset nodes, and 4 edges from 1 query row(s).
lsst.pipe.base.quantum_graph_builder INFO: Dropping task metadetectionShear because no quanta remain (1 had no work to do).
Error: QuantumGraph was empty; CRITICAL logs above should provide details.

@esheldon
Author

esheldon commented Nov 1, 2023

OK, the error occurs with config.required_bands = ['g', 'r', 'i', 'z']
It runs after changing to config.required_bands = ['r']

@arunkannawadi
Member

Yes, I was going to comment just that because I had seen that error when I allowed for more bands than available on the coadd.

We need to make this configurable for different repos.  E.g.
for /repo/main we would need gaia_dr3_20230707

Removed temporary break statement added for testing

Fixed flake8 complaints
@@ -65,9 +65,14 @@ class MetadetectionShearConnections(PipelineTaskConnections, dimensions={"patch"
dimensions={"patch", "band"},
)

# TODO: make "name" configurable, as it will depend on
# the repo we are using
Member

These are already configurable from the pipeline YAML file, which is specific to the repo we are running against. cal_ref_cat_2_2 is just the default value in the absence of a pipeline file.

@esheldon
Author

esheldon commented Nov 1, 2023

Should I be using a pipeline file?

@arunkannawadi
Member

arunkannawadi commented Nov 1, 2023

A pipeline file is not needed for testing right now and the current command line invocation should be fine. That was just an FYI that the name is already configurable if it is not apparent right now.

5 participants