Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DM-43007: Annotate DP0.2 Object table with principal columns and a column ordering #302

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

gpdf
Copy link
Collaborator

@gpdf gpdf commented Jan 15, 2025

Checklist

When making changes to YAML files in the schemas directory:

  • If applicable, incremented the schema version number, following the guidelines in the contribution guide
  • Referred to the documentation on specific schemas for additional versioning information, change constraints, or tasks that may need to be performed, based on which schema is being updated

@gpdf gpdf force-pushed the tickets/DM-43007 branch from b54012a to 1fe8ae4 Compare January 15, 2025 22:55
@gpdf gpdf marked this pull request as ready for review January 24, 2025 23:09
@gpdf
Copy link
Collaborator Author

gpdf commented Jan 24, 2025

@TallJimbo and @MelissaGraham
I'd like you to review this both from a DRP and a CST perspective.

Because we don't normally deploy branches to the actual TAP_SCHEMA database, you can't see this deployed on the Portal at this time, so the best place to review this is to go to the schema browser at

https://sdm-schemas.lsst.io/v/DM-43007/dp02.html#Object

and note the order of columns there, and then use @JeremyMcCormick's nice sorting feature to sort on "principal" and see the subset of columns chosen for that (principal==1 denotes a "featured" column).

The use of psfFlux for that came out of a lengthy discussion with Jim. I'm open to other options, but this seems like a good compromise. We ALSO have to get our users good documentation on the different flavors of measurements available in the Object table.

Copy link
Contributor

@MelissaGraham MelissaGraham left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The selected primary columns seem reasonable. Could one of you write down the motivation of having psfFlux be primary, a little summary of your discussion maybe? I can see including an explanation like that in the documentation.

Copy link
Member

@TallJimbo TallJimbo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm posting a bunch of comments about descriptions that were not changed in the PR; they are not complete, but I realized it was going to take a long time and you might not even want those now. I'll go back and start another pass of just looking at order and principal, and if you'd like a full review of column descriptions (and existence) I can do that later, maybe on a more recent schema instead of DP0.2.

@@ -16,85 +16,117 @@ tables:
datatype: long
description: Unique id. Unique ObjectID
ivoa:ucd: meta.id;src;meta.main
tap:principal: 1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not in this PR, but the description "Unique id. Unique ObjectID" above is a little repetitive.

- name: detect_isPrimary
"@id": "#Object.detect_isPrimary"
datatype: boolean
description: True if source has no children and is in the inner region of a coadd patch and is in the inner region of a coadd tract and is not a sky source
fits:tunit:
tap:principal: 1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not on this PR, but the description for this important column should include the following points:

  • users should almost always select on this column to avoid duplicate interpretations/measurements of the same objects
  • selecting on this column only gets rid of those duplicates; users are still responsible for filtering on other flags to avoid bad measurement.

The description of how it's defined that's there right now is fine and accurate (aside from saying "source" instead of "object", which is not right for this variant of this column), but in this case the "why" is more important than the "what".

- name: deblend_nChild
"@id": "#Object.deblend_nChild"
datatype: int
description: Number of children this object has (defaults to 0)
fits:tunit:
tap:principal: 0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not in this PR, but it's not really that this column has a "default"; it's that isolated objects and deblended children do not have any children themselves.

- name: detect_isDeblendedModelSource
"@id": "#Object.detect_isDeblendedModelSource"
datatype: boolean
description: True if source has no children and is in the inner region of a coadd patch and is in the inner region of a coadd tract and is not a sky source and is a deblended child
fits:tunit:
tap:principal: 0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Description for this column is completely wrong; looks like a copy of detect_isPrimary.

- name: detect_isDeblendedSource
"@id": "#Object.detect_isDeblendedSource"
datatype: boolean
description: True if source has no children and is in the inner region of a coadd patch and is in the inner region of a coadd tract and is not a sky source and is either an unblended isolated source or a deblended child from a parent with
fits:tunit:
tap:principal: 0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Description for this column is completely wrong; looks like a copy of detect_isPrimary.

I don't actually know what it means, though; maybe @fred3m does (note that this is from DP0.2, not main). Also strange that "source" is in the column name, not "object".

- name: x
"@id": "#Object.x"
datatype: double
description: Centroid from SDSS Centroid algorithm. Reference band.
fits:tunit: pixel
tap:principal: 0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm surprised we've got pixel-unit centroid columns in addition to the ra/dec ones. I think we should drop the pixel ones. I'm guessing they're still around because we don't seem to have ra/dec uncertainties.

- name: g_ap03Flux
"@id": "#Object.g_ap03Flux"
datatype: double
description: Flux within 3.0-pixel aperture. Forced on g-band.
fits:tunit: nJy
tap:principal: 0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The docs for all of these fixed-size circular apertures should note that they are not aperture corrected and hence are not on the same photometric system as our other fluxes; there is some unknown offset (in magnitude space) that we do not fit for.

Oof, and we should really rename these to have sizes in arcseconds.

- name: g_calibFluxErr
"@id": "#Object.g_calibFluxErr"
datatype: double
description: Flux uncertainty within 12.0-pixel aperture. Measured on g-band.
fits:tunit: nJy
tap:principal: 0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This description is accurate as of DP0.2, but would not be accurate today; we need to make sure we don't blindly copy it over.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about it would not be correct today?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We've switched from a simple circular aperture to a compensated top-hat filter. This effectively does a local background subtraction, making the fluxes that go into the photometric calibration less dependent on having a good background subtraction done first.

- name: g_free_cModelFlux
"@id": "#Object.g_free_cModelFlux"
datatype: double
description: Flux from the final cmodel fit. Measured on g-band.
fits:tunit: nJy
tap:principal: 0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to note somewhere that "free" here means that the models was fit full in the (in this case) g band, and the columns without "free" had only their flux fit in g and the rest held fixed from the fit in the reference band.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think about swapping "free" and "cModel" in future versions of this?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 to swapping "free" and "cModel".

- name: g_gaap0p5Flux
"@id": "#Object.g_gaap0p5Flux"
datatype: double
description: GaaP flux with 0.5 aperture after multiplying the seeing by 1.15. Forced on g-band.
fits:tunit: nJy
tap:principal: 0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know what the phrase "multiplying the seeing" means. @arunkannawadi, do you?

Mention of aperture size needs units, too.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The description makes it sound like a completely arbitrary kludge.

Copy link
Member

@TallJimbo TallJimbo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've made a lot of line comments about order, but before we go and fix this manually I think we should revisit the schema in a broader sense - the ordering issues are largely grouping issues, and we've got grouping information at an earlier stage of the pipeline that we're just dropping on the floor right now.

- name: u_extendedness
"@id": "#Object.u_extendedness"
datatype: double
description: Set to 1 for extended sources, 0 for point sources. Measured on u-band.
fits:tunit:
tap:principal: 1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think including the reference-band extendedness is sufficient for principal; I'd vote for leaving out the per-band ones.

- name: u_psfFlux_flag
"@id": "#Object.u_psfFlux_flag"
datatype: boolean
description: General Failure Flag. Forced on u-band.
fits:tunit:
tap:principal: 0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The general failure flags for PSF fluxes should probably be principal, since the fluxes and their uncertainties are, and those can't be used without the flag.

- name: u_pixelFlags_saturatedCenter
"@id": "#Object.u_pixelFlags_saturatedCenter"
datatype: boolean
description: Saturated pixel in the Source center. Measured on u-band.
fits:tunit:
tap:principal: 0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need at least this pixel flag in the principal set. The others reflect more subtle problems (or shouldn't even exist on coadds, since they reflect problems we should have rejected from the coadds), but saturation in the center of an object really can't be ignored.

- name: deblend_skipped
"@id": "#Object.deblend_skipped"
datatype: boolean
description: Deblender skipped this source
fits:tunit:
tap:principal: 0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We might want deblend_skipped in the principal set, too - it's not set very often, but when it is, it probably means the measurements are garbage.

- name: u_bdChi2
"@id": "#Object.u_bdChi2"
datatype: double
description: -ln(likelihood) (chi^2) in cmodel fit. Measured on u-band.
fits:tunit:
tap:principal: 0
tap:column_index: 1131
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These "bd" columns should be grouped with the CModel ones.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@TallJimbo Can you make a suggestion as to exactly where to merge the bd columns into the order of the cModel ones?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think after the total CModel flux and flux uncertainty. I'd give them a cModel prefix, too.

- name: u_iDebiasedPSF_flag
"@id": "#Object.u_iDebiasedPSF_flag"
datatype: boolean
description: General failure flag, set if anything went wrong. Measured on u-band.
fits:tunit:
tap:principal: 0
tap:column_index: 1203
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This "Debiased" flag should go with the actual debiased PSF shape columns below. Same with the "Round" one.

I have no idea what the "i" prefix in these columns means. I wonder if it's a typo, or if somebody took the convention of writing I_{xx} (etc.) for image moments the wrong way.

- name: u_inputCount
"@id": "#Object.u_inputCount"
datatype: int
description: Number of images contributing at center, not including any clipping. Forced on u-band.
fits:tunit:
tap:principal: 0
tap:column_index: 1207
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it'd make to put inputCount up by footprintArea by the general flags.

- name: u_i_flag
"@id": "#Object.u_i_flag"
datatype: boolean
description: General failure flag, set if anything went wrong. Measured on u-band.
fits:tunit:
tap:principal: 0
tap:column_index: 1206
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm pretty sure this is a flag column for u_ixx, etc., and should go right next to it (maybe going right next to it will mitigate the fact that the name is incredibly vague).

- name: u_pixelFlags_bad
"@id": "#Object.u_pixelFlags_bad"
datatype: boolean
description: Bad pixel in the Source footprint. Measured on u-band.
fits:tunit:
tap:principal: 0
tap:column_index: 1235
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The pixel flags should be pulled out of the semialphabetical list of (mostly) flux measurements and put next to the detect_ and deblend_ flags.

- name: u_ixx
"@id": "#Object.u_ixx"
datatype: double
description: HSM moments. Measured on u-band.
fits:tunit: pixel**2
tap:principal: 0
tap:column_index: 1210
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Each of these sets of ixx, iyy, ixy should be ordered together with their flags, i.e.

  • u_i_flag, u_ixx, u_iyy, u_ixy
  • u_iDebiased_flag, u_ixxDebiasedPSF, u_iyyDebiasedPSF, u_ixyDebiasedPSF,
    (etc)

@gpdf
Copy link
Collaborator Author

gpdf commented Jan 28, 2025

Thank you for the advice; I'll do an update to the PR with those changes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants