Support combined cluster count and cluster weak lensing #2
Comments
That's great! Some comments ...
2018-02-09 10:22 GMT-08:00 NiallMac <[email protected]>:
We want to extend the library/file format to be applicable to combined
cluster count and cluster weak lensing. The cluster weak lensing signal is
a 2-point function, so the SpectrumMeasurement class should work, possibly
with some small extensions if we are using DeltaSigma(R) rather than
gamma_t(theta). There are a few extra considerations:
I think you can rely on us using gamma_t(theta) for a fixed source redshift
distribution per source redshift bin (not different ones for each
radial/richness/lens redshift bins etc.)
i) Boost factor - a value per tangential shear datapoint. The boost
factors have a covariance matrix. Is there covariance between the boost
factors and the raw measurements?
I don't think there is a cross-covariance.
Note that Eduardo, Tamas and I are still discussing whether we need the
boost factor data points or can do the correction at the data vector level.
I am thinking the former, but maybe I am thinking too complicated.
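If the correction is done at the data vector level and there is no cross-covariance between the boosts and the raw measurements (as suggested above), it could be sketched roughly like this. This is a hypothetical, minimal sketch: the function name apply_boost, the first-order error propagation, and all the numbers are illustrative assumptions, not part of the library.

```python
import numpy as np

# Hypothetical sketch: apply boost factors to the raw gamma_t data vector,
# assuming no cross-covariance between boosts and raw measurements.
def apply_boost(gamma_t_raw, cov_raw, boost, cov_boost):
    """Boost-corrected gamma_t and a first-order propagated covariance."""
    gamma_t = boost * gamma_t_raw
    # For independent b and g: Cov[b*g]_ij ~ b_i b_j Cov_g_ij + g_i g_j Cov_b_ij
    cov = (np.outer(boost, boost) * cov_raw
           + np.outer(gamma_t_raw, gamma_t_raw) * cov_boost)
    return gamma_t, cov

# Made-up placeholder values for three tangential shear datapoints:
g = np.array([0.05, 0.03, 0.02])
b = np.array([1.10, 1.05, 1.02])
g_corr, cov_corr = apply_boost(g, np.diag([1e-4, 5e-5, 2e-5]),
                               b, np.diag([1e-4, 1e-4, 1e-4]))
```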
ii) Clusters are split into richness bins as well as redshift bins. This
does not necessarily require generalizing the SpectrumMeasurement class. We
could either
a. Have a separate SpectrumMeasurement instance (i.e. separate extensions
at the file level) per richness bin (each containing all the z bin
combinations for that richness bin)...
b. Hold all (richness, redshift) bins in one SpectrumMeasurement instance
(i.e. the same extension at the file level), e.g. ordered by (R_i = richness
bin i, zc_j = cluster redshift bin j, zs_k = source redshift bin k):
R_1, zc_1, zs_1
R_1, zc_1, zs_2
...
R_1, zc_2, zs_1
R_1, zc_2, zs_2
...
R_2, zc_1, zs_1
R_2, zc_1, zs_2
etc.
The 'bin1' index would then be R_i * (# of cluster z bins) + zc_j (with
zero-based i and j), and the bin2 index would just be the source redshift
bin index k as usual.
For this solution, we would probably still want to generalize (make a
child class) of SpectrumMeasurement which can translate between bin1 and
R_i, zc_j.
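A minimal sketch of that flattened indexing (zero-based here; to_bin1 and from_bin1 are hypothetical helper names, not existing library functions):

```python
# Hypothetical sketch of the flattened 'bin1' index described above,
# with zero-based richness index r and cluster-z index zc:
#   bin1 = r * n_cluster_z_bins + zc
def to_bin1(r, zc, n_cluster_z_bins):
    return r * n_cluster_z_bins + zc

def from_bin1(bin1, n_cluster_z_bins):
    # Inverse mapping: recover (r, zc) from the flattened index.
    return divmod(bin1, n_cluster_z_bins)
```

A child class of SpectrumMeasurement could wrap exactly this pair of translations.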
Option b sounds better to me, and also maps well to the cosmolike way of
handling data.
iii) We want to include count information. This is just a number (per
area?) per (richness, cluster z) bin. It also needs an accompanying
row/column in the covariance matrix.
I guess it's really the raw count at the data vector level. Where is the
metadata about the other data vectors that's required for things like
covariances (source density, survey area, etc.)? The information on area
should be handled the same way.
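Appending those counts to the data vector and extending the covariance could look like this minimal sketch (append_counts is a hypothetical helper; the cross-covariance block between the 2-point part and the counts is assumed to be supplied from elsewhere, e.g. a theory covariance):

```python
import numpy as np

# Hypothetical sketch: append one raw count per (richness, cluster z) bin to
# the 2-point data vector, extending the covariance with matching rows/columns.
def append_counts(data_2pt, cov_2pt, counts, cov_counts, cov_cross):
    """cov_cross has shape (len(data_2pt), len(counts))."""
    data = np.concatenate([data_2pt, counts])
    cov = np.block([[cov_2pt, cov_cross],
                    [cov_cross.T, cov_counts]])
    return data, cov

# Made-up placeholder values: 4 shear datapoints, 2 count bins.
data, cov = append_counts(np.zeros(4), np.eye(4),
                          np.array([120.0, 80.0]), np.eye(2),
                          np.zeros((4, 2)))
```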
@danielgruen Do we need to save n(Lambda) (as in the distribution, not just the total) for each redshift/richness bin? I would have thought this would be required to predict P(M) for that bin, e.g. P(M) = \int dlambda P(M | lambda) n(lambda) or something...
Apparently the full n(Lambda) distribution isn't necessary. Next question is how to include the redshift selection function information. The redshift selection function for a given cluster bin is just a top-hat in cluster photo-z. However, information from the catalogs (and I think this file should contain all information required from the catalogs) is required to generate P(z_true | photo-z) for the likelihood calculation. How should we store this information? A couple of options:
i) For each cluster bin, store arrays of finely spaced z, mean(sigma_z) and std(sigma_z) (where sigma_z is the RedMapper reported photo-z error, and the mean/std is over all clusters in some finely spaced z bin).
ii) Matteo uses a polynomial fit to sigma_z(z) for each lambda bin - we could store an object that holds the polynomial coefficients.
Both of these options could be accompanied by a function that returns sigma_z(z, cluster bin). One advantage of (ii) is that it will be faster for repeated evaluations, e.g. in integrals. But of course one could also fit a polynomial to the arrays stored in option (i).
Thoughts?
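The bridge between the two options can be sketched as fitting a polynomial to the option-(i) arrays. This is a minimal sketch with made-up placeholder values: the z range and the sigma_z numbers are illustrative assumptions, not real catalog values.

```python
import numpy as np

# Hypothetical sketch: fit a polynomial (option ii) to the tabulated
# mean sigma_z(z) arrays of option (i) for one cluster bin.
z_grid = np.linspace(0.2, 0.65, 50)   # finely spaced z grid (assumed range)
mean_sigma_z = 0.01 + 0.02 * z_grid   # stand-in for the measured means
coeffs = np.polyfit(z_grid, mean_sigma_z, deg=3)  # highest power first
```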
Yeah, that sounds reasonable. I guess the numbers make the ordering of the
polynomial coefficients pretty unambiguous ...
c[0]*z**N + c[1]*z**(N-1) + ... + c[N]
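That ordering (highest power first) matches numpy's polyval convention; a quick illustrative check with made-up coefficients:

```python
import numpy as np

# Highest-power-first ordering: sigma_z(z) = c[0]*z**N + c[1]*z**(N-1) + ... + c[N]
c = [2.0, -1.0, 0.5]                 # hypothetical coefficients, N = 2
z = 3.0
manual = c[0]*z**2 + c[1]*z + c[2]   # 18.0 - 3.0 + 0.5
assert np.polyval(c, z) == manual
```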
Yeah, that is what I went with in the end. There's a get_sigma_z function that at least removes ambiguity for users of an already made file. But yeah, the ordering on input is fairly ambiguous; I guess this just needs to be documented very clearly...
FWIW, I've updated the mock-ish Y1 files to include a mock-ish polynomial
sigma_z at https://www.slac.stanford.edu/~dgruen/lighthouse/