JP-3456 - Replace use of scipy.signal.medfilt in outlier detection #8033

Conversation
Codecov Report

```
@@           Coverage Diff           @@
##           master    #8033   +/-   ##
=======================================
  Coverage   75.93%   75.93%
=======================================
  Files         459      459
  Lines       37608    37612    +4
=======================================
+ Hits        28557    28561    +4
  Misses       9051     9051
```

This pull request uses carry forward flags. View full report in Codecov by Sentry.
```diff
@@ -53,7 +55,7 @@ install_requires =
     tweakwcs>=0.8.3
     asdf-astropy>=0.3.0
     wiimatch>=0.2.1
-    packaging>19.0
+    packaging>20.0
```
This was required for the minimum version (0.19) of scikit-image.
Looks ok to me. From our Slack messages I assume you have run the outlier detection regression tests and they now show no differences.
Thanks for taking a look.

I should clarify, this PR does cause 12 regtest failures. The failures are from 2 parameterized tests (`test_spec3_multi` and `test_spec3_multi_emsm`). These are the same tests that fail when scipy is updated, or when run on a mac, and can be explained by the undefined (and unreliable) output of `scipy.signal.medfilt` when provided `np.nan` input (as occurs in the failing tests). To use the example in the unit test added with this PR:

```python
arr = [2, np.nan, 0]
ksize = [3]
```

The expected output using a median filter (and filling edges with 0s, as is done by default for `scipy.signal.medfilt`) should be `[1, 1, 0]`.
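A minimal sketch of a nan-aware median filter that reproduces this expected output (using numpy's `sliding_window_view` here for brevity; the PR itself builds on `skimage.util.view_as_windows`):

```python
import numpy as np

def nanmedfilt1d(arr, ksize):
    # zero-pad the edges, matching scipy.signal.medfilt's default behavior
    pad = ksize // 2
    padded = np.pad(np.asarray(arr, dtype=float), pad, constant_values=0)
    # view each length-ksize window and take the nan-ignoring median
    windows = np.lib.stride_tricks.sliding_window_view(padded, ksize)
    return np.nanmedian(windows, axis=-1)

print(nanmedfilt1d([2, np.nan, 0], 3))  # -> [1. 1. 0.]
```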
If I run the following on my mac with scipy 1.9.3, I get a different and incorrect output:

```
>>> scipy.signal.medfilt([2, np.nan, 0], [3])
array([nan, nan, 0.])
```

With scipy 1.11.3 I get a different (and also incorrect) output:

```
>>> scipy.signal.medfilt([2, np.nan, 0], [3])
array([0., 0., 0.])
```
Other examples show differences for linux vs mac.

As the truth files were generated using `scipy.signal.medfilt` and the input contains `nan`, the truth file contents depend on the undefined behavior of the version of scipy used to generate the file. The truth files for these tests will need to be updated if the output is acceptable (and this PR is merged).
To further clarify, with this PR the new output (which differs from the truth files) is identical on mac and linux for the `_crf` and `_x1d` files. The `_s3d` files show differences (< 1e-5) between mac and linux. This is independent of this PR: other tests that generate `_s3d` files show differences between mac and linux on the main branch (as seen in the jenkins devdeps tests), so this looks to be a separate issue, one not addressed in this PR.

For the `_crf` files, the differences are only for pixels with a `nan` within the 7x7 window used for the median filter. These are largely found on the edges of usable pixels. Take the first example in the diff for file jw01249005001_03101_00001_nrs1_o005_crf.fits. The fitsdiff output reports:
```
Extension HDU 1 (SCI, 1):
   Data contains differences:
     Data differs at [1716, 94]:
        a> nan
        b> -103.99343
...
   Data contains differences:
     Data differs at [1716, 94]:
        a> 33554449
         ?       ^^
        b> 33554432
```
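The second difference (integer values, presumably the DQ extension) can be decoded with a quick sketch; the flag names below are assumptions based on the standard JWST DQ bit assignments, not taken from this PR:

```python
# decode which bits changed between result (a) and truth (b)
a, b = 33554449, 33554432
changed = a ^ b
print(changed, bin(changed))  # 17 0b10001: bits 0 and 4 set
# assumed mapping: bit 0 = DO_NOT_USE, bit 4 = OUTLIER
```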
This indicates that the pixel at [93, 1715] (using numpy indices) is being flagged as an outlier in the result file but not in the truth file. Inspecting the `minarr` calculated during the outlier detection step:
jwst/jwst/outlier_detection/outlier_detection_ifu.py, lines 220 to 226 (at 2199ba8):

```python
minarr = np.nanmin(diffarr, axis=0)
# Normalise the differences to a local median image to deal with ultra-bright sources
normarr = medfilt(minarr, kernel_size=kern_size)
nfloor = np.nanmedian(minarr)/3
normarr[normarr < nfloor] = nfloor  # Ensure we never divide by a tiny number
minarr_norm = minarr / normarr
```
```
>> minarr[93, 1715]
66.545156955719
>> minarr[90:97, 1712:1719]
array([[        nan,         nan,         nan,         nan,         nan,
                nan,         nan],
       [        nan,         nan,         nan,         nan,         nan,
                nan,         nan],
       [        nan,         nan,         nan,         nan,         nan,
                nan,         nan],
       [29.89870262,  4.97431731, 19.80752754, 66.54515696, 27.27770805,
        17.14254284,  2.87409228],
       [ 0.38395762,  2.3855598 ,  4.7994982 ,  8.4161818 ,  1.58876133,
         3.21722126,  1.05671763],
       [ 0.14560556,  1.19486213,  0.89728248,  0.40425563,  1.58876133,
         1.25159645,  0.71704447],
       [ 0.14560556,  1.19486213,  0.89728248,  0.40425563,  2.14912176,
         0.49365234,  0.51363105]])
```
This pixel is on the edge of a group of usable pixels. Inspecting the computed `normarr` (the output of `medfilt`) for this PR we see:
```
>> normarr[90:97, 1712:1719]
array([[19.80752754, 27.27770805, 27.27770805, 19.80752754, 17.14254284,
        19.80752754, 17.14254284],
       [ 6.30404165,  7.02648562,  6.69524956,  4.88690776,  4.00835973,
         4.62222749,  4.19920981],
       [ 3.89869487,  3.89869487,  2.3855598 ,  2.3855598 ,  2.3855598 ,
         2.78641891,  2.78641891],
       [ 1.19486213,  1.58876133,  1.42017889,  1.42017889,  1.42017889,
         1.58876133,  1.58876133],
       [ 1.0041163 ,  1.58876133,  1.25159645,  1.25159645,  1.25159645,
         1.58876133,  1.51533907],
       [ 0.9249387 ,  1.30883139,  1.22322929,  1.22322929,  1.19486213,
         1.33719856,  1.46906986],
       [ 0.91056663,  1.42280066,  1.19486213,  1.25159645,  1.19486213,
         1.25159645,  1.42280066]])
```
This differs substantially from the `normarr` computed without this PR (using `scipy.signal.medfilt`):
```
>> scipy.signal.medfilt(minarr)[90:97, 1712:1719]
array([[        nan,         nan,         nan,         nan,         nan,
                nan,         nan],
       [        nan,         nan,         nan,         nan,         nan,
                nan,         nan],
       [19.80752754, 27.27770805, 27.27770805, 19.80752754,         nan,
        19.80752754, 17.14254284],
       [19.80752754,  8.83398247, 27.27770805, 19.80752754, 17.14254284,
        27.01134586,         nan],
       [ 3.89869487,  2.41227081,  2.41227081,  2.41227081,  2.2597177 ,
         2.78641891,  2.2597177 ],
       [ 1.19486213,  1.62179971,  1.58876133,  1.58876133,  1.58876133,
         1.62179971,  1.62179971],
       [ 0.91056663,  1.42280066,  1.19486213,  1.25159645,  1.19486213,
         1.25159645,  1.42280066]])
```
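The divergence along the edge rows is expected: `np.nanmedian` returns the median of whatever valid pixels remain in each window, while `scipy.signal.medfilt`'s handling of `nan` during its sort is undefined. A 1-D sketch of the contrast (illustrative values; the actual step uses a 7x7 window):

```python
import numpy as np
import scipy.signal

# a window dominated by nan with a few valid pixels, as at the
# edge of a group of usable pixels
window = np.array([np.nan, np.nan, np.nan, 29.9, 5.0, 19.8, 66.5])
print(np.nanmedian(window))             # median of the valid pixels only
print(scipy.signal.medfilt(window, 7))  # undefined/platform-dependent with nan input
```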
Inspecting the value of `normarr` used in the further outlier detection for the pixel in question gives 2 very different results for this PR vs main:

```
>> scipy.signal.medfilt(minarr)[93, 1715]
19.807527542114258
>> normarr[93, 1715]
1.4201788902282715
```
As the `normarr` values are later used to compute the ratio of the `minarr` value relative to the local median, this pixel is ~66 / 1.4 with this PR and flagged as an outlier, vs ~66 / 19.8 on main and not flagged as an outlier. As this 'flag' is used to mark bad pixels in all 8 input files, this explains the 8 `_crf` test failures.
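A back-of-the-envelope sketch of that arithmetic (the threshold value here is a placeholder for illustration, not the step's actual parameter):

```python
minarr_val = 66.545  # minarr[93, 1715] from above
threshold = 10.0     # placeholder; the real step reads this from its parameters

for label, norm in [("main (scipy.signal.medfilt)", 19.8075),
                    ("this PR (nan-aware median)", 1.4202)]:
    ratio = minarr_val / norm
    print(f"{label}: ratio = {ratio:.1f}, outlier = {ratio > threshold}")
```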
@jemorrison do these changes look reasonable to you?
Good question - I think the README updates should stick around until the most recent build allows installation on 3.12, i.e. we would need a release including this update (with relaxed pinning) before the README edits are no longer necessary. From what I understand, most installations are from pip and not from git clones of the repo.
@braingram Which batch of regtests were run after the last code updates to the PR? Just want to make sure I see differences that are relevant to the current code.
Looks good overall. The only change I'd like to see is keeping the upper limit on python, for now.
This is the most recent run (~20 days old). Given the changes since then, I expect the 12 tests that failed in the above run will fail again in the re-run (for the reasons described above). A new run is starting at: https://plwishmaster.stsci.edu:8081/job/RT/job/JWST-Developers-Pull-Requests/1056/
@hbushouse the new run finished with 13 errors: the 12 expected above and 1 new (association-related) one.
In the most recent regtest run a lot of the differences are minor changes to pixel values in SCI and ERR arrays, which is expected. But all the crf products for one test now show NaN values for SCI pixels that previously had a number. Shouldn't this change go the other direction, i.e. changing previous NaN values to a number? Or is it now properly trapping cases where it can't compute a median and setting them to NaN?
Are the differences that you're asking about similar to the following:

If so, I will attempt to summarize the description here: #8033 (comment)

If this isn't the change you're asking about please let me know and I can take another look at the results.
Yes, those are the kinds of changes I was wondering about. Thanks for the explanation. So in that case I think it looks good. I have no idea what's causing the new association-related failure. I'm assuming a fluke in the test run. I haven't seen it in other test runs today.
Thanks for the review!
Resolves JP-3456
Closes #8050

While testing removal of the scipy upper pin it was found that `scipy.signal.medfilt` (used during outlier detection) has undefined behavior for `nan` inputs: scipy/scipy#4800

Examining regression test warnings we can see that `nan` values are making their way into the `medfilt` input:

jwst/jwst/outlier_detection/outlier_detection_ifu.py, lines 219 to 223 (at 7f1c5ca)
This is causing a number of issues.
This PR removes the use of `scipy.signal.medfilt` and replaces it with code that uses `np.nanmedian` and `skimage.util.view_as_windows` (and adds `scikit-image` as a dependency).

Limited local testing suggests that the new approach is slightly slower for the median calculation (~20%).
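A minimal sketch of the replacement approach (the function name and the zero-padding choice here are illustrative; see the PR diff for the actual implementation):

```python
import numpy as np
from skimage.util import view_as_windows

def nan_median_filter(image, ksize=7):
    # zero-pad the edges (matching scipy.signal.medfilt's default),
    # build a view of ksize x ksize windows, and take the
    # nan-ignoring median over each window
    pad = ksize // 2
    padded = np.pad(np.asarray(image, dtype=float), pad, constant_values=0)
    windows = view_as_windows(padded, (ksize, ksize))
    # all-nan windows still yield nan (with a RuntimeWarning)
    return np.nanmedian(windows, axis=(-2, -1))
```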
The truth files for a few regression tests will need to be updated to match the changes in this PR.
As this PR addresses the issue that required the scipy upper pin, this PR:

- re-adds python 3.12 testing (which was held up by the scipy pin). See: #8033 (comment)
- adds python 3.12 support (there are additional deprecation warnings that will need to be addressed in follow-up PRs, see: Address python 3.12 deprecations #8046). See: #8033 (comment)

Regression tests run at: https://plwishmaster.stsci.edu:8081/blue/organizations/jenkins/RT%2FJWST-Developers-Pull-Requests/detail/JWST-Developers-Pull-Requests/1035/pipeline/192
The regression test results show the expected failures in:

- test_spec3_multi
- test_spec3_multi_emsm

as these tests involve using `medfilt` with inputs that contain `nan` values.

A spot check with a single test run on my local mac machine matched the differences in the above regtest results.
https://plwishmaster.stsci.edu:8081/blue/organizations/jenkins/RT%2FJWST-Developers-Pull-Requests/detail/JWST-Developers-Pull-Requests/1036/pipeline/207 ran these tests on both mac and linux to see if the differences are consistent. There are some minor numerical differences (10^-8) for test_spec3_multi_emsm between the mac and linux runs, and a small difference in the number of reported pixels with differences (973963 vs 974002).