-
Notifications
You must be signed in to change notification settings - Fork 170
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
JP-3456 - Replace use of scipy.signal.medfilt
in outlier detection
#8033
Merged
Merged
Changes from all commits
Commits
Show all changes
5 commits
Select commit
Hold shift + click to select a range
File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,42 @@ | ||
import warnings | ||
|
||
import pytest | ||
import numpy as np | ||
import scipy.signal | ||
|
||
from jwst.outlier_detection.outlier_detection_ifu import medfilt | ||
|
||
|
||
|
||
@pytest.mark.parametrize("shape,kern_size", [ | ||
([7, 7], [3, 3]), | ||
([7, 7], [3, 1]), | ||
([7, 7], [1, 3]), | ||
([7, 5], [3, 3]), | ||
([5, 7], [3, 3]), | ||
([42, 42], [7, 7]), | ||
([42, 42], [7, 5]), | ||
([42, 42], [5, 7]), | ||
([42, 7, 5], [3, 3, 3]), | ||
([5, 7, 42], [5, 5, 5]), | ||
]) | ||
def test_medfilt_against_scipy(shape, kern_size): | ||
arr = np.arange(np.prod(shape), dtype='uint32').reshape(shape) | ||
result = medfilt(arr, kern_size) | ||
expected = scipy.signal.medfilt(arr, kern_size) | ||
np.testing.assert_allclose(result, expected) | ||
|
||
|
||
@pytest.mark.parametrize("arr,kern_size,expected", [ | ||
([2, np.nan, 0], [3], [1, 1, 0]), | ||
([np.nan, np.nan, np.nan], [3], [0, np.nan, 0]), | ||
]) | ||
def test_medfilt_nan(arr, kern_size, expected): | ||
with warnings.catch_warnings(): | ||
warnings.filterwarnings( | ||
"ignore", | ||
message="All-NaN slice", | ||
category=RuntimeWarning | ||
) | ||
result = medfilt(arr, kern_size) | ||
np.testing.assert_allclose(result, expected) |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was required for the minimum version (0.19) of scikit-image.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks ok to me. From our slack messages I assume you have run the outlier detection regression tests and they now show you no difference
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for taking a look.
I should clarify, this PR does cause 12 regtest failures. The failures are from 2 parameterized tests:
These are the same tests that fail when scipy is updated, or when run on a mac and can be explained by the undefined (and unreliable) output of
scipy.signal.medfilt
when provided anp.nan
input (as occurs in the failing tests). To use the examples in the unit test added with this PR.The expected output using a median filter (and filling edges with 0s as is done by default for
scipy.signal.medfilt
) should be[1, 1, 0]
.If I run the following on my mac, with scipy 1.9.3 I get a different and incorrect output:
with scipy 1.11.3 I get a different (and also incorrect):
Other examples show differences for linux vs mac.
As the truth files were generated using
scipy.signal.medfilt
and the input containsnan
the truth file contents depend on the undefined behavior of the version of scipy used to generate the file. The truth files for these tests will need to be updated if the output is acceptable (and this PR is merged).There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To further clarify, with this PR the new output (which differs from the truth files) is identical on mac and linux for the
_crf
and_x1d
files. The_s3d
files show differences (< 1e-5) on mac and linux (independent of this PR other tests that generate_s3d
files show differences between mac and linux on the main branch as seen in the jenkins devdeps tests so this looks to be a separate issue, one not addressed in this PR).For the
_crf
files, the differences are only for pixels with anan
within the 7x7 window used for the median filter. These are largely found on the edges of usable pixels. Taking the first example in the diff for filejw01249005001_03101_00001_nrs1_o005_crf.fits
. The fitsdiff output reports:indicating that the pixel at (using numpy indices) [93, 1715] is being flagged as an outlier in the result file but not in the truth file. Inspecting the
minarr
calculated during the outlier detection step:jwst/jwst/outlier_detection/outlier_detection_ifu.py
Lines 220 to 226 in 2199ba8
This pixel is on the edge of a group of usable pixels, inspecting the computed
normarr
(the output ofmedfilt
) for this PR we see:Which differs substantially from the
normarr
computed without this PR (usingscipy.signal.medfilt
):Inspecting the value of
normarr
used in the further outlier detection for the pixel in question gives 2 very different results for this PR vs main:As the
normarr
values are later used to compute the ratio of the minarr value relative to the local median this pixel with this PR is ~66 / 1.4 and flagged as an outlier vs ~66 / 19.8 and not flagged as an outlier. As this 'flag' is used to mark bad pixels in all 8 input files this explains the 8_crf
test failures.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jemorrison do these changes look reasonable to you?