Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Select refactor #1523

Merged
merged 27 commits into from
Feb 9, 2025
Merged

Select refactor #1523

merged 27 commits into from
Feb 9, 2025

Conversation

kartographer
Copy link
Contributor

@kartographer kartographer commented Jan 27, 2025

Description

The select method of UVData, UVCal, UVFlag, and UVBeam have been refactored, and two new additional keywords added:

  • invert: Allows you to specify which pieces of the object to discard rather than preserve.
  • strict: Allows you to adjust the error level [Exception / Warning / None] when supplying arguments that don't completely match object parameters (e.g., you give 3 times but only 2 match in time_array).

As part of this, several of the selection helper functions inside of utils have been refactored (and in some cases, new helper functions added), with several common pieces of code moved to utils.tools.

Additionally, a new method has been added to UVBase called _select_along_param_axis, which migrated code that used to be found in e.g., UVData._select_by_index, which allows users to perform a selection on UVParamters based on their expected form and a provided index array.

Finally, several warnings related to compatibility to FITS-formatted file type have been removed, specifically those related to length of elements in extra_keywords, as well as frequency and time spacing.

Motivation and Context

Well, this was definitely a case of learning to be wary of pulling particular threads...

The original motivation for this was aimed at providing an easy way to deselect entries (i.e., invert selection, a la #988), since this would be desirable for at least some SMA use cases. Attempting to add that was a little difficult with several of the select-helper functions, so I decide to take a heavy pass through them and try to clean them up. To add that, it made sense to modify the default behavior of select to not throw an error if some-but-not-all selection criteria matched supplied parameters (I was originally only going to enable this for the invert selection case, but upon rediscovering #1401 I made it work for the general select case). While I was working on this, I also discovered a lot of common code in UVData._select_by_index and UVCal._select_by_index, and so I moved that over to a new function called UVBase._select_along_param_axis, which should also be accessible to UVFlag and UVBeam (and be reasonably well-optimized).

Additionally, since i was already tinkering in the code, I have removed several warnings that were popping up when select was called related to time/frequency spacing and incompatibility with FITS-formatted file types. These warnings have caused a fair amount of confusion with SMA users, who are not writing to any of these types. Moreover, some of these warnings have been spurious, in that no user actions have caused this incompatibility, but rather the underlying data set itself is intrinsically incompatible with these file types. As it falls in the same vein, I've also taken the opportunity to address #1519, to additionally remove lots of extra noisy warnings.

Two notes:

  • I'm marking this as draft for now until I can touch base on the change of behavior here with select. The amount of additional work needed is relatively small (particularly as some test coverage is already provided, though how many additional test cases are needed could use some discussion). (all work is complete, tests added)

  • I'm marking this as a "breaking change", in that it is a change in behavior, although the new behaviors were things that would previously error or otherwise require a keyword to be set. Note that hera_cal is currently failing because it's trying to catch a particular error and a different one is being thrown (right type, wrong text). (this is now fixed)

Closes #988
Closes #1401
Closes #1519

Types of changes

  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

New feature checklist:

  • I have added or updated the docstrings associated with my feature using the numpy docstring format.
  • I have updated the tutorial to highlight my new feature (if appropriate).
  • I have added tests to cover my new feature.
  • All new and existing tests pass.
  • I have updated the CHANGELOG.

Breaking change checklist:

  • I have updated the docstrings associated with my change using the numpy docstring format.
  • I have updated the tutorial to reflect my changes (if appropriate).
  • My change includes backwards compatibility and deprecation warnings (if possible).
  • I have added tests to cover my changes.
  • All new and existing tests pass.
  • I have updated the CHANGELOG.

Copy link

codecov bot commented Jan 27, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 99.93%. Comparing base (8363bf5) to head (a3e80fb).
Report is 27 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1523      +/-   ##
==========================================
- Coverage   99.93%   99.93%   -0.01%     
==========================================
  Files          63       63              
  Lines       21852    21746     -106     
==========================================
- Hits        21837    21731     -106     
  Misses         15       15              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Member

@bhazelton bhazelton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow this is a lot of work! And it's fantastic, thank you! We should discuss in a telecon, but I love the approach.

I have just a few minor comments at this time. Thanks for catching a bunch of typos, my issues with spelling the plural of index is very clear 🤦‍♀️

Copy link

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Performance Alert ⚠️

Possible performance regression was detected for benchmark.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 2.

Benchmark suite Current: 92e9281 Previous: bca6cf0 Ratio
tests/utils/test_bls.py::test_bls_to_ant[min=0-len=1] 43395.14251787367 iter/sec (stddev: 0.0005591594001530778) 552981.6637057393 iter/sec (stddev: 0.0000021971044770532586) 12.74
tests/utils/test_bls.py::test_bls_to_ant[min=0-len=10] 123115.81383686775 iter/sec (stddev: 0.00020614044164823207) 535002.9574421683 iter/sec (stddev: 0.000004298097951139465) 4.35
tests/utils/test_bls.py::test_bls_to_ant[min=0-len=100] 125590.61732300112 iter/sec (stddev: 0.00006995575791353252) 417123.3205690947 iter/sec (stddev: 0.000008347855964930154) 3.32
tests/utils/test_bls.py::test_bls_to_ant[min=0-len=1000] 99731.57959016616 iter/sec (stddev: 0.000016816638553159388) 300258.59301592375 iter/sec (stddev: 0.00001001486660738103) 3.01
tests/utils/test_bls.py::test_bls_to_ant[min=0-len=10000] 24757.976887614717 iter/sec (stddev: 0.0003446919977400511) 56478.132904768885 iter/sec (stddev: 0.000012545332681308013) 2.28
tests/utils/test_bls.py::test_bls_to_ant[min=65536-len=1000] 138032.4088284642 iter/sec (stddev: 0.0003579659482395008) 285064.83970754384 iter/sec (stddev: 0.00001167646588819394) 2.07

This comment was automatically generated by workflow using github-action-benchmark.

@bhazelton
Copy link
Member

I made a PR on SSINS that seems to fix the problem we're seeing here: mwilensky768/SSINS#127

@bhazelton
Copy link
Member

I also made a PR on hera_cal to fix the issue in that test: HERA-Team/hera_cal#988

Copy link
Member

@bhazelton bhazelton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking really good, just a few minor comments.

Copy link
Member

@bhazelton bhazelton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. Thanks @kartographer! This is a lot of work and a big improvement!

@bhazelton bhazelton merged commit dd8b192 into main Feb 9, 2025
46 checks passed
@bhazelton bhazelton deleted the select_refactor branch February 9, 2025 15:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
2 participants