Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support more magic methods #582

Merged
merged 5 commits into from
Jan 24, 2025
Merged

Support more magic methods #582

merged 5 commits into from
Jan 24, 2025

Conversation

philsmt
Copy link
Contributor

@philsmt philsmt commented Dec 19, 2024

By chance I stumbled upon weird exceptions when you try to use the in operator on a DataCollection. Due to interactions in the Python data model, the absence of a custom __contains__ causes it to try treating DataCollection as a sequence...

This PR adds proper support for __contains__ in DataCollection via a custom magic method. While at it, I also added support for the OR operator for merging (both binary and infix) and excplicitly disabled direct iteration to avoid the issue above.

Open questions:

  • Should __contains__ allow train ID checks? It is somewhat weird to use, but without it we would not support everything one can pass to __getitem__
  • Any other magic methods we could add? What I deliberated and rejected:
    • __eq__: somewhat straightforward to add semantically, but probably not useful outside of tests
    • __len__: which axis to use, trains or sources/keys?
    • __iter__: Rather iterate explicitly via the provided methods, and avoid weird data model interactions

@philsmt philsmt force-pushed the feat/more-magic-methods branch from 14e1c48 to 82582b9 Compare December 19, 2024 08:29
extra_data/reader.py Outdated Show resolved Hide resolved
@takluyver takluyver added this to the 1.19 milestone Jan 23, 2025
@takluyver
Copy link
Member

Should __contains__ allow train ID checks?

Good question. I lean towards no, at least for now. It's easy enough to write tid in foo.train_ids.

Any other magic methods we could add?

Let's not for the moment. I could see an argument for making DataCollection behave more like a mapping of source names to SourceData objects, i.e. supporting iter & len over source names.

@takluyver takluyver merged commit a165940 into master Jan 24, 2025
8 checks passed
@takluyver takluyver deleted the feat/more-magic-methods branch January 24, 2025 13:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants