Should Series[Any] be used internally instead of Series? #1133

Open
MarcoGorelli opened this issue Feb 28, 2025 · 1 comment

Comments

@MarcoGorelli
Member

Currently, Series is used in several places where the inner type of the Series isn't known, e.g.:

@overload
def compare(
    self,
    other: Series,
    align_axis: AxisColumn = ...,
    keep_shape: bool = ...,
    keep_equal: bool = ...,
) -> DataFrame: ...

There are a couple of issues I'm running into with this:

First, the pyright-strict job marks this as partially unknown:

/home/runner/work/pandas-stubs/pandas-stubs/tests/test_series.py:1039:5 - error: Type of "compare" is partially unknown
Type of "compare" is "Overload[(other: Series[Unknown], align_axis: Literal['index', 0], keep_shape: bool = ..., keep_equal: bool = ...) -> Series[Unknown], (other: Series[Unknown], align_axis: Literal['columns', 1] = ..., keep_shape: bool = ..., keep_equal: bool = ...) -> DataFrame]" (reportUnknownMemberType)

Second, when using pyright with --verifytypes to look for uncovered parts of the public API, this is flagged as "unknown type":

            {
                "category": "function",
                "name": "pandas.testing.assert_series_equal",
                "referenceCount": 1,
                "isExported": true,
                "isTypeKnown": false,
                "isTypeAmbiguous": false,
                "diagnostics": [
                    {
                        "file": "/home/marcogorelli/type_coverage_py/.pyright_env_pandas/lib/python3.12/site-packages/pandas/_testing/__init__.pyi",
                        "severity": "error",
                        "message": "Type of parameter \"left\" is partially unknown\n  Parameter type is \"Series[Unknown]\"\n    Type argument 1 for class \"Series\" has unknown type",
                        "range": {
                            "start": {
                                "line": 4,
                                "character": 27
                            },
                            "end": {
                                "line": 4,
                                "character": 46
                            }
                        }
                    },

Would it be OK to use Series[Any] instead of bare Series in such cases? Alternatively, as some libraries do, we could introduce a type alias Incomplete: TypeAlias = Any to mean "we should be able to narrow down the type, but for now we're not doing so" and use that in some cases.

The latter use case (--verifytypes) could, I think, really help to prioritise which stubs to add.
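
As a rough illustration of that prioritisation, the sketch below filters symbol entries shaped like the report excerpt above and lists the exported ones whose type is not fully known. The function name and the second sample entry are made up, and how the list of symbols is extracted from the full JSON report is left to the caller, since the excerpt does not show the surrounding structure:

```python
from typing import Any


def unknown_exported_symbols(symbols: list[dict[str, Any]]) -> list[str]:
    """Return sorted names of exported symbols whose type is not fully known."""
    return sorted(
        sym["name"]
        for sym in symbols
        if sym.get("isExported") and not sym.get("isTypeKnown")
    )


# Entries use the same keys as the excerpt; the second entry is hypothetical.
sample = [
    {
        "category": "function",
        "name": "pandas.testing.assert_series_equal",
        "isExported": True,
        "isTypeKnown": False,
    },
    {
        "category": "function",
        "name": "example.fully_typed",
        "isExported": True,
        "isTypeKnown": True,
    },
]

print(unknown_exported_symbols(sample))
```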

@Dr-Irv
Collaborator

Dr-Irv commented Feb 28, 2025

We've had a discussion about this in another PR. See #1093 (comment)

The idea is to create an UnknownSeries type to use wherever we don't know the inner type. I think this may solve the problem you raise above.


2 participants