
Improve type_caster for complex number types. #854

Open · wants to merge 3 commits into master
Conversation

hpkfft (Contributor) commented Jan 12, 2025

Results for this PR were obtained with Python 3.12.8 and g++ 14.2.0.

Currently, after importing the module defined in tests/test_stl.cpp as t and defining

    class MyComplex:
        def __init__(self, real, imag):
            self.re = real
            self.im = imag
        def __complex__(self):
            return complex(self.re, self.im)

the following raises a TypeError:

    assert t.complex_value_double(MyComplex(1, 2)) == complex(1.0, 2.0)

Instead, it is expected that the __complex__ method be called and the assertion be correct.
Note that the Stable ABI behavior of PyComplex_RealAsDouble() and PyComplex_ImagAsDouble() was fixed in Python 3.13. See CPython issue 109598.
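A pure-Python sketch of the conversion the caster is expected to perform: Python's own complex() constructor consults __complex__, so an argument caster that mirrors it accepts any object implementing that protocol.

```python
class MyComplex:
    def __init__(self, real, imag):
        self.re = real
        self.im = imag

    def __complex__(self):
        # Called by complex(obj) -- and, after this PR, by the caster.
        return complex(self.re, self.im)

# complex() itself honors the protocol; the caster should match this.
assert complex(MyComplex(1, 2)) == complex(1.0, 2.0)
```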

Also, the current complex type_caster does not honor nb::arg().noconvert().
The following test passes

    with pytest.raises(TypeError, match="incompatible function arguments"):
        t.complex_value_double_nc(0)

but given

    class MyInt(int):
        def __new__(cls, value):
            return super().__new__(cls, value)

the following tests fail with DID NOT RAISE <class 'TypeError'>:

    with pytest.raises(TypeError, match="incompatible function arguments"):
        t.complex_value_double_nc(MyInt(7))

    with pytest.raises(TypeError, match="incompatible function arguments"):
        t.complex_value_double_nc(0.0)

    with pytest.raises(TypeError, match="incompatible function arguments"):
        t.complex_value_float_nc(1.1 + 2.0j)  # Inexact narrowing conversion
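For the narrowing case, here is a minimal pure-Python sketch of the exactness check a noconvert std::complex&lt;float&gt; caster needs (narrows_exactly is an illustrative name, not part of nanobind):

```python
import struct

def narrows_exactly(x: float) -> bool:
    # True if x survives a round trip through IEEE-754 single precision.
    return struct.unpack("f", struct.pack("f", x))[0] == x

# 2.0 is exactly representable as a float32, but 1.1 is not, so a
# noconvert caster for std::complex<float> should reject 1.1 + 2.0j.
assert narrows_exactly(2.0)
assert not narrows_exactly(1.1)
```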

This PR rewrites the complex type_caster.

Regarding performance, given the extension

    m.def("cadd", [](std::complex<double> x, std::complex<double> y) {
                          return x + y; });

    m.def("caddf", [](std::complex<float> x, std::complex<float> y) {
                          return x + y; });

and measuring time with

    python3 -m timeit -n 100000000 -r 10 \
        -s "import numpy as np, my_extension as me; x = complex(1); y = complex(2)" \
        "me.cadd(x, y)"

I obtained the following results:

|                     | old 312 | new 312 | old abi3 | new abi3 |
| ------------------- | ------- | ------- | -------- | -------- |
| cadd(complex)       | 27.6 ns | 23.7 ns | 31.0 ns  | 27.5 ns  |
| caddf(complex)      | 28.0 ns | 26.5 ns | 29.2 ns  | 29.8 ns  |
| cadd(np.complex128) | 42.8 ns | 30.0 ns | 45.7 ns  | 44.1 ns  |
| caddf(np.complex64) | 393. ns | 81.1 ns | 403. ns  | 284. ns  |

For the last two lines above, note that

    >>> issubclass(np.complex128, complex)
    True
    >>> issubclass(np.complex64, complex)
    False
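This matters because an exact complex or any subclass of it (such as np.complex128) can be read directly, while np.complex64 must go through the __complex__ protocol. A hedged Python sketch of that dispatch, with illustrative stand-in classes instead of the NumPy types:

```python
def to_c_complex(obj):
    # Fast path: complex and its subclasses (e.g. np.complex128)
    # can be read directly.
    if isinstance(obj, complex):
        return complex(obj)
    # Slow path: fall back to the __complex__ protocol, which is
    # all that np.complex64 offers.
    dunder = getattr(type(obj), "__complex__", None)
    if dunder is not None:
        return dunder(obj)
    raise TypeError("object is not convertible to complex")

class Sub(complex):      # stands in for np.complex128
    pass

class Proto:             # stands in for np.complex64
    def __complex__(self):
        return 1 + 2j

assert to_c_complex(Sub(1, 2)) == 1 + 2j   # fast path
assert to_c_complex(Proto()) == 1 + 2j     # slow path
```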

I would expect that compiling new abi3 with -DPy_LIMITED_API=0x030D0000 would reduce caddf(np.complex64) by 2X, but I don't have Python 3.13 handy.

wjakob (Owner) commented Jan 14, 2025

Like with the floating point type casters, could you move the (now fairly large) body of the converter to the static library component?

hpkfft (Contributor, Author) commented Jan 14, 2025

Sure. Do you prefer not to include <complex> in the static library, or is it OK to do so for a single cpp file?

wjakob (Owner) commented Jan 14, 2025

It's fine to include in the cpp file.

hpkfft (Contributor, Author) commented Jan 15, 2025

I ran the caddf(np.complex64) benchmark with Python 3.13.1

| build                | caddf(np.complex64) |
| -------------------- | ------------------- |
| old 313 non-STABLE   | 380. ns             |
| new 313 non-STABLE   | 82.1 ns             |
| old abi3 0x030C0000  | 384. ns             |
| new abi3 0x030C0000  | 274. ns             |
| new abi3 0x030D0000  | 136. ns             |

The old complex caster used the C API functions from the stable ABI regardless of whether STABLE_ABI was specified. Prior to version 3.13, these functions had a bug (or, at least, some undesirable behavior) when the Python object was neither complex nor a subclass of complex. Since np.complex64 is important to nanobind users, a workaround was needed.

The default from_python</*Recursive=*/true> made a temporary Python object and called from_python</*Recursive=*/false> on it. It did this using the heuristic that an object with an attribute named imag should probably be converted to complex. This is not foolproof, but it works for np.complex64.
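The heuristic can be sketched in Python as follows (looks_complex and the stand-in class are illustrative names, not nanobind code):

```python
def looks_complex(obj) -> bool:
    # The old workaround's heuristic: an object exposing an 'imag'
    # attribute is probably convertible to a Python complex.
    return hasattr(obj, "imag")

class FakeComplex64:
    # Stands in for np.complex64, which is not a subclass of complex
    # but exposes both 'imag' and __complex__.
    def __init__(self, value):
        self._v = complex(value)

    @property
    def imag(self):
        return self._v.imag

    def __complex__(self):
        return self._v

x = FakeComplex64(1 + 2j)
assert looks_complex(x)            # heuristic accepts it
assert complex(x) == 1 + 2j        # temporary object conversion works
assert not looks_complex("text")   # strings are rejected
```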

The new complex caster uses PyComplex_AsCComplex(), which is not in the stable ABI, when STABLE_ABI is not specified. This is faster, and no workaround is needed.

The new caster uses the same complex C API functions as the old one did when STABLE_ABI is specified. A workaround is applied when the object is not complex, is not a subtype of complex, and has a __complex__ attribute. The workaround was also made faster, and the function from_python() is no longer templated.

Also, templating is removed from from_cpp() since there seems to be no reason to distinguish rvalues from lvalues.

For non-STABLE_ABI builds, the new caster is much faster, and for STABLE_ABI builds it's faster.

As this Note mentions, the behavior of a stable ABI function can change. In particular, the workaround is no longer needed in version 3.13, but it is still perfectly correct to define Py_LIMITED_API=0x030C0000 and build with the workaround code in place.

wjakob force-pushed the master branch 2 times, most recently from 98def19 to 92d9cb3 on January 27, 2025.