Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use external package betterproto-rust-codec for better (de-)serialization performance #545

Merged
merged 2 commits into from
Dec 7, 2023

Conversation

124C41p
Copy link
Contributor

@124C41p 124C41p commented Dec 4, 2023

This PR is an alternative to #520. It uses the same Rust based extension module to improve (de-)serialization performance of betterproto significantly. However, instead of integrating the extension into the betterproto project, I published it to PyPi separately, and only referenced it here as an optional dependency. For this reason, this branch is ready to use as it is.

Copy link
Collaborator

@Gobot1234 Gobot1234 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed on slack, I think I'd prefer this to patch the parse function or conditionally define it whichever works better for you just because this current implementation will slow down the library for everyone

cetanu
cetanu previously approved these changes Dec 4, 2023
Copy link
Collaborator

@cetanu cetanu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks okay to me, will only be used if the optional package is installed, correct?

Copy link
Collaborator

@cetanu cetanu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, this introduces two new try/except for every call to bytes, is that right?

@cetanu cetanu dismissed their stale review December 4, 2023 23:44

further question

@124C41p
Copy link
Contributor Author

124C41p commented Dec 5, 2023

I see your point. Is it better now?

Comment on lines +1871 to +1886
# monkey patch (de-)serialization functions of class `Message`
# with functions from `betterproto-rust-codec` if available
try:
import betterproto_rust_codec

def __parse_patch(self: T, data: bytes) -> T:
betterproto_rust_codec.deserialize(self, data)
return self

def __bytes_patch(self) -> bytes:
return betterproto_rust_codec.serialize(self)

Message.parse = __parse_patch
Message.__bytes__ = __bytes_patch
except ModuleNotFoundError:
pass
Copy link
Collaborator

@Gobot1234 Gobot1234 Dec 5, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd personally just do something along the lines of

HAS_ACCELERATOR = cast(Literal[False], importlib.util.find_spec("betterproto_rust_codec"))
if HAS_ACCELERATOR:
    import betterproto_rust_codec

class Message:
	def parse():...
	def __bytes__():...
	
	if HAS_ACCELERATOR:
		parse = betterproto_rust_codec.deserialize  # edit this so it returns self
		__bytes__ = betterproto_rust_codec.serialize

because this reduces the number of call frames and looks significantly less hacky

Copy link
Contributor Author

@124C41p 124C41p Dec 5, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't really understand why, but directly assigning __bytes__ = betterproto_rust_codec.serialize does not work for some reason. So editing deserialize would probably not help either.

What do you think about this approach?

class Message:
    if HAS_ACCELERATOR:
        def parse(self, data):
            ...
    else:
        def parse(self, data):
            ...

I think this might be even more transparent.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find that problematic because it's another level of indentation which is annoying especially with black. If the function isn't binding as a descriptor just leave it as is

Copy link
Collaborator

@Gobot1234 Gobot1234 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks LGTM!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants