Using signs outside ASCII in response body #468

Open
dominikjeske opened this issue Jul 15, 2023 · 3 comments
Labels
bug Indicates an unexpected problem or unintended behavior

Comments

@dominikjeske

I'm testing pact-net and noticed odd behavior when using characters outside the ASCII range. I was using letters from my native Polish alphabet (such as ł, ó, ń). They are valid UTF-8, but when I use them in WithJsonBody, the mock server used under the hood returns an empty body. I cannot find in the documentation whether this is by design or a bug. I suspect a bug in the interop with the native library.

@adamrodger
Contributor

adamrodger commented Jul 16, 2023 via email

@adamrodger
Contributor

I've tested this out locally and it breaks pretty badly with non-ASCII characters in things like consumer/provider names, interaction names, provider states, etc. It generally seems to fail everywhere because the FFI reports invalid UTF-8.

That's a bit of a problem given C# strings are UTF-16 and the P/Invoke "Unicode" option also means UTF-16, whereas the FFI expects UTF-8. We may have to marshal strings manually or something.

You also get completely garbled responses back when you try to use the Unicode option to retrieve things like server logs:

[image: screenshot of the garbled output]

The difference is that I get errors whereas you report that the test passes but then silently contains no data. I've not managed to reproduce that, although I could definitely see that being a possibility.

I can only apologise for that, and recommend that in the meantime you only use ASCII characters until the issue can be resolved.
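
To illustrate the mismatch described above, here is a minimal Rust sketch of what a UTF-8 consumer sees when it is handed UTF-16 bytes. It is not code from pact-net or the Pact FFI: the helper names `utf16le_with_nul` and `read_as_utf8_c_str` are hypothetical, and it assumes the native side reads a NUL-terminated C string and then validates it as UTF-8.

```rust
use std::ffi::CStr;

/// Encode a string as UTF-16LE bytes plus a two-byte NUL terminator,
/// roughly what a UTF-16 string marshaller would hand to native code.
/// (Hypothetical helper for illustration only.)
fn utf16le_with_nul(s: &str) -> Vec<u8> {
    let mut bytes: Vec<u8> = s
        .encode_utf16()
        .flat_map(|unit| unit.to_le_bytes())
        .collect();
    bytes.extend_from_slice(&[0, 0]);
    bytes
}

/// Read a buffer the way a UTF-8 C-string consumer would: stop at the
/// first NUL byte, then require everything before it to be valid UTF-8.
/// (Hypothetical helper for illustration only.)
fn read_as_utf8_c_str(bytes: &[u8]) -> Result<String, String> {
    let c_str = CStr::from_bytes_until_nul(bytes).map_err(|e| e.to_string())?;
    c_str.to_str().map(str::to_owned).map_err(|e| e.to_string())
}

fn main() {
    // ASCII text: every second UTF-16LE byte is 0x00, so the read stops
    // after the first character and the rest is silently lost.
    println!("{:?}", read_as_utf8_c_str(&utf16le_with_nul("pact")));

    // "ó" is U+00F3, encoded as F3 00. A lone 0xF3 is an incomplete
    // UTF-8 sequence, so validation fails with an error.
    println!("{:?}", read_as_utf8_c_str(&utf16le_with_nul("ó")));

    // "ł" is U+0142, encoded as 42 01. Those two bytes happen to be
    // valid UTF-8 ("B" followed by a control character), so the read
    // succeeds but returns garbage.
    println!("{:?}", read_as_utf8_c_str(&utf16le_with_nul("ł")));
}
```

Under these assumptions, the outcome depends on the character: the text may be truncated, rejected as invalid UTF-8, or accepted as garbage. That would be consistent with seeing errors in some cases and silently wrong data in others, although the sketch does not show what the Pact FFI actually does internally.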

adamrodger added the bug label Jul 16, 2023
adamrodger added a commit that referenced this issue Jul 16, 2023
Instead of trying to marshal strings, which don't marshal nicely over
the FFI boundary because C# uses UTF-16 but Rust wants UTF-8,
explicitly convert strings to a UTF-8 `byte[]` and marshal those.

Some places don't need to allow non-ASCII, such as the scheme in URLs,
whereas others are very tricky, such as consumer filters. Supporting
those would change the API to `byte[][]`, which can't be marshalled, so
some parts still don't support non-ASCII for now. If that's a problem in
the future then some custom marshalling could be implemented, but
currently that seems overkill.
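
As a rough sketch of why passing explicit UTF-8 bytes works, the Rust code below shows a native function receiving a NUL-terminated UTF-8 buffer, which is the kind of buffer a caller would pass after converting a string to UTF-8 bytes. The function `demo_char_count` and the helper `to_utf8_with_nul` are hypothetical and are not part of the Pact FFI. The trailing NUL byte is an assumption about how a C-string parameter is terminated.

```rust
use std::ffi::{c_char, CStr};

/// Hypothetical native entry point (not a real Pact FFI function).
/// Returns the number of Unicode characters in `text`, or -1 if the
/// pointer is null or the bytes are not valid UTF-8.
///
/// # Safety
/// `text` must be null or point to a NUL-terminated buffer.
unsafe extern "C" fn demo_char_count(text: *const c_char) -> i32 {
    if text.is_null() {
        return -1;
    }
    // SAFETY: the caller guarantees a NUL-terminated buffer.
    let c_str = unsafe { CStr::from_ptr(text) };
    match c_str.to_str() {
        Ok(s) => s.chars().count() as i32,
        Err(_) => -1,
    }
}

/// Caller side: the string's UTF-8 bytes followed by a NUL terminator,
/// standing in for the byte array a managed caller would marshal.
fn to_utf8_with_nul(s: &str) -> Vec<u8> {
    let mut bytes = s.as_bytes().to_vec();
    bytes.push(0);
    bytes
}

fn main() {
    let buffer = to_utf8_with_nul("łóń");
    // SAFETY: `buffer` is NUL-terminated and outlives the call.
    let count = unsafe { demo_char_count(buffer.as_ptr().cast()) };
    println!("{count}"); // three characters, encoded as six UTF-8 bytes
}
```

Because the bytes are already UTF-8 when they cross the boundary, the native side needs no knowledge of the caller's internal string encoding.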
@adamrodger
Contributor

The linked PR fixes the issues on Windows, but is writing unreadable chars on Linux, so not sure what's happening there.

adamrodger added a commit that referenced this issue Feb 23, 2024
See: https://learn.microsoft.com/en-us/dotnet/standard/native-interop/pinvoke-source-generation

This is only supported on .NET 7+ and so older versions will still use
the old style `extern` support, which is much more difficult to use with
non-ASCII character sets when interacting with Rust via FFI. This means
that older .NET versions will still not support non-ASCII properly.

New style source generation handles marshalling strings as UTF-8
properly and efficiently so that non-ASCII characters can be used. This
fixes #468.