Add conversion between Char and Bytes/BytesView. #1495

Closed

SyoujyoujiNaiki
Contributor

No description provided.

@coveralls
Collaborator

Pull Request Test Coverage Report for Build 4800

Details

  • 2 of 4 (50.0%) changed or added relevant lines in 2 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+0.05%) to 83.171%

| Changes Missing Coverage | Covered Lines | Changed/Added Lines | % |
| char/char.mbt | 0 | 2 | 0.0% |

| Totals | |
| Change from base Build 4797 | 0.05% |
| Covered Lines | 4878 |
| Relevant Lines | 5865 |

💛 - Coveralls

@peter-jerry-ye
Collaborator

The problem with this API is that strings are unlikely to use a UTF-32 (or UCS-4) representation. We should not give people a false impression.

@SyoujyoujiNaiki
Contributor Author

> The problem with this API is that strings are unlikely to use a UTF-32 (or UCS-4) representation. We should not give people a false impression.

The conversion treats a char as a Unicode code point, as the documentation describes. It has nothing to do with UTF-32.

@peter-jerry-ye
Collaborator

peter-jerry-ye commented Jan 17, 2025

Yes, but when would you need to convert a char to or from a string other than to encode it?

Also, storing a char as a Unicode code point inside a Bytes value is UTF-32.
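To illustrate that last point, here is a minimal sketch in Python (not MoonBit, and not the API proposed in this PR). It assumes the conversion writes the code point as a fixed-width 4-byte little-endian integer; the helper name `code_point_to_bytes` is hypothetical. Under that assumption, the resulting bytes are identical to the UTF-32LE encoding of the character.

```python
import struct


def code_point_to_bytes(ch: str) -> bytes:
    """Hypothetical helper: store a char's code point as a 4-byte little-endian integer."""
    return struct.pack("<I", ord(ch))


# The raw code-point bytes match the UTF-32LE encoding for every sample character.
for ch in ["A", "é", "😀"]:
    assert code_point_to_bytes(ch) == ch.encode("utf-32-le")
```

A big-endian layout would correspond to UTF-32BE in the same way, so the choice of byte order does not change the argument.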

@peter-jerry-ye
Collaborator

When we use APIs that convert data to or from a Bytes representation, we are performing serialization or deserialization. The format of numbers is usually consistent across languages and protocols. The format of chars, however, varies. For example, protobuf and msgpack define strings as UTF-8 only. Other protocols rarely use UTF-32. MoonBit's internal representation is also UTF-16.
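As a rough illustration of how the byte format of a char varies by encoding, the Python sketch below (again not MoonBit; `encoded_sizes` is a hypothetical helper) reports how many bytes the same character occupies in UTF-8, UTF-16 and UTF-32.

```python
def encoded_sizes(ch: str) -> dict:
    """Hypothetical helper: byte length of one character in three Unicode encodings."""
    return {enc: len(ch.encode(enc)) for enc in ("utf-8", "utf-16-le", "utf-32-le")}


print(encoded_sizes("A"))   # {'utf-8': 1, 'utf-16-le': 2, 'utf-32-le': 4}
print(encoded_sizes("中"))  # {'utf-8': 3, 'utf-16-le': 2, 'utf-32-le': 4}
print(encoded_sizes("😀"))  # {'utf-8': 4, 'utf-16-le': 4, 'utf-32-le': 4}
```

A reader of the bytes therefore has to know which encoding the writer chose, which is why a char-to-Bytes conversion cannot avoid picking one.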

Also, it would be better for character- and string-related functions to be curated inside the encoding package, which is being developed at x.

In conclusion, I think this API might be misleading. As an alternative, we can add UTF-32 encoding/decoding to the encoding package.

@SyoujyoujiNaiki SyoujyoujiNaiki deleted the char_convertor branch January 17, 2025 06:33
3 participants