Add utf8 support for string literal #127

Open
wants to merge 2 commits into
base: master

hhugo
Contributor

@hhugo hhugo commented Feb 25, 2023

This PR adds support for UTF-8 encoded string literals. It fixes #90.

This only works if we assume the source file is encoded in UTF-8.

Note that OCaml 5.3 has the following entry in its changelog.

  • #11736, #12664: Support utf-8 encoded source files and latin-9 compatible
    identifiers.
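
To illustrate why the encoding assumption matters, here is a minimal sketch (not the code from this PR) that contrasts reading a string literal byte by byte with decoding it as UTF-8. The helper names `code_points_of_utf8` and `bytes_of_string` are hypothetical and chosen only for this example. It assumes OCaml 4.14 or later, where `String.get_utf_8_uchar` is available in the standard library.

```ocaml
(* Illustration only: hypothetical helpers, not part of this PR.
   Requires OCaml >= 4.14 for String.get_utf_8_uchar. *)

(* Decode a UTF-8 encoded string into its Unicode code points.
   Raises Invalid_argument on malformed input. *)
let code_points_of_utf8 (s : string) : int list =
  let len = String.length s in
  let rec loop i acc =
    if i >= len then List.rev acc
    else
      let d = String.get_utf_8_uchar s i in
      if not (Uchar.utf_decode_is_valid d) then
        invalid_arg "code_points_of_utf8: malformed UTF-8"
      else
        let cp = Uchar.to_int (Uchar.utf_decode_uchar d) in
        (* Advance by the number of bytes the decoder consumed. *)
        loop (i + Uchar.utf_decode_length d) (cp :: acc)
  in
  loop 0 []

(* Byte-wise view of the same string, i.e. what you get when no
   encoding is assumed. *)
let bytes_of_string (s : string) : int list =
  List.map Char.code (List.of_seq (String.to_seq s))

(* "\xc3\xa9" is the UTF-8 encoding of U+00E9, written with escapes so
   the example does not depend on this file's own encoding. *)
let e_acute_code_points = code_points_of_utf8 "\xc3\xa9"
let e_acute_bytes = bytes_of_string "\xc3\xa9"
```

Under the UTF-8 assumption the literal is one code point (`0xE9`); without it, the same literal is two unrelated bytes (`0xC3`, `0xA9`). For ASCII-only literals the two views coincide, so the assumption only has an effect when a literal contains non-ASCII characters.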

@hhugo hhugo force-pushed the utf8 branch 3 times, most recently from 3740720 to dc6f14c Compare November 1, 2024 21:44
@hhugo hhugo changed the title POC: Add utf8 support for string literal Add utf8 support for string literal Nov 1, 2024
@pmetzger
Member

pmetzger commented Nov 3, 2024

So the documentation doesn't (yet!) officially say if we can assume source files are UTF-8, but I presume that's imminent.

@toots
Member

toots commented Nov 4, 2024

This is a much-missed feature. Thanks for working on it, and fingers crossed we can make it work soon.

My guess is that if you're not using this feature, we wouldn't need to assume anything about the file's encoding. So this would be opt-in only, correct?

Successfully merging this pull request may close these issues.

Matching a unicode character without codepoint
3 participants