UTF-8 handling? #13

Open
mikebaldry opened this issue Feb 21, 2017 · 0 comments

When I try to pass a UTF-8 charlist, characters such as ł, which is encoded in UTF-8 as the bytes <<197, 130>>, actually go into the charlist as the single codepoint 322.

iex(1)> 'hełło'             
[104, 101, 322, 322, 111]

This causes things to break. Sometimes I see :erlang.iolist_size([322]) fail because 322 is greater than 255; other times the input just fails to match (depending on the current parsing context, I guess).
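
To make the mismatch concrete, here is a small sketch (plain Elixir, not tied to this library) that compares the codepoint view of the string with its UTF-8 byte view. It assumes the source file is saved as UTF-8; the variable names are only illustrative.

```elixir
# A charlist holds Unicode codepoints: one integer per character.
codepoints = String.to_charlist("hełło")

# The same string as raw UTF-8 bytes: one integer per byte.
bytes = :binary.bin_to_list("hełło")

IO.inspect(codepoints, charlists: :as_lists)
IO.inspect(bytes, charlists: :as_lists)

# Byte-oriented functions accept the byte list...
IO.inspect(:erlang.iolist_size(bytes))

# ...but reject a list containing integers above 255.
result =
  try do
    :erlang.iolist_size(codepoints)
  rescue
    ArgumentError -> :badarg
  end

IO.inspect(result)
```

The codepoint list has five elements and the byte list has seven, because each ł takes two bytes in UTF-8.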

Am I doing something wrong? (I'm assuming I am!)

I've currently worked around this very crudely: I step through the bytes and build a normal list of byte values (so I get [197, 130] instead of [322]). Then, when the result comes back from apply, I turn anything in the state that is a string back into a binary by stepping through the list and appending each byte to a <<>>.
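
The workaround described above could be sketched roughly like this, using standard library calls instead of stepping through the list by hand. The module and function names (`Utf8Bytes`, `to_byte_list/1`, `from_byte_list/1`) are hypothetical and are not part of this library; whether the library is meant to accept byte lists at all is also an assumption.

```elixir
defmodule Utf8Bytes do
  # Hypothetical helper: codepoint charlist -> list of UTF-8 bytes.
  # List.to_string/1 encodes the codepoints as a UTF-8 binary,
  # and :binary.bin_to_list/1 splits that binary into byte values.
  def to_byte_list(charlist) do
    charlist
    |> List.to_string()
    |> :binary.bin_to_list()
  end

  # Hypothetical helper: list of UTF-8 bytes -> binary (Elixir string).
  def from_byte_list(bytes) do
    :erlang.list_to_binary(bytes)
  end
end

input = Utf8Bytes.to_byte_list([104, 101, 322, 322, 111])
IO.inspect(input, charlists: :as_lists)

# After processing, byte lists in the result can be turned back into strings.
IO.inspect(Utf8Bytes.from_byte_list(input))
```

Every element of the converted list is in the range 0..255, so byte-oriented functions such as :erlang.iolist_size/1 should accept it.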

Great work on this BTW!
