Raise on character encoding errors #73
Comments
Hey @burlesona, sorry for the late response! It sure does sound like an interesting issue and might be worth solving within the gem, maybe with a flag to trigger it. Can you provide an example document that triggers the problem? Thanks,
xijo added a commit that referenced this issue on Oct 2, 2019
Hello @burlesona, please have a look at the PR and let me know what you think!
xijo added a commit that referenced this issue on Oct 2, 2019
xijo added a commit that referenced this issue on Oct 2, 2019
Handle force_encoding issue, according to #73
I've been using Reverse Markdown and it works great most of the time. I've run into one issue that I thought I'd get your opinion on.
Sometimes the HTML documents I'm converting have character encoding problems, leading to the dreaded `ArgumentError: invalid byte sequence in UTF-8`. In other places I'm fixing this by coercing the lines of a file to UTF-8 as I read them. I've discovered that when you parse a line you can generally just call `force_encoding` on it, and that will convert typographic marks and whatnot pretty well, but occasionally you'll run into issues where it's not enough and you have to be more aggressive, i.e. the following:

I end up using this same code to scrub HTML before I pass it into ReverseMarkdown, but it would probably be more efficient to handle it inside the gem, and it would save other people from this same headache.
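
As an illustration only (this is not the code from the original report, and not what the gem does), a more aggressive fallback of this kind could be sketched as below. It assumes the stray bytes come from Windows-1252 text, and `scrub_to_utf8` is a hypothetical helper name:

```ruby
# Hypothetical helper, not part of ReverseMarkdown: coerce a string to valid UTF-8.
# Assumption: bytes that are not valid UTF-8 were originally Windows-1252.
def scrub_to_utf8(text, fallback: Encoding::WINDOWS_1252)
  # First try the cheap path: relabel the bytes as UTF-8 without changing them.
  utf8 = text.dup.force_encoding(Encoding::UTF_8)
  return utf8 if utf8.valid_encoding?

  # Otherwise reinterpret the raw bytes in the fallback encoding and transcode,
  # replacing anything that still cannot be mapped with "?".
  text.dup
      .force_encoding(fallback)
      .encode(Encoding::UTF_8, invalid: :replace, undef: :replace, replace: "?")
end
```

With a helper like this, the HTML could be cleaned before conversion, for example `ReverseMarkdown.convert(scrub_to_utf8(html))`. The fallback encoding is a guess about the input, so a different source encoding would need a different `fallback:` value.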
Are you interested in handling encoding errors inside the gem? If yes, you can use that code, or I can try to circle back with a PR. If not, no worries, just thought it might be worth considering.
Thanks for a great gem!