" turns into &quot; #2

Closed
raduvarga opened this issue Aug 23, 2018 · 17 comments

@raduvarga

The quote symbol (") in the markdown text gets converted into &quot;.
Is there any way to get around this?

@mpcarolin
Owner

Hi vargaradu, thanks for pointing this out. I will look into it and get back to you.

@mpcarolin
Owner

mpcarolin commented Aug 29, 2018

Okay, so I looked into it. One of this library's dependencies, hickory, automatically escapes HTML entities, even where the escaping is unnecessary or unwanted in hiccup. There is an open issue to let clients configure this: clj-commons/hickory#25

I will see what I can do in the meantime. There were a few possible workarounds mentioned in that thread, so I will play around with them.

@raduvarga
Author

Yes, I suspected it was a dependency that was generating the escaped HTML. If I have time, I'll check some of those workarounds as well.

Thanks for looking into this!

@mpcarolin
Owner

No problem. If you come up with something and feel inclined to get your hands dirty, feel free to make a pull request :)

@raduvarga
Author

raduvarga commented Sep 4, 2018

OK, I've found a workaround (app-side, with Reagent).
It uses JavaScript's own mechanism for unescaping HTML.

Quite hacky, but it does the job:

;; Assumes reagent.core is required as r, jQuery is available as js/$,
;; and docs-render is defined elsewhere in the app.

(defn decode-html
  "Unescapes HTML entities by letting the browser parse them
  inside a detached textarea element."
  [html]
  (let [txt (.createElement js/document "textarea")]
    (set! (.-innerHTML txt) html)
    (.-value txt)))

(defn not-empty-html [html]
  (and html
       (not= html "<div></div>")))

(defn replaced-unescaped-html
  "Replaces the element with the given id by its own inner HTML,
  with the entities decoded."
  [id]
  (let [element (js/$ (str "#" id))
        html    (.html element)]
    (when (not-empty-html html)
      (.replaceWith element (decode-html html)))))

(defn docs-did-mount [id]
  (replaced-unescaped-html id))

(defn docs-did-update [id]
  (replaced-unescaped-html id))

(defn docs-page [filename doc-title]
  (let [id (str "docs-" filename)]
    (r/create-class
     {:component-did-mount  #(docs-did-mount id)
      :component-did-update #(docs-did-update id)
      :display-name         id
      :reagent-render       #(docs-render filename doc-title id)})))

@amarjeet000

I am facing the same issue :)
Does anyone have a workaround that can be applied at the level of this library's functions?

@mpcarolin
Owner

mpcarolin commented Nov 16, 2018

I had to put this off for a while, but I have a solution coming soon (probably in the next few days).

@mpcarolin
Owner

mpcarolin commented Nov 19, 2018

@vargaradu
@amarjeet000

Okay, sorry for the delay, but it's fixed! Please try out version 5.0.1.

If for whatever reason anybody needs to keep the HTML encoding, pass {:encode? true} to the md->hiccup function:

(md->hiccup "#encode me!&" {:encode? true})
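
For comparison, here is a minimal sketch of both calls side by side. It assumes the library is required from the markdown-to-hiccup.core namespace; the example.usage namespace, the alias m, and the sample string are made up for illustration.

```clojure
(ns example.usage
  (:require [markdown-to-hiccup.core :as m]))

;; Hypothetical sample input containing characters that used to be escaped.
(def markdown "Tom & \"Jerry\"")

;; Default call: with the fix, entities should come back decoded.
(def decoded (m/md->hiccup markdown))

;; Opt back in to the HTML-encoded output.
(def encoded (m/md->hiccup markdown {:encode? true}))
```

Comparing decoded and encoded at the REPL should show whether the option behaves as described for your input.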

@amarjeet000

@mpcarolin
Thanks so much for this fix, works great :) Sorry for the late response.

PS:
I saw this warning during compilation:
WARNING: Use of undeclared Var cljs.reader/read-string at line 19 resources/public/js/compiled/out/markdown_to_hiccup/core.cljc
It doesn't cause a problem for me, but I thought I'd mention it here for reference.

@raduvarga
Author

Hi @mpcarolin, the update works well, but there are a few characters which are still escaped in my documents:
&#39;
&ndash;

Sorry I didn't mention the other characters before.
I'm not sure where you could get a list of all of them. I tried to find one myself, but the lists I found were always incomplete.

@mpcarolin
Owner

@vargaradu Thanks for the info. I will patch this tomorrow with every escapable character this time.

@mpcarolin mpcarolin reopened this Nov 29, 2018
@mpcarolin
Owner

Hey @vargaradu, can you perhaps give me some more information about what you are passing to the function and what is returned? The two characters you mentioned map to the apostrophe ' symbol and the – (en dash) symbol, respectively. However, I have tried these and several other HTML entities, and md->hiccup works as expected:

[image]

In my fix for 5.0.1, I only explicitly handle the characters >, <, ", and & because those are the only characters that the Hickory library escapes, using this function here.

I am definitely willing to account for other HTML entities, but I want to zero in on the issue first.
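
To illustrate why handling only those four characters leaves things like &#39; and &ndash; untouched, here is a rough sketch of that kind of targeted decoding. It is not the library's actual code; decode-basic-entities and entity->char are hypothetical names made up for this example.

```clojure
(require '[clojure.string :as str])

;; Hypothetical table covering only the four entities mentioned above.
;; &amp; is deliberately last, so that input such as "&amp;lt;" is
;; decoded once (to "&lt;") rather than twice (to "<").
(def entity->char
  [["&lt;"   "<"]
   ["&gt;"   ">"]
   ["&quot;" "\""]
   ["&amp;"  "&"]])

(defn decode-basic-entities
  "Replaces the four basic HTML entities in s with their characters.
  Any other entity is left as-is."
  [s]
  (reduce (fn [acc [entity ch]] (str/replace acc entity ch))
          s
          entity->char))
```

With a table like this, anything outside the four listed entities passes through unchanged, which would match the symptoms reported above if another part of the pipeline emits additional entities.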

@mpcarolin
Owner

mpcarolin commented Nov 29, 2018

Actually, I reproduced it, but only in ClojureScript. I'll see what I can do.

@raduvarga
Author

Yes, it is happening in ClojureScript. I think what you are looking for is here.
Escaping happens in multiple places, such as escape-code, escaped-chars, dashes, and something in make-heading.

@mpcarolin
Owner

Good catch. The version I'm working on now uses some standard decoding utilities (goog.string/unescapeEntities for CLJS, and Apache StringEscapeUtils for CLJ). I think this might make things a little slower, but at least it will cover everything.
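
Roughly, the idea can be sketched as a single .cljc function with reader conditionals. This is only a sketch under assumptions, not the released code: the example.entities namespace and the decode-entities name are made up, and the CLJ branch assumes the StringEscapeUtils class from Apache Commons Text is on the classpath (the older Commons Lang 3 class of the same name has the same unescapeHtml4 method).

```clojure
(ns example.entities
  #?(:cljs (:require [goog.string :as gstring])
     ;; Assumption: Apache Commons Text is available as a dependency.
     :clj  (:import [org.apache.commons.text StringEscapeUtils])))

(defn decode-entities
  "Decodes HTML entities in s using the platform's standard utility."
  [s]
  #?(:cljs (gstring/unescapeEntities s)
     :clj  (StringEscapeUtils/unescapeHtml4 s)))
```

Delegating to these utilities avoids maintaining a hand-written entity list, at the cost of an extra pass over each string.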

@mpcarolin
Owner

Version 0.6.0 was just released, and it now uses those standard libraries to decode each string. This should also get rid of that annoying warning message you mentioned, @amarjeet000.

[image]

@amarjeet000

Working smoothly now, thanks very much.
