You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
<section class="sect2" title="Figures">
<h3 class="title" id="_figures">Figures</h3>
<p>Coming up for a quick breath of fresh air before descending into another
accessibility attribute pain point, HTML5 introduces the handy new <code
class="literal">figure</code> element for encapsulating content
associated with an image, table, or code example. Grouping related content
elements together, as is becoming an old theme now, makes it simpler for a
reader to navigate and understand your content:</p>
<pre class="screen"><figure>
<img src="images/blob.jpeg" alt="the blob"/>
<figcaption>
Figure 3.7 &#x2014; The blob is digesting Steve McQueen in this
unreleased ending to the classic movie.
</figcaption>
</figure></pre>
<p>Unfortunately, there is little support for these two new elements at this
time, so they get treated as no better than <code class="literal">div</code>
elements. That said, it’s still preferable to future-proof your data and do
the right thing, as support will catch up, especially since the only other
alternative is semantically meaningless <code class="literal">div</code>
elements.</p>
</section>
The problematic is the figure section. When the method decodeEntities is called with the text node, the el.innerHTML = str instruction will actually load and run the html code that was in the figure. Unintentionally trigger http requests and other wort things.
Is it an actual bug or this side effect is expected and we should not work with live document ?
In the meantime I was trying to find a safer alternative and here what I came up with:
decodeEntities(dom: Document, str: string) {
try {
const parser = new DOMParser()
const doc = parser.parseFromString(
`<!DOCTYPE html><text>${str}</text>`,
"text/html",
)
return doc.querySelector("text")?.textContent || str
} catch (err) {
// TODO fall back to simpler decode?
// e.g. regex match for stuff like   and
return str
}
}
It does not crash and seems to be returning the correct node and CFI. I am not super confident regarding performance of using DOMParser vs the same document where we create an new temporary element
The text was updated successfully, but these errors were encountered:
Since it's not the current dom, it does not trigger html resources. However I believe we could initialize it on the constructor and destroy it once the CFI is resolved, that could increase performances.
Additionally, why is this decoding needed to begin with ? Why isn't the string just enough ?
The problematic is the figure section. When the method decodeEntities is called with the text node, the el.innerHTML = str instruction will actually load and run the html code that was in the figure. Unintentionally trigger http requests and other wort things.
Wow nice find!
Is it an actual bug or this side effect is expected and we should not work with live document ?
Yeah this is definitely a bug.
I don't have time to look at this right now but I'll try to squeeze it in some time this week. Do ping me again if I forget to get back to you. Feel free to make a pull request with your proposed solution and thanks for reporting this!
I have an epub with this content:
The problematic is the figure section. When the method
decodeEntities
is called with the text node, theel.innerHTML = str
instruction will actually load and run the html code that was in the figure. Unintentionally trigger http requests and other wort things.Is it an actual bug or this side effect is expected and we should not work with live document ?
In the meantime I was trying to find a safer alternative and here what I came up with:
It does not crash and seems to be returning the correct node and CFI. I am not super confident regarding performance of using DOMParser vs the same document where we create an new temporary element
The text was updated successfully, but these errors were encountered: