Implement tiling of canvas into smaller pieces #6419
Comments
(We might also consider tiling JPEG images when we read the operator list) |
@yurydelendik Can you help me get started on this? |
@prometheansacrifice the first thing will be to modify the API (and canvas.js) at https://github.com/mozilla/pdf.js/blob/master/src/display/api.js#L834 to have e.g. |
TODOs:
|
Is there any progress around this issue or is it not on the future planning anymore? |
@Liestambeur it's still planned but not scheduled; other projects are in progress atm. Contributors who wish to help advance it may find us on the IRC channel. |
I would just like to add that with today's wide desktop displays, I find interactive page-less PDFs a great way to present e.g. LaTeX documents on the web at high quality and 1:1, where plain zoom works well enough and text reflow would only break typography rules or a careful placement of floats. Of course, it won't replace more interactivity, animations or text flow when those are actually needed, but it may have its niche. Yet, when I generated a PDF of 1 x A4 width and 25 x A4 height and rendered it with PDFSinglePageViewer, one of the limits discussed here was probably hit, despite the rendering being only about 100 dpi. The canvas size was about 800 x 20000 and produced a blocky, unusable rendering. I would say that some kind of tiling might extend pdf.js applications beyond those of a viewer of printable PDFs. If anyone is interested, I attach a test PDF of a similar aspect ratio to that discussed. |
I would like to offer a bounty of 2000 USD for this feature. This feature is becoming more important to our company. I see that 16 issues related to this one have been raised. I was wondering if this deserves more attention? I hope this bounty will help get this resolved for everyone interested. |
Is there any plan for this issue? This limitation makes pdf.js unusable in many scenarios, for example when zooming a PDF file or opening a PDF exported by Safari. It also affects lots of applications that depend on pdf.js, such as Logseq. I think this issue should get a higher priority :) |
I would really appreciate a solution to this as I enjoy looking at track maps of railway and subway systems. These are generally large images intended to be viewed at high zoom levels. For example, see the track map of the NY Subway below, which is |
A naive solution would be to simply set the transform of the render function to render the page to a smaller canvas. However, the performance is quite bad, as you can imagine. The render time increases linearly with the number of tiles. Lines 42 to 55 in e67bf68
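For illustration, the naive approach described above can be sketched roughly as follows. This is only a sketch under stated assumptions, not the code from the referenced commit: `computeTiles` and the tile object shape are hypothetical names, and sizes are assumed to be in canvas (device) pixels. Each tile carries a translation matrix that moves its top-left corner to the origin of a small canvas.

```javascript
// Hypothetical helper (not part of pdf.js): split a page of
// pageWidth x pageHeight canvas pixels into tiles of at most
// tileSize pixels per side.
function computeTiles(pageWidth, pageHeight, tileSize) {
  const tiles = [];
  for (let y = 0; y < pageHeight; y += tileSize) {
    for (let x = 0; x < pageWidth; x += tileSize) {
      tiles.push({
        x,
        y,
        // Edge tiles are clipped to the page bounds.
        width: Math.min(tileSize, pageWidth - x),
        height: Math.min(tileSize, pageHeight - y),
        // 2D affine matrix [a, b, c, d, e, f]: a pure translation that
        // shifts the page so this tile starts at the canvas origin.
        transform: [1, 0, 0, 1, -x, -y],
      });
    }
  }
  return tiles;
}

// Example: the 800 x 20000 canvas mentioned earlier in this thread,
// split into tiles of at most 4096 pixels per side.
const tiles = computeTiles(800, 20000, 4096);
console.log(tiles.length, tiles[tiles.length - 1]);
```

Each tile's `transform` would then be passed as the `transform` option of the render call, with the tile canvas sized to `width` x `height`. Since every tile replays the whole page, this is exactly where the linear slowdown comes from.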
|
Even if performance is not ideal, it seems still better than the current status. WDYT @calixteman @Snuffleupagus @timvandermeij? |
Not really, since as already mentioned in #6419 (comment) this will affect performance quite badly in many cases: "The render time increases linearly with the number of tiles." It might not look so bad in the demo above, but that's probably because that particular PDF document isn't all that "complex". Please consider the case where a page (currently) takes 2 seconds to render: If that's split into 10 sub-canvases, that same page now takes 20 seconds to finish rendering! In order for this to work we'd need a way for the |
Other than potentially wasted CPU time, what is the downside if we use the current CSS zoom solution and replace it when the rendering is done? Isn't it still a net improvement? |
Here's the PDF for future reference NYC_full_trackmap.pdf. |
This commit is a first step towards mozilla#6419, and it can also help with mozilla#13287.

To support rendering _part_ of a page, we will need to first compute which ops can affect what is visible in that part of the page. This commit adds logic to track "groups of ops" with their respective bounding boxes. Each group either corresponds to a single op or to a range, and it can have dependencies earlier in the ops list that are not contiguous to the range.

Consider the following example:
```
0. setFillRGBColor
1. beginText
2. showText "Hello"
3. endText
4. constructPath [...]
5. eoFill
```
Here we have two groups: the text (range 1-3) and the path (range 4-5). Each of them has a corresponding bounding box, and a dependency on the op at index 0.

This tracking happens when first rendering a PDF: we wrap the canvas with a "canvas recorder" that has the same API, but with additional methods to mark the start/end of a group.
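To make the idea concrete, here is a minimal sketch of how such groups could be used to pick the ops needed for one tile. It is not the commit's actual data structure: the group shape (`start`, `end`, `bbox`, `deps`), the helper names, and the bounding box values are all assumptions made up for this example.

```javascript
// Assumed group shape (hypothetical):
//   { start, end, bbox: [minX, minY, maxX, maxY], deps: [opIndex, ...] }

// True when two [minX, minY, maxX, maxY] boxes overlap with non-zero area.
function intersects(a, b) {
  return a[0] < b[2] && a[2] > b[0] && a[1] < b[3] && a[3] > b[1];
}

// Collect the indices of all ops that must run to draw `tileBox`:
// every op of each intersecting group, plus that group's dependencies.
function opsForTile(groups, tileBox) {
  const ops = new Set();
  for (const group of groups) {
    if (!intersects(group.bbox, tileBox)) {
      continue; // Nothing from this group is visible in the tile.
    }
    for (const dep of group.deps) {
      ops.add(dep);
    }
    for (let i = group.start; i <= group.end; i++) {
      ops.add(i);
    }
  }
  // Ops must be replayed in their original order.
  return [...ops].sort((a, b) => a - b);
}

// The two groups from the example above, with invented bounding boxes.
const groups = [
  { start: 1, end: 3, bbox: [10, 10, 110, 30], deps: [0] }, // text
  { start: 4, end: 5, bbox: [300, 300, 400, 400], deps: [0] }, // path
];

console.log(opsForTile(groups, [0, 0, 256, 256]));
console.log(opsForTile(groups, [256, 256, 512, 512]));
```

With these invented boxes, a tile covering only the text replays ops 0 to 3 and skips the path entirely, while a tile that touches neither group replays nothing.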
One thing to note is that the time taken to draw outside the canvas is significantly lower than drawing inside the canvas. This measurement is based on drawing 1,000,000 Bézier curves. Therefore, re-issuing the drawing command for every tile might not be as bad as it seems. Here are some benchmark results:
|
I noticed the same in #19128, where rendering the tile is much faster than rendering the whole. For a partial render, the JavaScript code significantly dominates the time spent drawing. |
This will likely be required to fully fix https://bugzilla.mozilla.org/show_bug.cgi?id=1936605. |
The problem is that large canvases consume a lot of memory. This is visible when a PDF page is large (e.g. a map) or zoomed in (e.g. at 800%+ zoom). Currently we limit the canvas size (#4834) on mobile devices. However, a proper solution would be to divide the page into smaller canvases and render only the visible parts.
It's mostly blocked by generating an operator list based on the crop area (useful for zooming heavy maps), but we can proceed without it and try to render the same operator list on several canvases.
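As a rough illustration of "render only the visible parts", the sketch below picks the tiles that overlap the current viewport and compares their memory footprint with a single full-page canvas. The function names, the tile size, and the example dimensions are assumptions for this sketch. The estimate of 4 bytes per pixel assumes an RGBA backing store with 8 bits per channel; real browser overhead may differ.

```javascript
const BYTES_PER_PIXEL = 4; // RGBA, 8 bits per channel (assumed)

// Approximate backing-store size of a canvas, in bytes.
function canvasBytes(width, height) {
  return width * height * BYTES_PER_PIXEL;
}

// Hypothetical helper: return the tiles of a pageWidth x pageHeight page
// that overlap `view` ({ x, y, width, height }, in page pixels).
function visibleTiles(pageWidth, pageHeight, tileSize, view) {
  const cols = Math.ceil(pageWidth / tileSize);
  const rows = Math.ceil(pageHeight / tileSize);
  const firstCol = Math.max(0, Math.floor(view.x / tileSize));
  const firstRow = Math.max(0, Math.floor(view.y / tileSize));
  const lastCol = Math.min(cols, Math.ceil((view.x + view.width) / tileSize)) - 1;
  const lastRow = Math.min(rows, Math.ceil((view.y + view.height) / tileSize)) - 1;

  const tiles = [];
  for (let row = firstRow; row <= lastRow; row++) {
    for (let col = firstCol; col <= lastCol; col++) {
      const x = col * tileSize;
      const y = row * tileSize;
      tiles.push({
        x,
        y,
        width: Math.min(tileSize, pageWidth - x),
        height: Math.min(tileSize, pageHeight - y),
      });
    }
  }
  return tiles;
}

// Example: an 8000 x 8000 pixel page viewed through a 1280 x 720 window.
const view = { x: 1500, y: 2000, width: 1280, height: 720 };
const visible = visibleTiles(8000, 8000, 1024, view);
const tiledBytes = visible.reduce((sum, t) => sum + canvasBytes(t.width, t.height), 0);
console.log(visible.length, tiledBytes, canvasBytes(8000, 8000));
```

Only the tiles returned by `visibleTiles` need a backing canvas at any time, so memory use scales with the viewport rather than with the page size or zoom level.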