Integrate harfbuzz for text shaping in fpdf #696
-
This is a Thai text test for fpdf2 with harfbuzz. The shaping looks good, but some text randomly fails to render.
-
Thank you for your very promising work on this @andersonhc!
-
Am I understanding this correctly, in that it currently only produces single lines of text? It looks like we are already ending up with redundant font data. We'll have to study the output of harfbuzz in more detail to figure out what a full integration could look like.
-
This sounds really great!
Yes, I see how deep this assumption is nested inside
Thanks! I had a look and it's very promising!
import uharfbuzz as hb

buf = hb.Buffer()
... # perform text shaping based on the selected font & the text provided
for info, pos in zip(buf.glyph_infos, buf.glyph_positions):
    # after shaping, info.codepoint holds a glyph ID, not a Unicode code point
    char = chr(self.current_font.subset.pick_by_id(info.codepoint))
    # then use char & pos to insert glyph inside PDF stream
A couple of questions on the code of this method:
I checked the
Overall it looks like a great candidate and I'm fine with introducing it as a dependency of
We already have dependencies that require compiled C code:
If introducing harfbuzz has no impact on performance, I'd be in favor of using it by default. When I started answering here, I initially suggested making it an optional/peer dependency (like we do already for
What do you think about this plan @andersonhc & @gmischler?
Also, it seems like
-
This assumption is currently only present in the
Ligature processing takes two or more chars as input, and produces one or more glyphs as output.
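As a rough illustration of that many-to-many relationship, the sketch below maps each shaping cluster back to the slice of input text it covers. It assumes cluster values are character indices into the input string; `cluster_spans` is a hypothetical helper name and the sample cluster list is made up for illustration, not actual shaper output.

```python
from typing import Dict, List


def cluster_spans(text: str, clusters: List[int]) -> Dict[int, str]:
    """Map each distinct cluster start to the slice of `text` it covers.

    `clusters` holds one cluster value per output glyph. Several glyphs
    may share a cluster, and one cluster may cover several characters.
    """
    starts = sorted(set(clusters))
    spans = {}
    for i, start in enumerate(starts):
        # A cluster ends where the next one starts, or at the end of the text.
        end = starts[i + 1] if i + 1 < len(starts) else len(text)
        spans[start] = text[start:end]
    return spans


# Made-up example: "office" shaped into 4 glyphs, with "ffi" as one ligature.
print(cluster_spans("office", [0, 1, 4, 5]))
```

A mapping like this is one way to keep track of which source characters stand behind each glyph once the one-to-one relationship is gone.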
Given its functionality, it essentially has to provide a superset of fonttools functionality (minus the SVG paths).
Good plan! 😁 @andersonhc, if you need more details about the inner workings of
-
Hi @andersonhc! Are you still working on this promising integration? 😊
-
I am trying to integrate Harfbuzz (https://en.wikipedia.org/wiki/HarfBuzz) with fpdf as a possible solution to our text shaping problems (diacritics, ligatures, kerning, left-to-right vs right-to-left, etc.).
The harfbuzz project is open-source, MIT licensed and actively developed (https://github.com/harfbuzz/harfbuzz), and the uharfbuzz package, which provides Cython bindings, is available on PyPI (https://github.com/harfbuzz/uharfbuzz), so it's straightforward to use.
My first step was building a proof-of-concept version of fpdf with harfbuzz. I ran into several problems because fpdf is built on a "one character = one glyph" concept that is not compatible with properly shaped text.
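To make the mismatch concrete, here is a minimal sketch of shaping a string with uharfbuzz and checking whether the result is still one glyph per character. The shaping calls follow the usage shown in the uharfbuzz README; `ShapedGlyph`, `shape_text` and `is_one_to_one` are hypothetical names for illustration (they are not part of fpdf or of the proof-of-concept branch), and the check assumes cluster values are character indices into the input string.

```python
from typing import List, NamedTuple


class ShapedGlyph(NamedTuple):
    glyph_id: int   # glyph index in the font (info.codepoint after shaping)
    cluster: int    # index of the source character this glyph belongs to
    x_advance: int  # positions are in font units by default
    x_offset: int
    y_offset: int


def shape_text(font_path: str, text: str) -> List[ShapedGlyph]:
    """Shape `text` with the font at `font_path` (requires uharfbuzz)."""
    import uharfbuzz as hb  # third-party, imported lazily

    blob = hb.Blob.from_file_path(font_path)
    font = hb.Font(hb.Face(blob))
    buf = hb.Buffer()
    buf.add_str(text)
    buf.guess_segment_properties()  # script, language, direction
    hb.shape(font, buf, {"kern": True, "liga": True})
    return [
        ShapedGlyph(info.codepoint, info.cluster,
                    pos.x_advance, pos.x_offset, pos.y_offset)
        for info, pos in zip(buf.glyph_infos, buf.glyph_positions)
    ]


def is_one_to_one(text: str, glyphs: List[ShapedGlyph]) -> bool:
    """True only if every character produced exactly one glyph."""
    return sorted(g.cluster for g in glyphs) == list(range(len(text)))
```

For a font with an "fi" ligature, `is_one_to_one("fix", shape_text(path, "fix"))` would be expected to return False, since two characters collapse into a single glyph.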
The proof-of-concept version I built is here: https://github.com/andersonhc/fpdf2/tree/harfbuzz
This version is far from being production ready. I added a harfbuzz-text() function that works like text() but uses the shaper. I am attaching some files comparing text with and without harfbuzz, taken from some of our open issues.
There are a lot of questions we need to discuss before moving forward. Some of them are:
I'd love to hear input from @Lucas-C and @gmischler.
These are the tests I ran. The code I used to generate the files was:
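For a text()-like method to place shaped glyphs, the shaper's positions have to be converted into page units. The sketch below assumes horizontal text and positions expressed in font units (the HarfBuzz default when no scale is set on the font); `glyph_origins` is a hypothetical helper for illustration and is not the code in the proof-of-concept branch.

```python
from typing import List, Tuple


def glyph_origins(
    positions: List[Tuple[int, int, int]],
    units_per_em: int,
    font_size_pt: float,
) -> Tuple[List[Tuple[float, float]], float]:
    """Convert shaped positions to drawing origins in points.

    `positions` holds (x_advance, x_offset, y_offset) per glyph, in font
    units. Returns the origin of each glyph relative to the start of the
    line, plus the total advance width of the run.
    """
    pen_x = 0  # running pen position, in font units
    origins = []
    for x_advance, x_offset, y_offset in positions:
        x = (pen_x + x_offset) * font_size_pt / units_per_em
        y = y_offset * font_size_pt / units_per_em
        origins.append((x, y))
        pen_x += x_advance  # offsets shift the glyph but not the pen
    return origins, pen_x * font_size_pt / units_per_em
```

Note that offsets move only the glyph being drawn, while advances move the pen, which is why diacritics can be positioned over a base glyph without widening the line.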
Testing with Fira Code (plenty of ligatures):
test-firacode.pdf
Testing Hindi text as reported in #365:
test-365-hindi.pdf
Testing Hebrew from #549:
test-549-hebrew.pdf
test-hebrew-549-2.pdf
Testing Tibetan from #679:
test-679-tibetan.pdf
Testing Arabic right-to-left text:
test-arabic-rtl.pdf