Creating a searchable pdf from the rendered notes #6

Rafail-P · 2020-10-24T19:53:20Z

Dear Robert,
I have used Onyx_render quite often on a Windows platform and I am delighted with the results. On this occasion I tried to learn a little Python, as a first step towards a rather ambitious project that I wrote to you about a year ago on the official Onyx forum, in the discussion "Stylus writing on boox note" (user: rafail). The goal is to insert .svg files from other sources into an Onyx Note backup, in order to use the Onyx handwriting recognition engine, which works perfectly offline.
For now, I have managed to get searchable PDF files by adding a few lines to your onyx_render.py code.

I know it's a rudimentary method, but I have used it to search through some fairly long lists written on the Onyx Note device. With the opacity of the recognized word (taken from the HWRDataModel table) set to 0.01, the overlaid text is almost invisible.
I also tried something closer to the structure of PDF files with an OCR text layer, using the Python minipdf module, which builds the PDF file block by block, following the simplest possible PDF structure. To create an invisible layer with the resulting OCR text behind the visible page, the text rendering mode (the Tr parameter) must be set to 3 (invisible).
But I got stuck when I tried to render the Romanian diacritics (ă, î, ș, ț, â), because I had to define subsets of fonts that contain these characters, which is quite difficult with minipdf ...
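
To make the invisible-text idea concrete, here is a minimal sketch that writes a one-page PDF by hand, using only the Python standard library. It is not the minipdf API and not part of onyx_render.py; the function name `build_invisible_text_pdf` and the `(text, x, y, size)` word tuples are hypothetical, chosen only for illustration. It relies on the built-in Helvetica font, so it is limited to ASCII text, which is exactly where the diacritics problem appears: characters such as ă, ș, ț would need an embedded font with its own encoding.

```python
def build_invisible_text_pdf(words, page_width=595, page_height=842):
    """Return the bytes of a one-page PDF whose text uses render mode 3.

    `words` is a list of (text, x, y, size) tuples in PDF points
    (hypothetical layout; real positions would come from the note data).
    """
    # Content stream: "3 Tr" selects text rendering mode 3, which neither
    # fills nor strokes the glyphs, so the text is invisible but searchable.
    ops = ["BT", "3 Tr"]
    for text, x, y, size in words:
        # Escape the characters that are special inside a PDF literal string.
        escaped = (text.replace("\\", "\\\\")
                       .replace("(", "\\(")
                       .replace(")", "\\)"))
        ops.append(f"/F1 {size} Tf")
        ops.append(f"1 0 0 1 {x} {y} Tm")
        ops.append(f"({escaped}) Tj")
    ops.append("ET")
    # ASCII only: non-ASCII text raises UnicodeEncodeError in this sketch.
    stream = "\n".join(ops).encode("ascii")

    page = (
        f"<< /Type /Page /Parent 2 0 R "
        f"/MediaBox [0 0 {page_width} {page_height}] "
        f"/Resources << /Font << /F1 4 0 R >> >> /Contents 5 0 R >>"
    ).encode("ascii")
    objects = [
        b"<< /Type /Catalog /Pages 2 0 R >>",
        b"<< /Type /Pages /Kids [3 0 R] /Count 1 >>",
        page,
        b"<< /Type /Font /Subtype /Type1 /BaseFont /Helvetica >>",
        b"<< /Length %d >>\nstream\n" % len(stream) + stream + b"\nendstream",
    ]

    out = bytearray(b"%PDF-1.4\n")
    offsets = []
    for number, body in enumerate(objects, start=1):
        offsets.append(len(out))  # byte offset of each object for the xref table
        out += b"%d 0 obj\n" % number + body + b"\nendobj\n"

    xref_pos = len(out)
    out += b"xref\n0 %d\n" % (len(objects) + 1)
    out += b"0000000000 65535 f \n"
    for offset in offsets:
        out += b"%010d 00000 n \n" % offset
    out += b"trailer\n<< /Size %d /Root 1 0 R >>\n" % (len(objects) + 1)
    out += b"startxref\n%d\n%%%%EOF\n" % xref_pos
    return bytes(out)
```

Calling `build_invisible_text_pdf([("shopping list", 72, 700, 12)])` and writing the result to a file in binary mode should give a blank-looking page whose text can still be found with the viewer's search. In a real workflow this text would sit on the same page as the rendered strokes rather than on a page of its own.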
Do you have any other ideas about making searchable pdf files?

P.S. I am a beginner in programming, so please excuse my rudimentary solutions.
