-
I am running into this issue with invoices from one specific firm. Across the 4,000 documents processed in my system, no other documents show the problem, but invoices from this particular firm consistently display the same random characters, as previously described.
-
This has the feeling of a bad attempt at an obfuscation layer à la ROT13 within the PDF, though it's not "shift by 13" in this case. Every symbol is replaced by the same other symbol throughout the entire document.
The A is shifted less, possibly because its logical replacement "!" is needed for something else. All digits must have been shifted into the range of unprintable control characters or filtered out entirely.
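To make the "same symbol always maps to the same other symbol" idea concrete, here is a minimal Python sketch. It does not read the PDF at all: it assumes you have already copied a garbled snippet out of the archive file and know the matching text from the original. The sample strings and the function names `learn_mapping`, `decode`, and `control_chars` are made up for illustration, not taken from paperless or OCRmyPDF.

```python
import unicodedata


def learn_mapping(garbled: str, expected: str) -> dict[str, str]:
    """Build a garbled -> expected character table from an aligned sample pair."""
    if len(garbled) != len(expected):
        raise ValueError("samples must have the same length")
    mapping: dict[str, str] = {}
    for g, e in zip(garbled, expected):
        # A consistent substitution never maps one garbled symbol to two targets.
        if mapping.setdefault(g, e) != e:
            raise ValueError(f"inconsistent mapping for {g!r}")
    return mapping


def decode(text: str, mapping: dict[str, str]) -> str:
    """Apply the table; symbols not seen in the sample are left untouched."""
    return "".join(mapping.get(ch, ch) for ch in text)


def control_chars(text: str) -> list[str]:
    """Return control characters (Unicode category Cc) other than newline and tab."""
    return [ch for ch in text
            if unicodedata.category(ch) == "Cc" and ch not in "\n\t"]


# Hypothetical sample: "Rechnung" with every letter replaced consistently.
table = learn_mapping("Uhfkqxqj", "Rechnung")
print(decode("qxq", table))
print(control_chars("Total: \x01\x02\n"))
```

If `learn_mapping` raises on a real snippet, the text is not a simple one-to-one substitution, and a non-empty result from `control_chars` would support the guess that the digits ended up as unprintable characters.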
-
I have a couple of documents (PDFs) that already contain readable text. When I put them into paperless (it uses OCRmyPDF) and the OCR finishes, the text is no longer readable.
This is a screenshot of part of the original PDF:
And this is a screenshot of the same part of the archive file:
My OCR is set to German and English and has worked fine until now. All the other documents are processed correctly; only those 4 files look like this.
Is this a setting I have wrong or is the file itself protected?