4. Steps to reproduce
I downloaded the 64-bit Windows version from here, as described in the official Tesseract wiki → during installation I selected Russian (rus) as an additional language → I installed Tesseract → I added the folder containing tesseract.exe to the user PATH environment variable → I ran the command:
tesseract KiraProcessedTIF.tif KiraSuperhero -l rus pdf
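The steps above can also be scripted, which may help others repeat them exactly. This is a minimal Python sketch, not part of the original report: the helper names `build_command` and `run_ocr` are hypothetical, and it assumes `tesseract` is reachable through PATH and that the `rus` language data is installed.

```python
import shutil
import subprocess
from pathlib import Path


def build_command(image, output_base, lang="rus"):
    """Return the argument list for the command used in this report."""
    # -l selects the language data; the trailing "pdf" asks for PDF output.
    return ["tesseract", str(image), str(output_base), "-l", lang, "pdf"]


def run_ocr(image, output_base, lang="rus"):
    """Run Tesseract and return the path of the PDF it is expected to write."""
    # shutil.which confirms that tesseract can be found through PATH.
    if shutil.which("tesseract") is None:
        raise FileNotFoundError("tesseract was not found on PATH")
    # check=True raises CalledProcessError if Tesseract exits with an error.
    subprocess.run(build_command(image, output_base, lang), check=True)
    # Tesseract appends ".pdf" to the output base name.
    return Path(f"{output_base}.pdf")
```

Calling `run_ocr("KiraProcessedTIF.tif", "KiraSuperhero")` from the folder that contains the image should produce `KiraSuperhero.pdf`, the same file the command above creates.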
5. Expected behavior
In KiraCorrectOCR, text is selected correctly in any program:
6. Actual behavior
In KiraSuperhero, the PDF created by Tesseract, the selection does not cover the full word:
This reproduces for any word in KiraSuperhero.
7. Not helped
I can reproduce the actual behavior for KiraSuperhero in any PDF viewer.
Firefox:
PDF-XChange Editor:
8. Environment
Windows 10 Enterprise LTSB 64-bit EN
D:\SashaDebugging\KiraGoddess>tesseract --version
tesseract v5.0.0-alpha.20190708
leptonica-1.78.0
libgif 5.1.4 : libjpeg 8d (libjpeg-turbo 1.5.3) : libpng 1.6.34 : libtiff 4.0.9 : zlib 1.2.11 : libwebp 0.6.1 : libopenjp2 2.3.0
Found AVX2
Found AVX
Found SSE
Found libarchive 3.3.2 zlib/1.2.11 liblzma/5.2.3 bz2lib/1.0.6 liblz4/1.7.5
Thanks.
1. Possibly related issue
#1712.
2. Summary
PDF viewers select words incorrectly in PDFs created by Tesseract.
3. Data
Example files from my book:
KiraProcessedTIF.tif: TIF image
KiraSuperhero.pdf: PDF created by Tesseract
KiraCorrectOCR.pdf: PDF with correct OCR, for comparison