Estimating resolution as 303
!w_it.cycled_list():Error:Assert failed:in file src/ccstruct/pageres.cpp, line 1502
Aborted (core dumped)
This is reproducible with the following sequence of commands (output is clipped for brevity until the end), which start a clean Ubuntu 24.04 Docker container, update existing packages, and install tesseract-ocr (for command-line usage) plus the two language packages in question, tesseract-ocr-ara and tesseract-ocr-chi-tra. The test image is the same one used in #4148; wget is used to download it. It is also attached to this ticket below.
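A minimal sketch of such a sequence, not the reporter's exact commands. `IMAGE_URL` is a placeholder for the link to the test image in this ticket, the `container_steps` helper name is made up for illustration, and the `apt-get` invocations are assumed from the package names above. The tesseract arguments are copied from the gdb `run` line below.

```shell
#!/bin/sh
# Sketch only: assumed reproduction steps, not the reporter's original listing.
# IMAGE_URL is a placeholder; substitute the link to the test image from this ticket.
IMAGE_URL="${IMAGE_URL:-REPLACE_WITH_TEST_IMAGE_URL}"

# Print the commands to run inside the container, one per line.
container_steps() {
    cat <<EOF
apt-get update
apt-get -y upgrade
apt-get -y install tesseract-ocr tesseract-ocr-ara tesseract-ocr-chi-tra wget
wget -O sample_013741.jpg "$IMAGE_URL"
tesseract sample_013741.jpg -- -l ara+chi_tra
EOF
}

# Uncomment to run the steps in a throwaway container (requires Docker):
# docker run --rm -e DEBIAN_FRONTEND=noninteractive ubuntu:24.04 sh -exc "$(container_steps)"
```

Printing the steps first makes it easy to review them before anything is installed; with `sh -e`, the container exits as soon as a step fails, so an abort in the last step surfaces as a non-zero exit status.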
GNU gdb (Ubuntu 15.0.50.20240403-0ubuntu1) 15.0.50.20240403-git
Copyright (C) 2024 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from tesseract...
warning: could not find '.gnu_debugaltlink' file for /usr/bin/tesseract
(No debugging symbols found in tesseract)
(gdb) run sample_013741.jpg -- -l ara+chi_tra
Starting program: /usr/bin/tesseract sample_013741.jpg -- -l ara+chi_tra
warning: Error disabling address space randomization: Operation not permitted
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
warning: could not find '.gnu_debugaltlink' file for /lib/x86_64-linux-gnu/liblber.so.2
warning: could not find '.gnu_debugaltlink' file for /lib/x86_64-linux-gnu/libbrotlidec.so.1
warning: could not find '.gnu_debugaltlink' file for /lib/x86_64-linux-gnu/libbrotlicommon.so.1
Estimating resolution as 303
[New Thread 0x71f6f001f6c0 (LWP 4281)]
[New Thread 0x71f6ef81e6c0 (LWP 4282)]
[New Thread 0x71f6ef01d6c0 (LWP 4283)]
!w_it.cycled_list():Error:Assert failed:in file src/ccstruct/pageres.cpp, line 1502
Thread 1 "tesseract" received signal SIGABRT, Aborted.
__pthread_kill_implementation (no_tid=0, signo=6, threadid=<optimized out>) at ./nptl/pthread_kill.c:44
warning: 44 ./nptl/pthread_kill.c: No such file or directory
This also occurred when using the latest package from the daily-dev ppa here, version included below.
The test image is included here again for reference.
Expected Behavior
As in #4148 and #4146, the expectation is that this combination of languages and image does not cause a SIGABRT.
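One way to check this expectation from a script is to look at the exit status. This is a minimal sketch that assumes a shell such as bash or dash, where a process killed by signal N is reported as status 128+N, and a Linux system, where SIGABRT is signal 6. The `died_with_sigabrt` helper name is made up for illustration.

```shell
#!/bin/sh
# Sketch only: run a command and report whether it was killed by SIGABRT.
# Assumes the shell reports death by signal N as exit status 128+N (bash, dash)
# and that SIGABRT is signal 6 (Linux), giving 134.
died_with_sigabrt() {
    "$@" >/dev/null 2>&1
    [ "$?" -eq 134 ]
}

# Example (requires tesseract and the test image in the current directory):
# if died_with_sigabrt tesseract sample_013741.jpg -- -l ara+chi_tra; then
#     echo "still aborts"
# fi
```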
Suggested Fix
No known suggested fixes at this time.
tesseract -v
Current Ubuntu 24.04 tesseract-ocr package:
leptonica-1.82.0
libgif 5.2.1 : libjpeg 8d (libjpeg-turbo 2.1.5) : libpng 1.6.43 : libtiff 4.5.1 : zlib 1.3 : libwebp 1.3.2 : libopenjp2 2.5.0
Found AVX2
Found AVX
Found FMA
Found SSE4.1
Found OpenMP 201511
Found libarchive 3.7.2 zlib/1.3 liblzma/5.4.5 bz2lib/1.0.8 liblz4/1.9.4 libzstd/1.5.5
Found libcurl/8.5.0 OpenSSL/3.0.13 zlib/1.3 brotli/1.1.0 zstd/1.5.5 libidn2/2.3.7 libpsl/0.21.2 (+libidn2/2.3.7) libssh/0.10.6/openssl/zlib nghttp2/1.59.0 librtmp/2.3 OpenLDAP/2.6.7
$ docker --version
Docker version 26.1.4, build 5650f9b
Other Information
I opened this new ticket, even though it is closely related to #4146 and #4148, because the crash is entirely reproducible with the latest Ubuntu packages for both the command-line tesseract and the languages used. This implies that although the image may contain no discernible text for the OCR process to work on, it still causes a SIGABRT with a "standard configuration".
In addition, this is the version of liblept5 that is installed in a clean Ubuntu 24.04 container when installing tesseract-ocr (there is no newer version of liblept5 available from the above tesseract daily-dev ppa):
Current Behavior
When running this command line:
The following occurs:
tesseract -v
Current Ubuntu 24.04 tesseract-ocr package:
From the current latest package available from this daily-dev ppa:
Compiler
CPU
Virtualization / Containers