[Bug]: Cleanup Scans / OCR error (ref #1329) #1926

Open
1 task done
marcofenoglio opened this issue Sep 17, 2024 · 3 comments
Comments

marcofenoglio commented Sep 17, 2024

Installation Method

Docker

The Problem

In reference to #1329

I tried with the new version 0.29.0 and I still get the same error:
Output file location (/tmp/output_16848536940110331072.pdf) is not a writable file.

Version of Stirling-PDF

0.29.0

Last Working Version of Stirling-PDF

No response

Page Where the Problem Occurred

https://mydomain/pdf/ocr-pdf

Docker Configuration

docker run -d \
  --name stirling-pdf \
  -p 9284:8080 \
  -e SYSTEM_ROOTURIPATH=/pdf \
  -e DOCKER_ENABLE_SECURITY=false \
  -e INSTALL_BOOK_AND_ADVANCED_HTML_OPS=false \
  -e LANGS=it_IT,en_GB \
  -v /mnt/data/docker_data/stirling-pdf/trainingData:/usr/share/tessdata \
  -v /mnt/data/docker_data/stirling-pdf/extraConfigs:/configs \
  -v /mnt/data/docker_data/stirling-pdf/logs:/logs \
  --restart unless-stopped \
  frooodle/s-pdf:0.29.0

Relevant Log Output

12:06:39.569 [qtp2050525584-35] INFO  s.s.SPDF.utils.ProcessExecutor - Running command: ocrmypdf --verbose 2 --output-type pdf --pdf-renderer hocr --deskew --skip-text --language eng /tmp/input_12592559430707309562.pdf /tmp/output_16848536940110331072.pdf
12:06:40.226 [Thread-23] INFO  s.s.SPDF.utils.ProcessExecutor -   DEBUG ocrmypdf - ocrmypdf 16.1.1
12:06:40.226 [Thread-23] INFO  s.s.SPDF.utils.ProcessExecutor -   DEBUG ocrmypdf.subprocess - Running: ['tesseract', '--version']
12:06:40.234 [Thread-23] INFO  s.s.SPDF.utils.ProcessExecutor -   DEBUG ocrmypdf.subprocess - Found tesseract 5.3.4
12:06:40.235 [Thread-23] INFO  s.s.SPDF.utils.ProcessExecutor -   DEBUG ocrmypdf.subprocess - Running: ['tesseract', '--version']
12:06:40.243 [Thread-23] INFO  s.s.SPDF.utils.ProcessExecutor -   DEBUG ocrmypdf.subprocess - Running: ['gs', '--version']
12:06:40.258 [Thread-23] INFO  s.s.SPDF.utils.ProcessExecutor -   DEBUG ocrmypdf.subprocess - Found gs 10.3.1
12:06:40.258 [Thread-23] INFO  s.s.SPDF.utils.ProcessExecutor -   DEBUG ocrmypdf.subprocess - Running: ['gs', '--version']
12:06:40.274 [Thread-23] INFO  s.s.SPDF.utils.ProcessExecutor -   DEBUG ocrmypdf.subprocess - Running: ['tesseract', '--list-langs']
12:06:40.286 [Thread-23] INFO  s.s.SPDF.utils.ProcessExecutor -   DEBUG ocrmypdf.subprocess.tesseract - stdout/stderr = [DS] Profile read from file (tesseract_opencl_profile_devices.dat).
12:06:40.287 [Thread-23] INFO  s.s.SPDF.utils.ProcessExecutor - [DS] Device[1] 0:(null) score is 0.390008
12:06:40.287 [Thread-23] INFO  s.s.SPDF.utils.ProcessExecutor - [DS] Selected Device[1]: "(null)" (Native)
12:06:40.288 [Thread-23] INFO  s.s.SPDF.utils.ProcessExecutor - List of available languages in "/usr/share/tessdata/" (1):
12:06:40.288 [Thread-23] INFO  s.s.SPDF.utils.ProcessExecutor - eng
12:06:40.289 [Thread-23] INFO  s.s.SPDF.utils.ProcessExecutor -
12:06:40.289 [Thread-23] INFO  s.s.SPDF.utils.ProcessExecutor -   DEBUG ocrmypdf.helpers - pikepdf mmap enabled
12:06:40.291 [Thread-23] INFO  s.s.SPDF.utils.ProcessExecutor -   ERROR ocrmypdf._pipelines._common - ExitCodeException
12:06:40.292 [Thread-23] INFO  s.s.SPDF.utils.ProcessExecutor - Traceback (most recent call last):
12:06:40.292 [Thread-23] INFO  s.s.SPDF.utils.ProcessExecutor -   File "/usr/lib/python3.12/site-packages/ocrmypdf/_pipelines/_common.py", line 249, in cli_exception_handler
12:06:40.293 [Thread-23] INFO  s.s.SPDF.utils.ProcessExecutor -     return fn(options, plugin_manager)
12:06:40.294 [Thread-23] INFO  s.s.SPDF.utils.ProcessExecutor -            ^^^^^^^^^^^^^^^^^^^^^^^^^^^
12:06:40.294 [Thread-23] INFO  s.s.SPDF.utils.ProcessExecutor -   File "/usr/lib/python3.12/site-packages/ocrmypdf/_pipelines/ocr.py", line 166, in _run_pipeline
12:06:40.295 [Thread-23] INFO  s.s.SPDF.utils.ProcessExecutor -     check_requested_output_file(options)
12:06:40.295 [Thread-23] INFO  s.s.SPDF.utils.ProcessExecutor -   File "/usr/lib/python3.12/site-packages/ocrmypdf/_validation.py", line 310, in check_requested_output_file
12:06:40.295 [Thread-23] INFO  s.s.SPDF.utils.ProcessExecutor -     raise OutputFileAccessError(
12:06:40.296 [Thread-23] INFO  s.s.SPDF.utils.ProcessExecutor - ocrmypdf.exceptions.OutputFileAccessError: Output file location (/tmp/output_16848536940110331072.pdf) is not a writable file.
12:06:40.419 [qtp2050525584-35] WARN  o.e.j.ee10.servlet.ServletChannel - handleException /pdf/api/v1/misc/ocr-pdf java.io.IOException: Command process failed with exit code 5. Error message:   DEBUG ocrmypdf - ocrmypdf 16.1.1
  DEBUG ocrmypdf.subprocess - Running: ['tesseract', '--version']
  DEBUG ocrmypdf.subprocess - Found tesseract 5.3.4
  DEBUG ocrmypdf.subprocess - Running: ['tesseract', '--version']
  DEBUG ocrmypdf.subprocess - Running: ['gs', '--version']
  DEBUG ocrmypdf.subprocess - Found gs 10.3.1
  DEBUG ocrmypdf.subprocess - Running: ['gs', '--version']
  DEBUG ocrmypdf.subprocess - Running: ['tesseract', '--list-langs']
  DEBUG ocrmypdf.subprocess.tesseract - stdout/stderr = [DS] Profile read from file (tesseract_opencl_profile_devices.dat).
[DS] Device[1] 0:(null) score is 0.390008
[DS] Selected Device[1]: "(null)" (Native)
List of available languages in "/usr/share/tessdata/" (1):
eng

  DEBUG ocrmypdf.helpers - pikepdf mmap enabled
  ERROR ocrmypdf._pipelines._common - ExitCodeException
Traceback (most recent call last):
  File "/usr/lib/python3.12/site-packages/ocrmypdf/_pipelines/_common.py", line 249, in cli_exception_handler
    return fn(options, plugin_manager)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/site-packages/ocrmypdf/_pipelines/ocr.py", line 166, in _run_pipeline
    check_requested_output_file(options)
  File "/usr/lib/python3.12/site-packages/ocrmypdf/_validation.py", line 310, in check_requested_output_file
    raise OutputFileAccessError(
ocrmypdf.exceptions.OutputFileAccessError: Output file location (/tmp/output_16848536940110331072.pdf) is not a writable file.
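The traceback above shows the failure being raised in a validation step (check_requested_output_file) before any OCR work starts. A rough way to tell a real permission problem from a failing check is to compare an access-style test with an actual write in the same directory. The sketch below assumes the validation relies on an access-style check, which the log does not confirm; the function name probe_writable and the script name probe.sh are hypothetical.

```shell
#!/bin/sh
# probe_writable DIR: compare what an access-style check reports (`test -w`)
# with what happens when a file is actually created in DIR.
probe_writable() {
  dir=$1

  # Access-style check, similar in spirit to asking "is this path writable?"
  if [ -w "$dir" ]; then access=yes; else access=no; fi

  # Real write attempt; the subshell keeps a failed redirection from
  # aborting the script.
  probe="$dir/probe_$$.tmp"
  if ( : > "$probe" ) 2>/dev/null; then
    actual=yes
    rm -f "$probe"
  else
    actual=no
  fi

  echo "access-check=$access actual-write=$actual"
}

# Default to /tmp, the directory named in the error message.
probe_writable "${1:-/tmp}"
```

Saved as probe.sh, it could be fed to the running container with `docker exec -i stirling-pdf sh -s /tmp < probe.sh`, assuming the image provides `sh`. If the two results disagree, the directory itself is probably fine and the check is what is being denied.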

Additional Information

No response

Browsers Affected

No response

No Duplicate of the Issue

  • I have verified that there are no existing issues raised related to my problem.

@Frooodle (Member)

Your ticket lists the installation method as "None". Without it, we can't help or debug.

@marcofenoglio (Author)

I updated the ticket! I use Docker.

@marcofenoglio (Author)

When I run the container without the default seccomp profile, using --security-opt seccomp=unconfined, the error no longer occurs.
How can I find out which syscall is causing the problem?
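
For anyone wanting to reproduce the workaround, here is a sketch of the original docker run command with the one extra flag added. Everything except `--security-opt seccomp=unconfined` is copied from the Docker Configuration section; the wrapper function stirling_run and the DRY_RUN variable are hypothetical and only there so the command can be printed before it is executed.

```shell
#!/bin/sh
# Sketch: the reporter's `docker run`, plus the seccomp workaround flag.
# DRY_RUN defaults to 1, so the command is only printed.
# Set DRY_RUN=0 to actually start the container.
stirling_run() {
  set -- docker run -d \
    --name stirling-pdf \
    --security-opt seccomp=unconfined \
    -p 9284:8080 \
    -e SYSTEM_ROOTURIPATH=/pdf \
    -e DOCKER_ENABLE_SECURITY=false \
    -e INSTALL_BOOK_AND_ADVANCED_HTML_OPS=false \
    -e LANGS=it_IT,en_GB \
    -v /mnt/data/docker_data/stirling-pdf/trainingData:/usr/share/tessdata \
    -v /mnt/data/docker_data/stirling-pdf/extraConfigs:/configs \
    -v /mnt/data/docker_data/stirling-pdf/logs:/logs \
    --restart unless-stopped \
    frooodle/s-pdf:0.29.0

  if [ "${DRY_RUN:-1}" = "1" ]; then
    # Print the assembled command instead of running it.
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

stirling_run
```

Note that `seccomp=unconfined` turns off syscall filtering for the container entirely, so it is better suited to narrowing down the cause than to a permanent setup.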
