fix(upload-files): handle server error, document how to circumvent resource leak (DEV-2528, DEV-2527) #464

Merged
merged 5 commits into from
Aug 11, 2023

jnussbaum
Collaborator

No description provided.

@jnussbaum jnussbaum self-assigned this Aug 11, 2023
@linear

linear bot commented Aug 11, 2023

DEV-2528 upload-files: handle server error

If a file cannot be uploaded due to a server error, the user has no way to retry: there is currently no mechanism for uploading a failed file to the server at a later time.

DSP-TOOLS must handle this, either by retrying or by providing a mechanism that lets the user upload the failed files at a later time.
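
The two options named above (retry, or record the failures for a later run) could be combined roughly as follows. This is a minimal sketch and not necessarily what this PR implements: `upload_with_retry`, `save_failed`, and the `upload_one` callback are hypothetical names, and the callback stands in for whatever function actually sends a file to the server.

```python
import time
from pathlib import Path
from typing import Callable, Iterable


def upload_with_retry(
    paths: Iterable[Path],
    upload_one: Callable[[Path], None],  # hypothetical: raises an exception on failure
    max_attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> list[Path]:
    """Try each upload up to max_attempts times; return the paths that still failed."""
    failed: list[Path] = []
    for path in paths:
        for attempt in range(1, max_attempts + 1):
            try:
                upload_one(path)
                break  # success, go on to the next file
            except Exception:
                if attempt == max_attempts:
                    failed.append(path)  # give up on this file, keep it for later
                else:
                    # exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
                    sleep(base_delay * 2 ** (attempt - 1))
    return failed


def save_failed(failed: list[Path], report: Path) -> None:
    """Write the failed paths to a text file, one per line, for a later retry run."""
    report.write_text("\n".join(str(p) for p in failed), encoding="utf-8")
```

A later run could read the report file back in and pass only those paths to `upload_with_retry` again.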

See the attached error message:

error-msg.txt

Contributor

@Vijeinath Vijeinath left a comment

LGTM

@jnussbaum jnussbaum changed the title from "fix(upload-files): handle server error (DEV-2528)" to "fix(upload-files): handle server error, document how to circumvent resource leak (DEV-2528, DEV-2527)" Aug 11, 2023
@linear

linear bot commented Aug 11, 2023

DEV-2527 Vij's Mac Pro becomes very slow after 6 days of upload-files

Problem description

Vij's super powerful Mac Pro (28 double core) becomes very slow and unresponsive after 6 days of uploading files to the dev-02 server with dsp-tools upload-files.

(This command calls SIPI's /upload_without_processing route, in parallel. The code is here: https://github.com/dasch-swiss/dsp-tools/blob/main/src/dsp_tools/fast_xmlupload/upload_files.py#L325).

Datadog shows that the dev-02 server is okay.

DSP-TOOLS' log files show that there are breaks of several hours during which DSP-TOOLS does nothing:

2023-08-09 23:13:21,054 upload_files.py      INFO     Found the following upload candidates for /Volumes/Thunderbay-1/tmp-real-sgv/85/43/85438b37-c240-490a-9b7f-11885ac9f09c.jp2: 
 - /Volumes/Thunderbay-1/tmp-real-sgv/85/43/85438b37-c240-490a-9b7f-11885ac9f09c.info
 - /Volumes/Thunderbay-1/tmp-real-sgv/85/43/85438b37-c240-490a-9b7f-11885ac9f09c.jp2
 - /Volumes/Thunderbay-1/tmp-real-sgv/85/43/85438b37-c240-490a-9b7f-11885ac9f09c.tif.orig
2023-08-09 23:13:21,055 upload_files.py      INFO     Upload candidates for /Volumes/Thunderbay-1/tmp-real-sgv/85/43/85438b37-c240-490a-9b7f-11885ac9f09c.jp2 are okay.
2023-08-09 23:13:21,101 upload_files.py      INFO     Successfully uploaded file /Volumes/Thunderbay-1/tmp-real-sgv/85/43/85438b37-c240-490a-9b7f-11885ac9f09c.info
2023-08-09 23:13:23,970 upload_files.py      INFO     Successfully uploaded file /Volumes/Thunderbay-1/tmp-real-sgv/85/43/85438b37-c240-490a-9b7f-11885ac9f09c.jp2
2023-08-09 23:13:27,906 upload_files.py      INFO     Successfully uploaded file /Volumes/Thunderbay-1/tmp-real-sgv/85/43/85438b37-c240-490a-9b7f-11885ac9f09c.tif.orig
2023-08-09 23:13:27,924 upload_files.py      INFO     Successfully uploaded all files for /Volumes/Thunderbay-1/tmp-real-sgv/85/43/85438b37-c240-490a-9b7f-11885ac9f09c.jp2.
2023-08-10 03:15:48,732 upload_files.py      INFO     Found the following upload candidates for /Volumes/Thunderbay-1/tmp-real-sgv/38/23/382349aa-5a8e-45bb-a13e-2d958eafd0be.jp2: 
 - /Volumes/Thunderbay-1/tmp-real-sgv/38/23/382349aa-5a8e-45bb-a13e-2d958eafd0be.info
 - /Volumes/Thunderbay-1/tmp-real-sgv/38/23/382349aa-5a8e-45bb-a13e-2d958eafd0be.tif.orig
 - /Volumes/Thunderbay-1/tmp-real-sgv/38/23/382349aa-5a8e-45bb-a13e-2d958eafd0be.jp2
2023-08-10 03:15:48,734 upload_files.py      INFO     Upload candidates for /Volumes/Thunderbay-1/tmp-real-sgv/38/23/382349aa-5a8e-45bb-a13e-2d958eafd0be.jp2 are okay.
2023-08-10 03:15:51,121 upload_files.py      INFO     Successfully uploaded file /Volumes/Thunderbay-1/tmp-real-sgv/38/23/382349aa-5a8e-45bb-a13e-2d958eafd0be.info
2023-08-10 03:15:57,714 upload_files.py      INFO     Successfully uploaded file /Volumes/Thunderbay-1/tmp-real-sgv/38/23/382349aa-5a8e-45bb-a13e-2d958eafd0be.tif.orig
2023-08-10 03:16:00,474 upload_files.py      INFO     Successfully uploaded file /Volumes/Thunderbay-1/tmp-real-sgv/38/23/382349aa-5a8e-45bb-a13e-2d958eafd0be.jp2
2023-08-10 03:16:00,475 upload_files.py      INFO     Successfully uploaded all files for /Volumes/Thunderbay-1/tmp-real-sgv/38/23/382349aa-5a8e-45bb-a13e-2d958eafd0be.jp2.
2023-08-10 07:33:14,757 upload_files.py      INFO     Found the following upload candidates for /Volumes/Thunderbay-1/tmp-real-sgv/36/34/363448d7-7160-42f8-bddf-b714497de1f9.jp2: 
 - /Volumes/Thunderbay-1/tmp-real-sgv/36/34/363448d7-7160-42f8-bddf-b714497de1f9.info
 - /Volumes/Thunderbay-1/tmp-real-sgv/36/34/363448d7-7160-42f8-bddf-b714497de1f9.jp2
 - /Volumes/Thunderbay-1/tmp-real-sgv/36/34/363448d7-7160-42f8-bddf-b714497de1f9.tif.orig
2023-08-10 07:33:14,759 upload_files.py      INFO     Upload candidates for /Volumes/Thunderbay-1/tmp-real-sgv/36/34/363448d7-7160-42f8-bddf-b714497de1f9.jp2 are okay.
2023-08-10 07:33:17,135 upload_files.py      INFO     Successfully uploaded file /Volumes/Thunderbay-1/tmp-real-sgv/36/34/363448d7-7160-42f8-bddf-b714497de1f9.info
2023-08-10 07:33:23,433 upload_files.py      INFO     Successfully uploaded file /Volumes/Thunderbay-1/tmp-real-sgv/36/34/363448d7-7160-42f8-bddf-b714497de1f9.jp2
2023-08-10 07:33:32,106 upload_files.py      INFO     Successfully uploaded file /Volumes/Thunderbay-1/tmp-real-sgv/36/34/363448d7-7160-42f8-bddf-b714497de1f9.tif.orig
2023-08-10 07:33:32,128 upload_files.py      INFO     Successfully uploaded all files for /Volumes/Thunderbay-1/tmp-real-sgv/36/34/363448d7-7160-42f8-bddf-b714497de1f9.jp2.
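
Breaks like these can be located automatically by comparing consecutive timestamps. The following is a minimal sketch, assuming every log record starts with a timestamp in the format shown above (`YYYY-MM-DD HH:MM:SS,mmm`); `find_gaps` and the one-hour default threshold are hypothetical choices for illustration.

```python
import re
from datetime import datetime, timedelta

# Matches the leading timestamp of a log record, e.g. "2023-08-09 23:13:21,054 "
TIMESTAMP = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),(\d{3}) ")


def find_gaps(lines, threshold=timedelta(hours=1)):
    """Return (previous, current) timestamp pairs that are at least `threshold` apart."""
    gaps = []
    previous = None
    for line in lines:
        match = TIMESTAMP.match(line)
        if match is None:
            continue  # continuation lines such as " - /path/to/file" carry no timestamp
        current = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
        current += timedelta(milliseconds=int(match.group(2)))
        if previous is not None and current - previous >= threshold:
            gaps.append((previous, current))
        previous = current
    return gaps
```

Applied to the excerpt above, this would report two breaks of roughly four hours each (23:13 to 03:15 and 03:16 to 07:33).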

The Mac's Activity Monitor shows that the Python process uses a normal amount of CPU, but 4 GB of Real Memory Size and 40 GB of Virtual Memory Size.

The PyCharm process, from where the dsp-tools upload-files command was started, has normal values.

However, a process called Virtual Machine Service (from Docker) was using ca. 16'000 threads, even though Docker is not involved in dsp-tools upload-files. Killing all Docker processes did not help.

Analysis

There appears to be a resource leak whose source we have not been able to find.

Solution

Instead of uploading all files at once, it would be better to upload them in batches. The previous step (dsp-tools process-files) already works this way and writes its results to several pickle files. So instead of combining all pickle files and uploading their contents in one run, upload only the files listed in one pickle file, then quit Python.

That means that the command dsp-tools upload-files must be called several times (similar to the process-files command).
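
Calling the command repeatedly can be scripted with a small driver that starts one fresh process per pickle file, so that any leaked resources are released when each process exits. This is a minimal sketch: `run_batches`, `build_command`, and `demo_command` are hypothetical names, and the actual dsp-tools upload-files arguments are deliberately not spelled out here; supply them through `build_command` for your setup.

```python
import subprocess
import sys
from pathlib import Path
from typing import Callable, Sequence


def run_batches(
    pickle_files: Sequence[Path],
    build_command: Callable[[Path], list[str]],  # hypothetical: returns the command for one batch
) -> list[Path]:
    """Run one fresh process per pickle file; return the pickle files whose run failed."""
    failed: list[Path] = []
    for pkl in sorted(pickle_files):
        # Each batch runs in its own process, which ends before the next one starts.
        result = subprocess.run(build_command(pkl), check=False)
        if result.returncode != 0:
            failed.append(pkl)
    return failed


def demo_command(pkl: Path) -> list[str]:
    """Placeholder command that only prints the pickle file name from a new Python process.

    Replace this with the real dsp-tools upload-files invocation.
    """
    return [sys.executable, "-c", "import sys; print('would upload', sys.argv[1])", str(pkl)]
```

The returned list makes it possible to rerun only the batches that exited with a non-zero status.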

logging.log

Screenshots.zip

@jnussbaum jnussbaum merged commit 7aa7106 into main Aug 11, 2023
4 checks passed
@jnussbaum jnussbaum deleted the wip/dev-2528-handle-server-error branch August 11, 2023 10:40
@daschbot daschbot mentioned this pull request Aug 11, 2023

2 participants