
set workers based on recommended number of cores #1170

Merged
merged 3 commits into release/v2.0-beta-3 from large-file-stall-main-thread on Aug 27, 2024

Conversation

longshuicy
Member

@longshuicy longshuicy commented Aug 20, 2024

Update:
To avoid the "Namespace already exists" error, I turned off the recreate_views flag, which prevents Beanie from attempting to recreate views that already exist. This seems to work.
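For reference, a minimal sketch of how that flag is passed to Beanie's init_beanie; this is a configuration sketch only, not this project's actual init code, and the database name and (empty) model list are hypothetical placeholders:

```python
# Configuration sketch: disabling view recreation in Beanie.
# Database name and model list below are hypothetical placeholders.
from beanie import init_beanie
from motor.motor_asyncio import AsyncIOMotorClient


async def init_db():
    client = AsyncIOMotorClient("mongodb://localhost:27017")
    await init_beanie(
        database=client.app_db,   # hypothetical database name
        document_models=[],       # the app's Document/View models go here
        recreate_views=False,     # skip re-creating views that already exist
    )
```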


  1. I moved all the synchronous operations into a thread pool using the run_in_threadpool method.

    • This approach allows the synchronous tasks to be executed in a separate thread, preventing them from blocking the main asynchronous event loop.
    • You can read more about this approach here.
  2. More importantly, I realized we were not utilizing the maximum number of workers.

    • Uvicorn has built-in process management capabilities that can leverage multiple processes for better concurrency. You can read more about using workers here.
    • To set the number of workers, you can use the --workers {num_of_workers} option in Uvicorn.
    • Based on this recommendation, I configured 17 workers.
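The offloading in step 1 can be sketched with the standard library's asyncio.to_thread, which does essentially what Starlette's run_in_threadpool helper does; the build_zip function and its path are hypothetical stand-ins for the real ZIP-assembly code:

```python
import asyncio
import time


def build_zip(dataset_id: str) -> str:
    # Blocking work: a hypothetical stand-in for assembling the dataset ZIP.
    time.sleep(0.1)
    return f"/tmp/{dataset_id}.zip"


async def download(dataset_id: str) -> str:
    # Offload the blocking call to a worker thread so the event loop
    # stays free to serve other requests while the ZIP is prepared.
    return await asyncio.to_thread(build_zip, dataset_id)


print(asyncio.run(download("demo")))
```

For step 2, assuming the recommendation referenced here is the common (2 x num_cores) + 1 formula (e.g. from the Gunicorn docs), 17 workers corresponds to an 8-core machine, passed as --workers 17 on the uvicorn command line.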

Expected behavior:

  • When downloading a large dataset, you can immediately navigate away from the page and still request other endpoints.
  • The backend will take some time to prepare the ZIP file.
  • Once the ZIP is ready and the streaming begins, the download will start automatically, regardless of which page you are on.
  • The browser will handle the download process.
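The prepare-then-stream flow above can be sketched with the standard library alone; in the real app the chunk generator would be handed to a streaming response, and all names and sizes here are hypothetical:

```python
import io
import zipfile


def make_zip(files: dict[str, bytes]) -> bytes:
    # Assemble a ZIP in memory: a hypothetical stand-in for the dataset ZIP.
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name, data in files.items():
            zf.writestr(name, data)
    return buf.getvalue()


def stream_chunks(payload: bytes, chunk_size: int = 64 * 1024):
    # Yield fixed-size chunks: the shape a streaming response consumes,
    # so the browser can begin downloading as soon as the ZIP is ready.
    for i in range(0, len(payload), chunk_size):
        yield payload[i : i + chunk_size]


payload = make_zip({"a.txt": b"hello"})
total = sum(len(chunk) for chunk in stream_chunks(payload))
print(total == len(payload))
```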

@tcnichol
Contributor

I tested and made sure that the download would start if I immediately left the page. It did.

On my machine the browser becomes very slow after I click download. Would it perform better if I added more workers, or is it more a limitation of Chrome or Firefox?

@longshuicy longshuicy changed the base branch from main to release/v2.0-beta-3 August 21, 2024 19:39
Contributor

@tcnichol tcnichol left a comment


I tested this and it worked. I was able to navigate away after clicking 'download' on a large dataset, and the dataset did download once the zip file was created.

@lmarini
Member

lmarini commented Aug 27, 2024

Created #1189 to fix the workaround.

@lmarini lmarini merged commit 8c9eb8a into release/v2.0-beta-3 Aug 27, 2024
6 checks passed
@lmarini lmarini deleted the large-file-stall-main-thread branch August 27, 2024 19:37
Development

Successfully merging this pull request may close these issues.

Page is stuck if we try to download a dataset which contains big files
3 participants