
How to avoid Queue Message collision with Visibility Timeout vs Locking vs Dependencies #354

Open
joewinke opened this issue Dec 10, 2024 · 2 comments
Labels: question (Further information is requested)

@joewinke

I am setting up a queue with a cron that runs a worker every n seconds to check the queue for messages.

The messages need to be processed sequentially, in order. If there are 100 messages in a batch, the first worker may have completed only 50 of them after n seconds, at which point the cron will start a second worker.

How do we prevent the second worker from starting on the tasks the first worker is already working on? I can think of using a visibility timeout, creating a locking table, or setting up dependencies, but I am wondering what the best practice is, or which approach pgmq has primitives to support best. Thanks

@brianpursley (Contributor)

Generally speaking, that is what vt (the visibility timeout) is for: to make sure client 2 doesn't process the same messages as client 1. But that only prevents client 2 from reading the messages that client 1 has already read. It doesn't do anything to enforce sequential processing across multiple clients. Are you sure that you absolutely need messages to be processed sequentially?
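For example, a rough sketch using the node-postgres client (the queue name, timeout, and batch size are placeholders):

```ts
// Reading with a visibility timeout hides the returned messages from other readers
// for `vt` seconds, so a second worker polling the same queue won't see them.
import { Client } from "pg";

async function readBatch() {
  const client = new Client(); // connection settings come from PG* env vars
  await client.connect();

  // pgmq.read(queue_name, vt, qty): returns up to `qty` messages and makes them
  // invisible to other consumers for `vt` seconds (here: 60s, 10 messages).
  const { rows } = await client.query(
    "SELECT msg_id, read_ct, message FROM pgmq.read('tasks', 60, 10)"
  );

  for (const row of rows) {
    // ... process row.message ...
    // Delete (or archive) only after successful processing; if the worker dies,
    // the message becomes visible again once the vt expires.
    await client.query("SELECT pgmq.delete('tasks', $1::bigint)", [row.msg_id]);
  }

  await client.end();
}
```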

You mention cron though, and it sounds like what you are asking for is a way to prevent another process from starting when the previously triggered process has not yet finished.

If you are using pg_cron, then by default it will not start a new job when a previous job is still running, according to the docs:

pg_cron can run multiple jobs in parallel, but it runs at most one instance of a job at a time. If a second run is supposed to start before the first one finishes, then the second run is queued and started as soon as the first run completes.
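So if the processing can live in a SQL function, scheduling it with pg_cron gives you that at-most-one-instance behavior for free. A rough sketch, assuming a hypothetical process_queue() function and a recent pg_cron version that supports second-level schedules:

```ts
// Schedule a hypothetical SQL function process_queue() every 30 seconds with pg_cron.
// pg_cron itself guarantees at most one concurrent run per job.
import { Client } from "pg";

async function scheduleWorker() {
  const client = new Client();
  await client.connect();

  // cron.schedule(job_name, schedule, command)
  await client.query(
    "SELECT cron.schedule('process-tasks', '30 seconds', 'SELECT process_queue()')"
  );

  await client.end();
}
```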

If you are using Linux cron, then you have to do some work on your side to prevent multiple instances from running at the same time.

One way might be to use something like run-one in your cron job.

Another way might be to use Postgres advisory locks... lock, process, unlock.
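Roughly like this (a sketch; the lock key is arbitrary, as long as every worker uses the same one):

```ts
// Lock / process / unlock with a Postgres session-level advisory lock.
// A worker that fails to get the lock simply exits and lets the next cron tick retry.
import { Client } from "pg";

const WORKER_LOCK_KEY = 421337; // arbitrary bigint; all workers must agree on it

async function runWorkerOnce() {
  const client = new Client();
  await client.connect();
  try {
    // lock: returns false immediately if another session already holds the key
    const { rows } = await client.query(
      "SELECT pg_try_advisory_lock($1) AS locked",
      [WORKER_LOCK_KEY]
    );
    if (!rows[0].locked) return; // another worker is running

    // process: drain the queue here (e.g. loop over pgmq.read until it returns no rows)

    // unlock: also released automatically if the session disconnects or crashes
    await client.query("SELECT pg_advisory_unlock($1)", [WORKER_LOCK_KEY]);
  } finally {
    await client.end();
  }
}
```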

@ChuckHend added the question (Further information is requested) label on Dec 17, 2024
@joewinke (Author)

joewinke commented Jan 1, 2025

Thanks for the feedback. The way I solved this: queue.ts creates all the messages and gives each one a project id. Then worker.ts reads and locks (vt > 0) one message and gets its project id, peeks at the next 100 messages with vt = 0, finds the ones whose project id matches, and reads those messages with vt > 0 to lock them all. The worker then works through all the messages that match the project id. If the worker times out or fails, the cron will spin up another worker to finish off that project id. Does this make sense?
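Roughly, the worker flow looks something like this (a simplified sketch; the queue name, project_id field, and timeouts are placeholders for my actual names):

```ts
// Simplified sketch of the flow described above.
import { Client } from "pg";

async function workOneProject(client: Client) {
  // 1. Read and lock a single message to claim a project id.
  const { rows: first } = await client.query(
    "SELECT msg_id, message FROM pgmq.read('tasks', 300, 1)"
  );
  if (first.length === 0) return;
  const projectId = first[0].message.project_id;

  // 2. Peek at the next 100 messages without locking them (vt = 0 leaves them visible).
  const { rows: peeked } = await client.query(
    "SELECT msg_id, message FROM pgmq.read('tasks', 0, 100)"
  );

  // 3. Lock the ones that belong to the same project by bumping their vt.
  const mine = peeked.filter((r) => r.message.project_id === projectId);
  for (const r of mine) {
    await client.query("SELECT pgmq.set_vt('tasks', $1::bigint, 300)", [r.msg_id]);
  }

  // 4. Process the claimed messages in order, deleting each one as it completes.
  for (const r of [first[0], ...mine]) {
    // ... do the work for r.message ...
    await client.query("SELECT pgmq.delete('tasks', $1::bigint)", [r.msg_id]);
  }
}
```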
