
How to avoid Queue Message collision with Visibility Timeout vs Locking vs Dependencies #354

Open
joewinke opened this issue Dec 10, 2024 · 2 comments
Labels: question (Further information is requested)

@joewinke

I am setting up a queue with a cron that runs a worker every n seconds to check the queue for messages.

The messages need to be processed sequentially, in order. If there are 100 messages in a batch, the first worker may have completed only 50 of them after n seconds, at which point the cron will start a second worker.

How do we prevent the second worker from starting on the tasks the first worker is already working on? I can think of using a visibility timeout, creating a locking table, or setting up dependencies, but I am wondering what the best practice is, or which approach pgmq has primitives to support best. Thanks

@brianpursley (Contributor)

Generally speaking, that is what vt (the visibility timeout) is for: to make sure client 2 doesn't process the same messages as client 1. But that only prevents client 2 from reading the messages that client 1 has already read. It doesn't do anything to enforce sequential processing across multiple clients. Are you sure that you absolutely need messages to be processed sequentially?
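For example, a rough sketch using the node-postgres client (the queue name, timeout, and batch size are placeholders):

```ts
// Reading with a visibility timeout hides the returned messages from other readers
// for `vt` seconds, so a second worker polling the same queue won't see them.
import { Client } from "pg";

async function readBatch() {
  const client = new Client(); // connection settings come from PG* env vars
  await client.connect();

  // pgmq.read(queue_name, vt, qty): returns up to `qty` messages and makes them
  // invisible to other consumers for `vt` seconds (here: 60s, 10 messages).
  const { rows } = await client.query(
    "SELECT msg_id, read_ct, message FROM pgmq.read('tasks', 60, 10)"
  );

  for (const row of rows) {
    // ... process row.message ...
    // Delete (or archive) only after successful processing; if the worker dies,
    // the message becomes visible again once the vt expires.
    await client.query("SELECT pgmq.delete('tasks', $1::bigint)", [row.msg_id]);
  }

  await client.end();
}
```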

You mention cron though, and it sounds like what you are asking for is a way to prevent another process from starting when the previously triggered process has not yet finished.

If you are using pg_cron, then by default it will not start a new job when a previous job is still running, according to the docs:

pg_cron can run multiple jobs in parallel, but it runs at most one instance of a job at a time. If a second run is supposed to start before the first one finishes, then the second run is queued and started as soon as the first run completes.
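So if the processing can live in a SQL function, scheduling it with pg_cron gives you that at-most-one-instance behavior for free. A rough sketch, assuming a hypothetical process_queue() function and a recent pg_cron version that supports second-level schedules:

```ts
// Schedule a hypothetical SQL function process_queue() every 30 seconds with pg_cron.
// pg_cron itself guarantees at most one concurrent run per job.
import { Client } from "pg";

async function scheduleWorker() {
  const client = new Client();
  await client.connect();

  // cron.schedule(job_name, schedule, command)
  await client.query(
    "SELECT cron.schedule('process-tasks', '30 seconds', 'SELECT process_queue()')"
  );

  await client.end();
}
```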

If you are using Linux cron, then you have to do some work on your side to prevent multiple instances from running at the same time.

One way might be to use something like run-one in your cron job.

Another way might be to use Postgres advisory locks... lock, process, unlock.
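Roughly like this (a sketch; the lock key is arbitrary, as long as every worker uses the same one):

```ts
// Lock / process / unlock with a Postgres session-level advisory lock.
// A worker that fails to get the lock simply exits and lets the next cron tick retry.
import { Client } from "pg";

const WORKER_LOCK_KEY = 421337; // arbitrary bigint; all workers must agree on it

async function runWorkerOnce() {
  const client = new Client();
  await client.connect();
  try {
    // lock: returns false immediately if another session already holds the key
    const { rows } = await client.query(
      "SELECT pg_try_advisory_lock($1) AS locked",
      [WORKER_LOCK_KEY]
    );
    if (!rows[0].locked) return; // another worker is running

    // process: drain the queue here (e.g. loop over pgmq.read until it returns no rows)

    // unlock: also released automatically if the session disconnects or crashes
    await client.query("SELECT pg_advisory_unlock($1)", [WORKER_LOCK_KEY]);
  } finally {
    await client.end();
  }
}
```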

@ChuckHend added the question (Further information is requested) label on Dec 17, 2024
@joewinke (Author)

joewinke commented Jan 1, 2025

Thanks for the feedback. The way I solved this: queue.ts creates all the messages and gives each one a project id. Then worker.ts reads and locks (vt > 0) one message and gets its project id, peeks at the next 100 messages with vt = 0, finds the ones whose project id matches, and reads those messages with vt > 0 to lock them all. The worker then works through all the messages that match the project id. If the worker times out or fails, the cron will spin up another worker to finish off that project id. Does this make sense?
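Roughly, the worker flow looks something like this (a simplified sketch; the queue name, project_id field, and timeouts are placeholders for my actual names):

```ts
// Simplified sketch of the flow described above.
import { Client } from "pg";

async function workOneProject(client: Client) {
  // 1. Read and lock a single message to claim a project id.
  const { rows: first } = await client.query(
    "SELECT msg_id, message FROM pgmq.read('tasks', 300, 1)"
  );
  if (first.length === 0) return;
  const projectId = first[0].message.project_id;

  // 2. Peek at the next 100 messages without locking them (vt = 0 leaves them visible).
  const { rows: peeked } = await client.query(
    "SELECT msg_id, message FROM pgmq.read('tasks', 0, 100)"
  );

  // 3. Lock the ones that belong to the same project by bumping their vt.
  const mine = peeked.filter((r) => r.message.project_id === projectId);
  for (const r of mine) {
    await client.query("SELECT pgmq.set_vt('tasks', $1::bigint, 300)", [r.msg_id]);
  }

  // 4. Process the claimed messages in order, deleting each one as it completes.
  for (const r of [first[0], ...mine]) {
    // ... do the work for r.message ...
    await client.query("SELECT pgmq.delete('tasks', $1::bigint)", [r.msg_id]);
  }
}
```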
