Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sanboxed workers doesn't trigger failed events when the worker was killed. #1712

Open
kennethpdev opened this issue Apr 27, 2020 · 3 comments

Comments

@kennethpdev
Copy link

kennethpdev commented Apr 27, 2020

Description

I have a worker in a separate process (sanboxed).

const worker = path.resolve(__dirname, 'worker.js');
queue.process(2, worker);

But whenever this worker terminates or force killed like a server stop/restarts the queue failed event never triggered and I always get the error below.

0|audio-worker  | /mnt/d/Git/audio-worker/node_modules/bull/lib/process/master.js:99
0|audio-worker  |   throw err;
0|audio-worker  |   ^
0|audio-worker  | Error [ERR_IPC_CHANNEL_CLOSED]: Channel closed
0|audio-worker  |     at process.target.send (internal/child_process.js:636:16)
0|audio-worker  |     at Promise.resolve.then.err (/mnt/d/Git/audio-worker/node_modules/bull/lib/process/master.js:76:21)

sometimes its Error: write EPIPE

Different from when I choose not to make it a separate process.

const worker = require('./worker');
queue.process(2, worker);

This setup always receive the failed event and no error. Is this normal, like does sandboxed never trigger the failed event or something was wrong?

Minimal, Working Test code to reproduce the issue.

My worker is simple I can replicate it with this simple reject.

module.exports = async (job) => {
  try {
    const { id: jobId } = job;
    console.log('Job worker started', jobId);

    // simulated heavy process that takes 30seconds to complete
    await new Promise((resolve, reject) => {
      const makeError = () => {
        console.log('Killing job', jobId);
        reject(new Error('Error from long process'));
      };

      process.on('SIGTERM', makeError);
      process.on('SIGINT', makeError);

      setTimeout(resolve, 30000);
    });

    // success
    return Promise.resolve();
  } catch (err) {
    console.log('worker error', err.message);
    return Promise.reject(new Error("This should trigger the failed event"));
  }
};

for this particular test I use pm2 to start the app, and then pm2 stop <name> to properly trigger the SIGINT. But I think ctrl+c also works.

To replicate, wait for a few seconds after the log Job worker started. Then terminate the process.

Bull version

v3.13.0

Additional information

node v10.20.1
ioredis v4.16.2

@robolivable
Copy link

@kenma9123 Does the SIG event trigger if your process is not long running?

I'm seeing a similar behavior, but in my case the SIGTERM event handler gets fired only after the child processor completes its work. This effectively prevents any interruption of the child process while it's working, which makes it hard to implement graceful shutdown logic.

@stale
Copy link

stale bot commented Jul 12, 2021

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the wontfix label Jul 12, 2021
@stale stale bot removed the wontfix label Jul 14, 2021
@phips28
Copy link

phips28 commented Sep 7, 2021

We are also using PM2 as main process manager, that starts a sandboxed processor.
When I do a pm2 restart all, the running processes wont be killed. IMO a restart should also kill the sub processes. (e.g. redeploy)

Our current code for that in the processor:

    // setup proper exit handling
    const shutdown = async () => {
      // console.log('shutdown', e);
      // ... do other tear down stuff in DB
      process.exit(410); // parent gone
    };
    process.on('SIG', shutdown);
    process.on('SIGTERM', shutdown);
    process.on('SIGINT', shutdown);

Currently SIGINT is being called when I stop PM2.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants