Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

connection manager stop trying to reconnect after connection failed #310

Open
elizatlawy opened this issue Nov 23, 2022 · 4 comments
Open

Comments

@elizatlawy
Copy link

We had an issue that we are currently unable to reproduce with a device that was under a low network,
The device had 10 disconnection events that were followed up by a successful reconnection,
but then on the 11 disconnect, even the device received connectFailed event and then did not try to reconnect anymore, does a connectFailed not lead to a reconnection attempt? is this is a know issue?

Version: 4.1.6
amqp version: 0.10.1

code of reconnect mechanism:

initRabbitMQConnection({ handleMessageFunction }) {
		try {
			this._handleFunction = handleMessageFunction;
			const connection = amqp.connect(RABBITMQ_CONNECTION_URI, {
				heartbeatIntervalInSeconds: 5,
				reconnectTimeInSeconds: 1
			});
			connection.on('connect', () => {
				logger.info(`Successfully connected`);
			});
			connection.on('connectFailed', ({ err }) => {
				logger.error(`Connection failed due to: ${err}`);
			});
			connection.on('disconnect', ({ err }) => {
				logger.error(`Connection disconnected due to: ${err}`);
			});
			connection.on('error', err => {
				logger.error(`Connection encountered an error: ${err}`);
			});
			connection.on('blocked', ({ reason }) => {
				logger.error(`Connection blocked due to: ${reason}`);
			});
			connection.on('unblocked', () => {
				logger.info(`Connection unblocked`);
			});
			this._channelWrapper = connection.createChannel({
				// The setup function will rerun on each reconnection
				setup: async channel => {
					// channel = the internal amqplib channel to be used for asserting/consuming.
					try {
						await channel.assertQueue(QUEUE.name, QUEUE.queueOptions);
						logger.info(`Successfully asserted queue: ${QUEUE.name}`);
						await this.bindQueueToTopics(channel);
						await this.attacheHandleFunctionToTopics(channel);
						logger.info(`Successfully finished channel setup`);
					} catch (err) {
						logger.error(`Error on Channel setup: ${err}`);
					}
				}
			});
			this._channelWrapper.on('error', (err, { name }) => {
				logger.error(`Channel: ${name} encounter an error: ${err}`);
			});
			this._channelWrapper.on('close', () => {
				logger.error(`Channel closed`);
			});
		} catch (err) {
			logger.error(`Error connecting to RabbitMQ: ${err}`);
		}
	}

Logs:

timestamp,message
1668697912766,"{""[2022-11-17 17:11:52] [INFO] [RabbitMQ]"":""Successfully connected""}"
1668697913417,"{""[2022-11-17 17:11:53] [INFO] [RabbitMQ]"":""Successfully asserted queue: Device-117""}"
1668697917795,"{""[2022-11-17 17:11:57] [INFO] [RabbitMQ]"":""Finished bound queue: Device-117 to all cloud topics""}"
1668697917987,"{""[2022-11-17 17:11:57] [INFO] [RabbitMQ]"":""Successfully attached handle function to all cloud topics""}"
1668697917988,"{""[2022-11-17 17:11:57] [INFO] [RabbitMQ]"":""Successfully finished channel setup""}"
1668698932888,"{""[2022-11-17 17:28:52] [ERROR] [RabbitMQ]"":""Connection disconnected due to: Error: Unexpected close""}"
1668698934997,"{""[2022-11-17 17:28:54] [INFO] [RabbitMQ]"":""Successfully connected""}"
1668698935562,"{""[2022-11-17 17:28:55] [INFO] [RabbitMQ]"":""Successfully asserted queue: Device-117""}"
1668698940237,"{""[2022-11-17 17:29:00] [INFO] [RabbitMQ]"":""Finished bound queue: Device-117 to all cloud topics""}"
1668698940434,"{""[2022-11-17 17:29:00] [INFO] [RabbitMQ]"":""Successfully attached handle function to all cloud topics""}"
1668698940435,"{""[2022-11-17 17:29:00] [INFO] [RabbitMQ]"":""Successfully finished channel setup""}"
1668720704114,"{""[2022-11-17 23:31:44] [ERROR] [RabbitMQ]"":""Connection disconnected due to: Error: Unexpected close""}"
1668720707560,"{""[2022-11-17 23:31:47] [INFO] [RabbitMQ]"":""Successfully connected""}"
1668720708831,"{""[2022-11-17 23:31:48] [INFO] [RabbitMQ]"":""Successfully asserted queue: Device-117""}"
1668720724196,"{""[2022-11-17 23:32:04] [INFO] [RabbitMQ]"":""Finished bound queue: Device-117 to all cloud topics""}"
1668720725747,"{""[2022-11-17 23:32:05] [INFO] [RabbitMQ]"":""Successfully attached handle function to all cloud topics""}"
1668720725748,"{""[2022-11-17 23:32:05] [INFO] [RabbitMQ]"":""Successfully finished channel setup""}"
1668724317773,"{""[2022-11-18 00:31:57] [ERROR] [RabbitMQ]"":""Connection disconnected due to: Error: Heartbeat timeout""}"
1668724330295,"{""[2022-11-18 00:32:10] [INFO] [RabbitMQ]"":""Successfully connected""}"
1668724330883,"{""[2022-11-18 00:32:10] [INFO] [RabbitMQ]"":""Successfully asserted queue: Device-117""}"
1668724335460,"{""[2022-11-18 00:32:15] [INFO] [RabbitMQ]"":""Finished bound queue: Device-117 to all cloud topics""}"
1668724335667,"{""[2022-11-18 00:32:15] [INFO] [RabbitMQ]"":""Successfully attached handle function to all cloud topics""}"
1668724335668,"{""[2022-11-18 00:32:15] [INFO] [RabbitMQ]"":""Successfully finished channel setup""}"
1668745906393,"{""[2022-11-18 06:31:46] [ERROR] [RabbitMQ]"":""Connection disconnected due to: Error: Heartbeat timeout""}"
1668745909950,"{""[2022-11-18 06:31:49] [INFO] [RabbitMQ]"":""Successfully connected""}"
1668745911166,"{""[2022-11-18 06:31:51] [INFO] [RabbitMQ]"":""Successfully asserted queue: Device-117""}"
1668745918988,"{""[2022-11-18 06:31:58] [INFO] [RabbitMQ]"":""Finished bound queue: Device-117 to all cloud topics""}"
1668745919188,"{""[2022-11-18 06:31:59] [INFO] [RabbitMQ]"":""Successfully attached handle function to all cloud topics""}"
1668745919189,"{""[2022-11-18 06:31:59] [INFO] [RabbitMQ]"":""Successfully finished channel setup""}"
1668762415727,"{""[2022-11-18 11:06:55] [ERROR] [RabbitMQ]"":""Connection disconnected due to: Error: Heartbeat timeout""}"
1668762452091,"{""[2022-11-18 11:07:32] [ERROR] [RabbitMQ]"":""Connection failed due to: Error: Socket closed abruptly during opening handshake""}"
1668762504212,"{""[2022-11-18 11:08:24] [INFO] [RabbitMQ]"":""Successfully connected""}"
1668762504794,"{""[2022-11-18 11:08:24] [INFO] [RabbitMQ]"":""Successfully asserted queue: Device-117""}"
1668762509228,"{""[2022-11-18 11:08:29] [INFO] [RabbitMQ]"":""Finished bound queue: Device-117 to all cloud topics""}"
1668762509428,"{""[2022-11-18 11:08:29] [INFO] [RabbitMQ]"":""Successfully attached handle function to all cloud topics""}"
1668762509429,"{""[2022-11-18 11:08:29] [INFO] [RabbitMQ]"":""Successfully finished channel setup""}"
1668762534214,"{""[2022-11-18 11:08:54] [ERROR] [RabbitMQ]"":""Connection disconnected due to: Error: Heartbeat timeout""}"
1668762560243,"{""[2022-11-18 11:09:20] [ERROR] [RabbitMQ]"":""Connection failed due to: Error: getaddrinfo EAI_AGAIN BROKER_URL""}"
1668762586266,"{""[2022-11-18 11:09:46] [ERROR] [RabbitMQ]"":""Connection failed due to: Error: getaddrinfo EAI_AGAIN BROKER_URL""}"
1668762607296,"{""[2022-11-18 11:10:07] [ERROR] [RabbitMQ]"":""Connection failed due to: Error: getaddrinfo EAI_AGAIN BROKER_URL""}"
1668762628316,"{""[2022-11-18 11:10:28] [ERROR] [RabbitMQ]"":""Connection failed due to: Error: getaddrinfo EAI_AGAIN BROKER_URL""}"
1668762640590,"{""[2022-11-18 11:10:40] [INFO] [RabbitMQ]"":""Successfully connected""}"
1668762641187,"{""[2022-11-18 11:10:41] [INFO] [RabbitMQ]"":""Successfully asserted queue: Device-117""}"
1668762645947,"{""[2022-11-18 11:10:45] [INFO] [RabbitMQ]"":""Finished bound queue: Device-117 to all cloud topics""}"
1668762646147,"{""[2022-11-18 11:10:46] [INFO] [RabbitMQ]"":""Successfully attached handle function to all cloud topics""}"
1668762646148,"{""[2022-11-18 11:10:46] [INFO] [RabbitMQ]"":""Successfully finished channel setup""}"
1668762675591,"{""[2022-11-18 11:11:15] [ERROR] [RabbitMQ]"":""Connection disconnected due to: Error: Heartbeat timeout""}"
1668763258182,"{""[2022-11-18 11:20:58] [ERROR] [RabbitMQ]"":""Connection failed due to: Error: read ECONNRESET""}"
1668763279204,"{""[2022-11-18 11:21:19] [ERROR] [RabbitMQ]"":""Connection failed due to: Error: getaddrinfo EAI_AGAIN BROKER_URL""}"
1668763300226,"{""[2022-11-18 11:21:40] [ERROR] [RabbitMQ]"":""Connection failed due to: Error: getaddrinfo EAI_AGAIN BROKER_URL""}"
1668763307545,"{""[2022-11-18 11:21:47] [INFO] [RabbitMQ]"":""Successfully connected""}"
1668763308142,"{""[2022-11-18 11:21:48] [INFO] [RabbitMQ]"":""Successfully asserted queue: Device-117""}"
1668763313052,"{""[2022-11-18 11:21:53] [INFO] [RabbitMQ]"":""Finished bound queue: Device-117 to all cloud topics""}"
1668763313270,"{""[2022-11-18 11:21:53] [INFO] [RabbitMQ]"":""Successfully attached handle function to all cloud topics""}"
1668763313270,"{""[2022-11-18 11:21:53] [INFO] [RabbitMQ]"":""Successfully finished channel setup""}"
1668763337549,"{""[2022-11-18 11:22:17] [ERROR] [RabbitMQ]"":""Connection disconnected due to: Error: Heartbeat timeout""}"
1668763358567,"{""[2022-11-18 11:22:38] [ERROR] [RabbitMQ]"":""Connection failed due to: Error: getaddrinfo EAI_AGAIN BROKER_URL""}"
1668763413571,"{""[2022-11-18 11:23:33] [ERROR] [RabbitMQ]"":""Connection failed due to: Error: getaddrinfo EAI_AGAIN BROKER_URL""}"
1668763430416,"{""[2022-11-18 11:23:50] [INFO] [RabbitMQ]"":""Successfully connected""}"
1668763445419,"{""[2022-11-18 11:24:05] [ERROR] [RabbitMQ]"":""Connection disconnected due to: Error: Heartbeat timeout""}"
1668763445420,"{""[2022-11-18 11:24:05] [ERROR] [RabbitMQ]"":""Channel: undefined encounter an error: Error: Channel ended, no reply will be forthcoming""}"
1668763535071,"{""[2022-11-18 11:25:35] [ERROR] [RabbitMQ]"":""Connection failed due to: Error: read ECONNRESET""}"
1668763564869,"{""[2022-11-18 11:26:04] [INFO] [RabbitMQ]"":""Successfully connected""}"
1668763576954,"{""[2022-11-18 11:26:16] [ERROR] [RabbitMQ]"":""Connection disconnected due to: Error: read ECONNRESET""}"
1668763576955,"{""[2022-11-18 11:26:16] [ERROR] [RabbitMQ]"":""Channel: undefined encounter an error: Error: Channel ended, no reply will be forthcoming""}"
1668763589067,"{""[2022-11-18 11:26:29] [INFO] [RabbitMQ]"":""Successfully connected""}"
1668763589665,"{""[2022-11-18 11:26:29] [INFO] [RabbitMQ]"":""Successfully asserted queue: Device-117""}"
1668763594464,"{""[2022-11-18 11:26:34] [INFO] [RabbitMQ]"":""Finished bound queue: Device-117 to all cloud topics""}"
1668763594664,"{""[2022-11-18 11:26:34] [INFO] [RabbitMQ]"":""Successfully attached handle function to all cloud topics""}"
1668763594665,"{""[2022-11-18 11:26:34] [INFO] [RabbitMQ]"":""Successfully finished channel setup""}"
1668763624070,"{""[2022-11-18 11:27:04] [ERROR] [RabbitMQ]"":""Connection disconnected due to: Error: Heartbeat timeout""}"
1668763645092,"{""[2022-11-18 11:27:25] [ERROR] [RabbitMQ]"":""Connection failed due to: Error: getaddrinfo EAI_AGAIN BROKER_URL""}"
@Omarosh
Copy link

Omarosh commented Mar 15, 2023

Any news about the issue?

@jwalton
Copy link
Owner

jwalton commented Apr 4, 2023

does a connectFailed not lead to a reconnection attempt?

It should keep reconnecting until you ask it to stop. If you suspect this is because of some kind of ordering-of-connects-and-disconnects problem, I'd suggesting trying to add a unit test to this project that recreates it. If you can figure out a way to recreate this, I'd be interested in that too.

@elizatlawy
Copy link
Author

does a connectFailed not lead to a reconnection attempt?

It should keep reconnecting until you ask it to stop. If you suspect this is because of some kind of ordering-of-connects-and-disconnects problem, I'd suggesting trying to add a unit test to this project that recreates it. If you can figure out a way to recreate this, I'd be interested in that too.

I will try to reproduce it with a unit test and update here if I am able to.

@p0vidl0
Copy link

p0vidl0 commented Apr 25, 2024

For a while, we've been dealing with the same problem: getting stuck when trying to reconnect on an unreliable network. It seems that by setting a timeout value in the options like this: { connectionOptions: { timeout: 10000 } }, we've managed to resolve the issue. This solution was also mentioned in #14.

From what I understand now, the default node.js net socket doesn't have a finite timeout, and the amqp.connect promise can hang in certain situations on an unstable network. It may look like the library is not trying to reconnect.

In my opinion, it's best practice to always configure this timeout. I believe it would be beneficial to emphasize this point in the library documentation.

Any way, thanks for a great library! @jwalton

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants