Jenkins loses connection to SQS queues #4
Comments
Hi @nickgrealy Thank you in advance, |
@ayashjorden can you replicate this on a vanilla installation? I did experience issues with long running tasks and jenkins, but that was due to faulty/interfering plugins. |
@ayashjorden The current mainline of the plugin uses a relatively old AWS SDK. If @nickgrealy has time to cut a release, the following 2-3 weeks should bring some good changes. Are you using ALL functionality (SQS, SNS, and CodeCommit), or just the SQS part? I've been using the SQS part solely, and the only problems it really had were related to easier automation of configuration. |
Hello @mvk, Several times I saw an email that corresponds to an event in SQS, but the event didn't get to Jenkins. |
Hello @mvk @nickgrealy, On January 14th 2018, 05:18:14.000 UTC, the SQS plugin lost its connection:
Strange.... |
@lifeofguenter see ^^^ |
@ayashjorden just to make sure, can you maybe print the list of your plugins with:
I am very very reluctant to debug it right now :-))))) |
|
We experienced the error too. Version 2.0.1 of the plugin. Single job. Same error log @ayashjorden posted above. It stops working for a few hours and then it resumes the queue consumption without any intervention 🤔 |
I'm going to ask AWS support about long running client connections
Will post what we find when answers are provided.
Any additional questions I should add to the ticket?
Best,
Yarden
On Feb 28, 2018 1:19 AM, Yamil Asusta wrote:
We experienced the error too. Version 2.0.1 of the plugin. Single job. Same error log @ayashjorden posted above.
It stops working for a few hours and then it resumes the queue consumption without any intervention 🤔
|
@ayashjorden - have you figured out what is happening? I am seeing the same issue with plugin version 2.0.1. If I go into After clicking on
|
We were seeing issues similar to this when our SQS queues were set up with the FIFO queue type. We converted all of our queues to the Standard queue type and no longer see this behavior. |
All of our queues were set up as |
With help from AWS support, I think the culprit is the JVM and DNS caching. AWS suggests setting the TTL to 60, so following this example, I've added the following to the
🤞 for 2-3 weeks |
28 days later, and our jenkins master has not lost connection to SQS. 👍 Maybe this setting should be mentioned in the docs? |
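For anyone landing here later: the exact lines added in the comment above were not captured in this thread, so the following is only a minimal sketch of one way to get a 60-second DNS cache TTL in a JVM. It assumes the standard Java security property `networkaddress.cache.ttl`; the class name `DnsCacheTtl` and the method `applyTtl` are hypothetical, and this is not necessarily what was changed on the Jenkins master in question.

```java
import java.security.Security;

public class DnsCacheTtl {
    // Standard Java security property: how long (in seconds) the JVM
    // caches successful DNS lookups.
    static final String TTL_PROPERTY = "networkaddress.cache.ttl";

    // Hypothetical helper: sets the TTL. It should run early, before the JVM
    // does its first name lookup, because the cache policy is read once.
    public static void applyTtl(int seconds) {
        Security.setProperty(TTL_PROPERTY, Integer.toString(seconds));
    }

    public static void main(String[] args) {
        applyTtl(60); // 60 seconds is the value AWS support suggested above
        System.out.println(TTL_PROPERTY + "=" + Security.getProperty(TTL_PROPERTY));
    }
}
```

On a Jenkins master you usually cannot edit `main`, so the same property would more likely be set in the JVM's `java.security` file or through the startup options; which of those applies depends on how Jenkins is launched in your setup.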
Still losing connections randomly. Going to try |
Hi @metacyclic, any update? |
Hi @ayashjorden. I haven't had an issue yet. If it creeps up, I will post back. |
@ayashjorden - the problem persists. Have had it happen a few times over the last couple of weeks. Looking at other solutions |
We migrated to polling the queues ourselves (not using this plugin), and stability is much higher now. |
What tool do you use to poll the SQS queues? |
Hi,
First, this plugin is great! It helped me build a cool automated flow for handling events.
Now, for my issue,
My Jenkins setup is composed of a master that spawns slaves on top of DC/OS (using Marathon).
I have three jobs that are configured to trigger based on events flowing from three queues.
Today is the second time I have discovered that Jenkins stopped triggering jobs from the queues. I've verified that the flow of events on the AWS side is working.
The first time it happened was two months ago (IMMSR).
I looked at the Jenkins logs; they're a mess, and I didn't find anything.
Has anyone else experienced this?
If I want to set up a monitor for this, what should I look for in the logs?
This plugin is part of our critical path event handling and I would like to know that I can count on it.
I'd be happy to provide more information, just ask 👍
Yarden