Apache ActiveMQ Artemis: Peer disconnected #26

Open
themerius opened this issue Oct 1, 2015 · 19 comments

Comments

@themerius

Hi there,
I've tested this library with Apache ActiveMQ Artemis 1.1.0, and it works.
But if my process idles some time, it gets this exception:

background log: error: java.io.EOFException: Peer disconnected
background log: error:  at org.fusesource.hawtdispatch.transport.AbstractProtocolCodec.read(AbstractProtocolCodec.java:331)
background log: error:  at org.fusesource.hawtdispatch.transport.TcpTransport.drainInbound(TcpTransport.java:706)
background log: error:  at org.fusesource.hawtdispatch.transport.TcpTransport$6.run(TcpTransport.java:588)
background log: error:  at org.fusesource.hawtdispatch.internal.NioDispatchSource$3.run(NioDispatchSource.java:209)
background log: error:  at org.fusesource.hawtdispatch.internal.SerialDispatchQueue.run(SerialDispatchQueue.java:100)
background log: error:  at org.fusesource.hawtdispatch.internal.pool.SimpleThread.run(SimpleThread.java:77)

Maybe something is timing out (heartbeat too slow?), which causes the server to disconnect this client? Using factory.setDisconnectTimeout(...) has no effect, but maybe I'm looking in the wrong place?

For comparison: on Apache Apollo the connection remains open.

@clebertsuconic

can you provide a simple test replicating this?

@clebertsuconic

duh.. this is Java.. so simple to check... we should open a JIRA on artemis.

@themerius do you want to do the honors, or should I open it?

@clebertsuconic

close this one.. I will take a look through the JIRA.
I will fix it on Artemis. will get back here if I see any issues. thanks

@themerius
Author

Thanks for your fast reply! I would be happy to help you test the fixed Artemis.

@clebertsuconic

@themerius :
How are you using this? I couldn't get StompJMS to send any keep-alives (I checked with a debugger). I may be wrong, but I couldn't see any.

I already see a few things in the Stomp manager that need improvement, but I'm not sure how to get actual keep-alive frames sent.

I added an example on master using stomp-jms, that maybe you could tweak to replicate the issue you are seeing:

https://github.com/apache/activemq-artemis/tree/master/examples/protocols/stomp/stomp-jms

Can you help with that? Otherwise I won't know how to replicate your issue.

themerius added a commit to themerius/activemq-artemis that referenced this issue Oct 12, 2015
@themerius
Author

@clebertsuconic :
I've hacked on your example a little: I added an infinite loop that waits for messages, plus an option to send a message every second.

Have a look at my branch:
https://github.com/themerius/activemq-artemis/tree/master/examples/protocols/stomp/stomp-jms

Run with mvn verify and after roughly 60 seconds you get the Peer disconnected exception. (It waits for messages, but none are arriving.)

Run with mvn verify -Dtraffic=true to produce some message traffic (one message every second), and there will be no exception.

@jscheid

jscheid commented Oct 21, 2015

@clebertsuconic @themerius I don't know StompJMS at all, but as far as I can tell this is working as expected on the Artemis side of things.

Artemis will close the connection if no data has been received for a certain amount of time. By default it will check every 30 seconds, and will evict connections that haven't received data in 30 seconds. So depending on when exactly you send data, connections are closed anytime between 30s-60s after data was last received.
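
As a rough illustration of the 30s-60s window described above, the sketch below models a checker that fires every 30 seconds and evicts connections that have been idle for at least 30 seconds. This is not Artemis code: the class name, the method name, and the assumption that checks fire at exact multiples of the interval with an "at least TTL" comparison are all hypothetical.

```java
/**
 * Toy model of a periodic idle check, not taken from Artemis.
 * Assumption: checks fire at 0 ms, 30000 ms, 60000 ms, ... and a connection
 * is evicted at the first check where it has been idle for at least TTL_MS.
 */
final class IdleEvictionWindow {
    // Both values come from the comment above: check every 30 s, evict after 30 s idle.
    static final long CHECK_INTERVAL_MS = 30_000;
    static final long TTL_MS = 30_000;

    /** Returns how many ms after the last received data the connection is evicted. */
    static long evictionDelayMs(long lastDataMs) {
        // Earliest moment at which the connection counts as idle for TTL_MS.
        long earliest = lastDataMs + TTL_MS;
        // Round up to the next check tick (integer ceiling division).
        long tick = ((earliest + CHECK_INTERVAL_MS - 1) / CHECK_INTERVAL_MS) * CHECK_INTERVAL_MS;
        return tick - lastDataMs;
    }

    public static void main(String[] args) {
        for (long last : new long[] {0, 1, 15_000, 29_999}) {
            System.out.printf("last data at %d ms, evicted %d ms later%n",
                    last, evictionDelayMs(last));
        }
    }
}
```

Under these assumptions the delay is shortest when data arrives exactly on a check tick and approaches 60 seconds when data arrives just after one, which is consistent with the disconnect after about 60 seconds reported earlier in the thread.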

It looks like StompJMS doesn't have any support for heart-beating (see #17) so if you're not sending anything yourself then the connection will be closed after a while.

If you want Artemis to behave more like Apollo, you can increase the timeouts. (I don't know what timeouts Apollo uses by default, but evidently they are higher.)

Alternatively, somebody could implement proper heart-beating in StompJMS or you could add poor man's heart-beating to your application code (i.e. manually sending a dummy message at regular intervals).
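
A minimal sketch of the "poor man's heart-beating" idea, using only the JDK scheduler. The class name PoorMansHeartbeat and the sendDummyMessage callback are hypothetical; in a real application the callback would wrap whatever send call the application already uses (for example a JMS producer sending a small message to a throwaway destination). Nothing here is StompJMS API.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Runs a caller-supplied "send dummy message" action at a fixed interval. */
final class PoorMansHeartbeat implements AutoCloseable {
    // Daemon thread so the heartbeat never keeps the JVM alive on its own.
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "poor-mans-heartbeat");
                t.setDaemon(true);
                return t;
            });

    PoorMansHeartbeat(Runnable sendDummyMessage, long interval, TimeUnit unit) {
        scheduler.scheduleAtFixedRate(() -> {
            try {
                sendDummyMessage.run();
            } catch (RuntimeException e) {
                // An uncaught exception would cancel all future runs, so log and continue.
                System.err.println("heartbeat send failed: " + e);
            }
        }, interval, interval, unit);
    }

    @Override
    public void close() {
        scheduler.shutdownNow();
    }

    public static void main(String[] args) throws InterruptedException {
        // Stand-in for a real send: just count invocations.
        AtomicInteger sent = new AtomicInteger();
        try (PoorMansHeartbeat hb =
                     new PoorMansHeartbeat(sent::incrementAndGet, 50, TimeUnit.MILLISECONDS)) {
            Thread.sleep(300);
        }
        System.out.println("dummy messages sent: " + sent.get());
    }
}
```

Against the defaults described above, an interval comfortably below 30 seconds (say 10 seconds) should keep the connection alive. Note that a JMS Session is not safe for concurrent use, so the callback should send through a session dedicated to the heartbeat thread.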

@themerius
Author

@jscheid Thanks for your reply. I feared something like that.

Is it possible to configure an acceptor so that the connection TTL is infinite? I've tried something like this after reading this:

<acceptor name="stomp">tcp://0.0.0.0:61613?protocols=STOMP;stompEnableMessageId=true;connectionTtl=-1</acceptor>

But this does not seem to be the right way to make the timeout infinite?

But indeed, a neat solution would be to have heart beats.

@jscheid

jscheid commented Oct 21, 2015

It looks right to me, does it not work? @clebertsuconic knows more about Artemis configuration, perhaps he can chime in.

@themerius
Author

I still get Peer disconnected after about 60 seconds. Whatever value I choose (-1, 10000, 999999), it still disconnects after 60 seconds. (I've used the test code from https://github.com/themerius/activemq-artemis/tree/master/examples/protocols/stomp/stomp-jms)

@clebertsuconic

There is the ttlOverride setting in the main configuration, but nothing on the acceptor, I'm afraid (although it would make sense and looks like an easy change).

By definition, the TTL for STOMP should be -1 if no TTL or ping is sent (per the STOMP docs/spec),

and the connection should be closed through Netty failures, which should happen according to the TCP settings.

Also, @jscheid, the TTL checker uses a Thread instead of the scheduled executors, which won't scale to many connections. That's one thing that needs to be changed.

@jscheid

jscheid commented Oct 21, 2015

@clebertsuconic where does it say that in the STOMP spec?

@clebertsuconic

@jscheid I don't know ... @chirino told me :)

@jscheid

jscheid commented Oct 21, 2015

@themerius after discussing with @clebertsuconic on IRC, it turns out that Artemis has two separate mechanisms for terminating an idle STOMP connection:

  • based on connectionTtl (or its default value)
  • based on heart-beat header

It seems to me that the correct fix is to disable the former for the STOMP protocol. Then, without a heart-beat header, you should get infinite connection life.
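
To make the heart-beat-based mechanism concrete, here is a small sketch of the client-to-server negotiation rule from the STOMP 1.1 specification: the client's CONNECT frame carries heart-beat:<cx>,<cy>, the server answers with <sx>,<sy>, and the client is expected to send something every max(cx, sy) milliseconds unless cx or sy is 0. A missing header is treated like 0,0, which is the case for a STOMP 1.0 client. The class and method names are hypothetical, and this is not how Artemis implements it.

```java
import java.util.OptionalLong;

/** Client-to-server heart-beat negotiation as described in the STOMP 1.1 spec. */
final class StompHeartBeat {
    /**
     * @param heartBeatHeader value of the client's heart-beat header ("cx,cy"),
     *                        or null when the CONNECT frame has no such header
     * @param serverWantsMs   the server's "sy" value: how often it would like to
     *                        receive frames, 0 meaning it does not care
     * @return the interval in ms at which the server may expect client frames,
     *         or empty when no heart-beating applies (no idle timeout from this mechanism)
     */
    static OptionalLong expectedClientIntervalMs(String heartBeatHeader, long serverWantsMs) {
        if (heartBeatHeader == null) {
            // No header: same as "0,0", so the client promises nothing.
            return OptionalLong.empty();
        }
        String[] parts = heartBeatHeader.trim().split(",");
        if (parts.length != 2) {
            throw new IllegalArgumentException("expected \"cx,cy\" but got: " + heartBeatHeader);
        }
        long clientCanSendMs = Long.parseLong(parts[0].trim());
        if (clientCanSendMs == 0 || serverWantsMs == 0) {
            return OptionalLong.empty();
        }
        return OptionalLong.of(Math.max(clientCanSendMs, serverWantsMs));
    }

    public static void main(String[] args) {
        System.out.println(expectedClientIntervalMs(null, 5_000));          // STOMP 1.0 style client
        System.out.println(expectedClientIntervalMs("0,0", 5_000));         // client opts out
        System.out.println(expectedClientIntervalMs("10000,10000", 5_000)); // both sides participate
    }
}
```

With the fix proposed above, an empty result would mean the broker never evicts the connection for idleness, leaving only TCP-level failures to close it.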

@themerius
Copy link
Author

@jscheid So because this library speaks STOMP 1.0 and sends no heart-beat header, Artemis should give it an infinite connection life? Or does this have to be fixed first?

@jscheid

jscheid commented Oct 23, 2015

@themerius should work once apache/activemq-artemis#208 is merged.

@clebertsuconic

Artemis should give an infinite connection life, according to Hiram

@clebertsuconic

... and his spec
