Xandra crashes when a newly added node sends :up message #371

Open
harunzengin opened this issue Oct 28, 2024 · 3 comments
@harunzengin
Contributor

Recently, we added a new node to our Cassandra cluster, which caused some issues in Xandra. This is the crash we saw:

** (stop) exited in: :gen_statem.call(#PID<0.152339.0>, :checkout, :infinity)
** (EXIT) an exception was raised:
    ** (FunctionClauseError) no function clause matching in anonymous fn/1 in Xandra.Cluster.Pool.handle_event/4
        (xandra 0.19.0) lib/xandra/cluster/pool.ex:245: anonymous fn(nil) in Xandra.Cluster.Pool.handle_event/4
        (elixir 1.16.3) lib/map.ex:957: Map.get_and_update/3
        (elixir 1.16.3) lib/map.ex:999: Map.get_and_update!/3
        (xandra 0.19.0) lib/xandra/cluster/pool.ex:245: Xandra.Cluster.Pool.handle_event/4
        (stdlib 5.2) gen_statem.erl:1397: :gen_statem.loop_state_callback/11
        (stdlib 5.2) proc_lib.erl:241: :proc_lib.init_p_do_apply/3
(stdlib 5.2) gen.erl:246: :gen.do_call/4
(stdlib 5.2) gen_statem.erl:923: :gen_statem.call/3
(xandra 0.19.0) lib/xandra/cluster.ex:548: Xandra.Cluster.with_conn_and_retrying/3

Looking closer, this is where we get the error: https://github.com/whatyouhide/xandra/blob/main/lib/xandra/cluster/pool.ex#L245

Since we get a HOST_UP event for a host that we don't know about, Xandra crashes. Adding nodes to the Cassandra cluster shouldn't cause crashes, though.
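
The failure mode in the trace can be reproduced without Xandra. This is a minimal sketch, assuming the pool keeps its peers in a map keyed by `{address, port}` and updates them with a function that only has clauses for maps. The `PeersDemo` module and the shape of `data` are illustrative and not copied from `pool.ex`.

```elixir
defmodule PeersDemo do
  # Hypothetical stand-in for the pool state: peers keyed by {address, port}.
  def mark_up(data, key) do
    # With the data.peers[key] path, get_and_update_in/2 goes through
    # Map.get_and_update!/3 for :peers and Map.get_and_update/3 for the key.
    # When the key is missing, the function is called with nil.
    get_and_update_in(data.peers[key], fn
      %{status: :down} = peer -> {peer, %{peer | status: :up}}
      %{} = peer -> {peer, peer}
    end)
  end
end

data = %{peers: %{{{127, 0, 0, 1}, 9042} => %{status: :down}}}

# Known peer: the status is flipped to :up.
{_old, updated} = PeersDemo.mark_up(data, {{127, 0, 0, 1}, 9042})
IO.inspect(updated.peers)

# Unknown peer: no clause matches nil, so a FunctionClauseError is raised.
result =
  try do
    PeersDemo.mark_up(data, {{127, 0, 0, 2}, 9042})
  rescue
    e in FunctionClauseError -> {:crashed, e.__struct__}
  end

IO.inspect(result)
```

The second call fails the same way as the trace above: the anonymous function is invoked with `nil` from inside `Map.get_and_update/3`.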

@whatyouhide
Owner

@harunzengin yep totally. Wanna work on fixing this? I won't have bandwidth for a while 😢

@harunzengin harunzengin changed the title Xandra crashes when a newly added node gets an :up message Xandra crashes when a newly added node gets sends :up message Oct 29, 2024
@harunzengin harunzengin changed the title Xandra crashes when a newly added node gets sends :up message Xandra crashes when a newly added node sends :up message Oct 29, 2024
@harunzengin
Contributor Author

Looking closer at this, it looks more like a failure in Cassandra's gossip than in Xandra. We added nodes before and didn't have any problems like this.

When a new host is added, we are supposed to get a NEW_NODE event in Xandra.Cluster.ControlConnection and consequently refresh the topology. But in this case, it seems like there was an error in Cassandra, so we didn't get that event, and somehow got a HOST_UP event instead, which made Xandra crash. Not sure whether we should actually be fixing this or let it crash? @whatyouhide

@whatyouhide
Owner

@harunzengin the C* gossip protocol seems quite messy with these kinds of events. It seems to happen really often that we don't get events we're supposed to get. I think a safer course of action is to refresh the topology if we get a HOST_UP for a host we've never seen before, effectively treating it like a NEW_NODE event?
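
A rough sketch of that idea, under stated assumptions: the state is simplified to a plain map of peers keyed by `{address, port}`, and the handler returns a list of actions instead of talking to the control connection. `UnknownHostUpSketch`, `handle_host_up/2` and the `:refresh_topology` action are hypothetical names, not part of Xandra's API.

```elixir
defmodule UnknownHostUpSketch do
  # Hypothetical, simplified state: peers keyed by {address, port}.
  # Returns {updated_data, actions}; the caller would act on the actions,
  # for example by asking the control connection to refresh the topology.
  def handle_host_up(data, key) do
    case Map.fetch(data.peers, key) do
      {:ok, %{status: :down} = peer} ->
        # Known host that was marked down: mark it up again.
        {put_in(data.peers[key], %{peer | status: :up}), []}

      {:ok, _peer} ->
        # Known host that is not down: nothing to change.
        {data, []}

      :error ->
        # Host we have never seen: do not crash, request a topology refresh,
        # which is the reaction a NEW_NODE event would trigger.
        {data, [:refresh_topology]}
    end
  end
end

data = %{peers: %{{{10, 0, 0, 1}, 9042} => %{status: :down}}}

{known, known_actions} = UnknownHostUpSketch.handle_host_up(data, {{10, 0, 0, 1}, 9042})
{unknown, unknown_actions} = UnknownHostUpSketch.handle_host_up(data, {{10, 0, 0, 9}, 9042})

IO.inspect({known.peers, known_actions})
IO.inspect({unknown == data, unknown_actions})
```

Using `Map.fetch/2` makes the missing-key case an explicit branch, so an unexpected HOST_UP leaves the state untouched and only schedules the refresh.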
