-
-
Notifications
You must be signed in to change notification settings - Fork 54
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix IPv6 support in Xandra.Cluster
#328
Changes from 17 commits
f5bb426
72c2a3d
66de9b7
8b7b5ff
744f0cf
d5dd7e3
02ffeb6
eb2eb9c
65df217
403abda
0cbddaf
bb3c925
da011a9
c563d73
ce0e7f8
0299f04
6b1a5ac
8340b4d
e3259e1
7b5ca9d
5545567
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -29,7 +29,8 @@ defmodule Xandra.Cluster.ControlConnection do | |
contact_node: [type: {:tuple, [{:or, [:string, :any]}, :non_neg_integer]}, required: true], | ||
connection_options: [type: :keyword_list, required: true], | ||
autodiscovered_nodes_port: [type: :non_neg_integer, required: true], | ||
refresh_topology_interval: [type: :timeout, required: true] | ||
refresh_topology_interval: [type: :timeout, required: true], | ||
use_rpc_address_for_peer_address: [type: :boolean, required: true] | ||
] | ||
|
||
defstruct [ | ||
|
@@ -52,6 +53,10 @@ defmodule Xandra.Cluster.ControlConnection do | |
# The interval at which to refresh the cluster topology. | ||
:refresh_topology_interval, | ||
|
||
# In the system.peers table use the `rpc_address` column for the | ||
# peer/Host address and not the `peer` column | ||
:use_rpc_address_for_peer_address, | ||
|
||
# The protocol module of the node we're connected to. | ||
:protocol_module, | ||
|
||
|
@@ -83,6 +88,8 @@ defmodule Xandra.Cluster.ControlConnection do | |
contact_node: contact_node, | ||
autodiscovered_nodes_port: Keyword.fetch!(options, :autodiscovered_nodes_port), | ||
refresh_topology_interval: Keyword.fetch!(options, :refresh_topology_interval), | ||
use_rpc_address_for_peer_address: | ||
Keyword.fetch!(options, :use_rpc_address_for_peer_address), | ||
connection_options: connection_opts, | ||
transport: transport | ||
} | ||
|
@@ -144,7 +151,8 @@ defmodule Xandra.Cluster.ControlConnection do | |
state.autodiscovered_nodes_port, | ||
state.protocol_module, | ||
state.ip, | ||
state.port | ||
state.port, | ||
state.use_rpc_address_for_peer_address | ||
), | ||
:ok <- Transport.setopts(state.transport, active: :once) do | ||
state = refresh_topology(state, peers) | ||
|
@@ -213,7 +221,8 @@ defmodule Xandra.Cluster.ControlConnection do | |
state.autodiscovered_nodes_port, | ||
state.protocol_module, | ||
state.ip, | ||
state.port | ||
state.port, | ||
state.use_rpc_address_for_peer_address | ||
), | ||
:ok <- register_to_events(state), | ||
:ok <- Transport.setopts(state.transport, active: :once) do | ||
|
@@ -351,21 +360,23 @@ defmodule Xandra.Cluster.ControlConnection do | |
:inet.port_number(), | ||
module(), | ||
:inet.ip_address(), | ||
:inet.port_number() | ||
:inet.port_number(), | ||
boolean() | ||
) :: | ||
{:ok, [Host.t()]} | {:error, :closed | :inet.posix()} | ||
def fetch_cluster_topology( | ||
%Transport{} = transport, | ||
autodiscovered_nodes_port, | ||
protocol_module, | ||
ip, | ||
port | ||
port, | ||
use_rpc_address | ||
) | ||
when is_integer(autodiscovered_nodes_port) and is_atom(protocol_module) do | ||
with {:ok, [local_node_info]} <- query(transport, protocol_module, @select_local_query), | ||
{:ok, peers} <- query(transport, protocol_module, @select_peers_query) do | ||
local_peer = %Host{ | ||
queried_peer_to_host(local_node_info) | ||
queried_peer_to_host(local_node_info, use_rpc_address) | ||
| address: ip, | ||
port: port | ||
} | ||
|
@@ -376,7 +387,10 @@ defmodule Xandra.Cluster.ControlConnection do | |
# https://user.cassandra.apache.narkive.com/APRtj5hb/system-peers-and-decommissioned-nodes. | ||
peers = | ||
for peer_attrs <- peers, | ||
peer = %Host{queried_peer_to_host(peer_attrs) | port: autodiscovered_nodes_port}, | ||
peer = %Host{ | ||
queried_peer_to_host(peer_attrs, use_rpc_address) | ||
| port: autodiscovered_nodes_port | ||
}, | ||
not is_nil(peer.host_id), | ||
do: peer | ||
|
||
|
@@ -412,13 +426,67 @@ defmodule Xandra.Cluster.ControlConnection do | |
end | ||
end | ||
|
||
defp queried_peer_to_host(%{"peer" => _} = peer_attrs) do | ||
defp queried_peer_to_host(%{"rpc_address" => rpc_address} = peer_attrs, true = use_rpc_address) | ||
when is_tuple(rpc_address) do | ||
{address, peer_attrs} = Map.pop!(peer_attrs, "rpc_address") | ||
peer_attrs = Map.delete(peer_attrs, "peer") | ||
peer_attrs = Map.put(peer_attrs, "address", address) | ||
queried_peer_to_host(peer_attrs, use_rpc_address) | ||
end | ||
|
||
defp queried_peer_to_host(%{"rpc_address" => _} = peer_attrs, true = use_rpc_address) do | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. When is it that we get There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. When the peers table is queried, it was using the string. For example this was one of the logs (when address is ipv6 this was another issue because the validation for ipv6 strings fails in xandra)
Now the log looks like
There is still another place where strings are being used for the original host for the control connection if I'm not mistaken but it broke a lot of tests trying to change that out. That might be good for a different PR to fix. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is the other location where string is in use https://github.com/Shimmur/xandra/blob/c563d737b99bcdec195a38408db490fec046a5e8/lib/xandra/cluster/pool.ex#L469 and I think might be good to convert this to a tuple as well. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
I’m confused. If you look at the schema for the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm also adding the validate_node options. There was only one previously and it took a string https://github.com/lexhide/xandra/blob/main/lib/xandra/options_validators.ex#L51 There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I don't know about in this particular case, but we were seeing IPv6 addresses getting returned as strings. This was then causing DNS lookups which failed. @jacquelineIO can confirm if this is the reason or not. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @whatyouhide you are right about the query returning the inet format. host = %Xandra.Cluster.Host{address: {10, 4, 0, 12}, port: 1234}
Xandra.Cluster.Host.format_address(host)
"10.4.0.12:1234" so When So now I'm confused too. I do not completely follow the flow here... I assumed this was pulling a string and wanted to make sure it was only using a tuple. BUT then also validate node started failing (if my memory is correct) so I had to update that too. So there might be multiple (at least one more) places in the code that a connection is happening, the first one being I got the desired result but the original connection is still a string. I this it makes sense to not use (I also feel like I'm talking in circles) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @jacquelineIO okay yeah this code I have in Xandra is a little bit messy. I'll do another pass and fix this up later on then, so that we can standardize on having a single way of representing addresses. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @whatyouhide that went in already 8340b4d There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Ah cool awesome! Will review this soon then |
||
{address, peer_attrs} = Map.pop!(peer_attrs, "rpc_address") | ||
peer_attrs = Map.delete(peer_attrs, "peer") | ||
|
||
peer_attrs = | ||
case :inet.parse_address(address) do | ||
{:ok, valid_address_tuple} -> | ||
Map.put(peer_attrs, "address", valid_address_tuple) | ||
|
||
error -> | ||
Logger.error( | ||
"queried_peer_to_host: error converting address (#{inspect(address)}) to tuple, error: #{inspect(error)}" | ||
) | ||
|
||
# failed to parse, however, use what was returned in the table, see if | ||
# node_validation will validate it | ||
Map.put(peer_attrs, "address", address) | ||
end | ||
|
||
queried_peer_to_host(peer_attrs, use_rpc_address) | ||
end | ||
|
||
defp queried_peer_to_host(%{"peer" => peer} = peer_attrs, use_rpc_address) | ||
when is_tuple(peer) do | ||
{address, peer_attrs} = Map.pop!(peer_attrs, "peer") | ||
peer_attrs = Map.delete(peer_attrs, "rpc_address") | ||
peer_attrs = Map.put(peer_attrs, "address", address) | ||
queried_peer_to_host(peer_attrs) | ||
queried_peer_to_host(peer_attrs, use_rpc_address) | ||
end | ||
|
||
defp queried_peer_to_host(%{"peer" => _} = peer_attrs, use_rpc_address) do | ||
{address, peer_attrs} = Map.pop!(peer_attrs, "peer") | ||
peer_attrs = Map.delete(peer_attrs, "rpc_address") | ||
|
||
peer_attrs = | ||
case :inet.parse_address(address) do | ||
{:ok, valid_address_tuple} -> | ||
Map.put(peer_attrs, "address", valid_address_tuple) | ||
|
||
error -> | ||
Logger.error( | ||
"queried_peer_to_host: error converting address (#{inspect(address)}) to tuple, error: #{inspect(error)}" | ||
) | ||
|
||
# failed to parse, however, use what was returned in the table, see if | ||
# node_validation will validate it | ||
Map.put(peer_attrs, "address", address) | ||
end | ||
|
||
queried_peer_to_host(peer_attrs, use_rpc_address) | ||
end | ||
|
||
defp queried_peer_to_host(%{} = peer_attrs) do | ||
defp queried_peer_to_host(%{} = peer_attrs, _) do | ||
columns = [ | ||
"address", | ||
"data_center", | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What are the cases when we'd want to use
peer
instead ofrpc_address
?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
peer
was the original implementation. I don't know why you would usepeer
instead ofrpc_address
and not much information when I google this, so going to phone a friend. @relistan do you know or think this should just solely userpc_address
and no longer usepeer
? (I think that is what @whatyouhide is asking).There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes that's what I’m saying. @jacquelineIO I also recommend doing just what any other C* driver maintained by Datastax is doing. I always reference the Python driver. I don't have much time to look into this until maybe Sunday, so if you want you can just take a look there and use what they're using 😉
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like it is using
rpc_address
with a fallback topeer
ifrpc_address
isn't availableSo I can remove this flag and have a preference for using
rpc_address
, which the code is basically already set up to doThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, let's do that so that we don't have to put this decision on users 🙃 Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, catching up here. That sounds right to me, then.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jacquelineIO any news on this and timeline on when I can expect these changes to go in? I'm planning the 0.18.0 release, that's why I ask 🙃