You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
This peace of the code creates a split in DCAwareRoundRobinPolicy.distance behavior for case when used_hosts_per_remote_dc is not empty:
if node was added to the policy and remote dc bucket does not contain more than used_hosts_per_remote_dc it will return HostDistance.REMOTE
Otherwise it will return HostDistance.IGNORE
Which means that if code uses .distance before on_add it will yield wrong result if this node fits into used_hosts_per_remote_dc.
Now you should always consider it when using lbp.distance.
Just one exampe that I managed to eyeball is in Cluster.on_add:
As I understand it was done to signal upper logic to connect only to a certain number of remote dc nodes.
My proposal, instead of going around the code and fixing distance usage to make sure that everywhere it is used to decide whether node is worth of connecting to or not, to delegate this functionality to another method on lbp like should_connect(host):bool and use it there for that purpose, removing self._dc_live_hosts from distance calculation.
Hmm... by looking at the code it looks like you are right.
But if that were the case, then the driver would never connect to nodes outside the current DC, right?
I don't think that's the case, so maybe we are somehow misunderstanding the code - or it is the case and somehow no one noticed and no test checked this?
We should check if that really is the case or if we misunderstood something before trying to fix it.
If we were to fix this: I'd be strongly against other method. There should be only one (or maybe a few, but a low number) paths were nodes are added to policy. Those should be fixed (which will improve the driver and possibly uncover other bugs by reading code) instead of complicating code.
Code under question is in the
DCAwareRoundRobinPolicy.distance
:python-driver/cassandra/policies.py
Lines 257 to 264 in 1c3cff8
This peace of the code creates a split in
DCAwareRoundRobinPolicy.distance
behavior for case whenused_hosts_per_remote_dc
is not empty:used_hosts_per_remote_dc
it will returnHostDistance.REMOTE
HostDistance.IGNORE
Which means that if code uses
.distance
beforeon_add
it will yield wrong result if this node fits intoused_hosts_per_remote_dc
.Now you should always consider it when using
lbp.distance
.Just one exampe that I managed to eyeball is in
Cluster.on_add
:python-driver/cassandra/cluster.py
Lines 2067 to 2089 in 1c3cff8
So it won't finalize properly for node from remote dc and won't run
self._prepare_all_queries(host)
for it.Just becase it is doing:
Instead of
The text was updated successfully, but these errors were encountered: