-
Notifications
You must be signed in to change notification settings - Fork 42
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Connection pool renewal after concurrent node bootstraps causes double statement execution #317
Comments
This can be easily worked around in tests, so it's low priority. |
Due to Python driver's unexpected behavior, "CREATE KEYSPACE" statement may sometimes get executed twice (scylladb/python-driver#317), leading to "Keyspace ... already exists" error in our tests (scylladb#17654). Work around this by using "IF NOT EXISTS". Fixes: scylladb#17654
Due to Python driver's unexpected behavior, "CREATE KEYSPACE" statement may sometimes get executed twice (scylladb/python-driver#317), leading to "Keyspace ... already exists" error in our tests (#17654). Work around this by using "IF NOT EXISTS". Fixes: #17654 Closes #18368
Due to Python driver's unexpected behavior, "CREATE KEYSPACE" statement may sometimes get executed twice (scylladb/python-driver#317), leading to "Keyspace ... already exists" error in our tests (scylladb#17654). Work around this by using "IF NOT EXISTS". Fixes: scylladb#17654 Closes scylladb#18368
I managed to reproduce the issue, but what's interesting it doesn't reproduce when using consistent-topology-changes feature. @kbr-scylla do you have any idea why? I am not sure if this is relevant. |
Huh? The reproducer I posted above:
runs in consistent-topology-changes mode. In fact it explicitly depends on consistent-topology-changes, because it bootstraps 2 nodes concurrently (it uses How did you reproduce the issue outside consistent-topology-changes -- did you use some different test case? Are you sure you reproduced this exact issue and not something different (maybe similar)? |
Maybe I misunderstood something, I used exactly what is in your fork, but after I added |
When was that? So I suspect that your change didn't actually change anything, but then you were unlucky with your runs and failed to reproduce it, used a too small sample, and then concluded that the change was the reason. |
Hm, actually 'experimental_features': ['udf',
'alternator-streams',
'consistent-topology-changes',
'broadcast-tables',
'keyspace-storage-options'], and you reduced it to simply But neither of these other features seems likely to be related, so I would again recheck with higher sample of runs (Preferably -- just rebase my reproducer branch on latest master, where consistent-topology-changes is no longer experimental, and check there) |
Ping on this one, the connected issue scylladb/scylladb#16219 is impacting our ability to run sct tests. |
@kostja , modifying python driver will not help with java driver failures. |
We though that this issue is a low priority one so it was not planned for this sprint, should we reconsider that? |
We want to only update the pool if previous do not exist or is shutdown. This commit adds additional validation to add_or_renew_pool to make sure this condition is met. Fixes: scylladb#317
We want to only update the pool if previous do not exist or is shutdown. This commit adds additional validation to add_or_renew_pool to make sure this condition is met. Fixes: scylladb#317
We want to only update the pool if previous do not exist or is shutdown. This commit adds additional validation to add_or_renew_pool to make sure this condition is met. Fixes: scylladb#317
@kbr-scylla I have submitted the PR with one possible solution #380 |
We want to only update the pool if previous do not exist or is shutdown. This commit adds additional validation to add_or_renew_pool to make sure this condition is met. Fixes: scylladb#317
In some cases We want to only update the pool if previous do not exist or is shutdown. This commit adds additional validation to add_or_renew_pool to make sure this condition is met when needed. Fixes: scylladb#317
In some cases We want to only update the pool if previous do not exist or is shutdown. This commit adds additional validation to add_or_renew_pool to make sure this condition is met when needed. Fixes: scylladb#317
The testcase is flaky due to a known python driver issue: scylladb/python-driver#317. This issue causes the `CREATE KEYSPACE` statement to be sometimes executed twice in a row, and the 2nd CREATE statement causes the test to fail. In order to work around it, it's enough to add `if not exists` when creating a ks. Fixes: scylladb#21034 Needs to be backported to all 6.x branches, as the PR introducing this flakiness is backported to every 6.x branch.
The testcase is flaky due to a known python driver issue: scylladb/python-driver#317. This issue causes the `CREATE KEYSPACE` statement to be sometimes executed twice in a row, and the 2nd CREATE statement causes the test to fail. In order to work around it, it's enough to add `if not exists` when creating a ks. Fixes: #21034 Needs to be backported to all 6.x branches, as the PR introducing this flakiness is backported to every 6.x branch. Closes #21056
The testcase is flaky due to a known python driver issue: scylladb/python-driver#317. This issue causes the `CREATE KEYSPACE` statement to be sometimes executed twice in a row, and the 2nd CREATE statement causes the test to fail. In order to work around it, it's enough to add `if not exists` when creating a ks. Fixes: #21034 Needs to be backported to all 6.x branches, as the PR introducing this flakiness is backported to every 6.x branch. (cherry picked from commit f847591)
The testcase is flaky due to a known python driver issue: scylladb/python-driver#317. This issue causes the `CREATE KEYSPACE` statement to be sometimes executed twice in a row, and the 2nd CREATE statement causes the test to fail. In order to work around it, it's enough to add `if not exists` when creating a ks. Fixes: #21034 Needs to be backported to all 6.x branches, as the PR introducing this flakiness is backported to every 6.x branch. (cherry picked from commit f847591)
The testcase is flaky due to a known python driver issue: scylladb/python-driver#317. This issue causes the `CREATE KEYSPACE` statement to be sometimes executed twice in a row, and the 2nd CREATE statement causes the test to fail. In order to work around it, it's enough to add `if not exists` when creating a ks. Fixes: #21034 Needs to be backported to all 6.x branches, as the PR introducing this flakiness is backported to every 6.x branch. (cherry picked from commit f847591) Closes #21107
The testcase is flaky due to a known python driver issue: scylladb/python-driver#317. This issue causes the `CREATE KEYSPACE` statement to be sometimes executed twice in a row, and the 2nd CREATE statement causes the test to fail. In order to work around it, it's enough to add `if not exists` when creating a ks. Fixes: #21034 Needs to be backported to all 6.x branches, as the PR introducing this flakiness is backported to every 6.x branch. (cherry picked from commit f847591)
The testcase is flaky due to a known python driver issue: scylladb/python-driver#317. This issue causes the `CREATE KEYSPACE` statement to be sometimes executed twice in a row, and the 2nd CREATE statement causes the test to fail. In order to work around it, it's enough to add `if not exists` when creating a ks. Fixes: #21034 Needs to be backported to all 6.x branches, as the PR introducing this flakiness is backported to every 6.x branch. (cherry picked from commit 3969ffb) Closes #21106
The testcase is flaky due to a known python driver issue: scylladb/python-driver#317. This issue causes the `CREATE KEYSPACE` statement to be sometimes executed twice in a row, and the 2nd CREATE statement causes the test to fail. In order to work around it, it's enough to add `if not exists` when creating a ks. Fixes: #21034 Needs to be backported to all 6.x branches, as the PR introducing this flakiness is backported to every 6.x branch. (cherry picked from commit 3969ffb) Closes #21134
We boot 2 Scylla nodes concurrently into existing cluster.
Python driver obtains two
on_add
notifications, one for each node.Each notification calls
add_or_renew_pool
, which creates connection pool to each node.But then, for some reason, one of the
on_add
s may cause anotheradd_or_renew_pool
to be called for the other server. This happens from_finalize_add
->update_created_pools
.This may cause a second pool to be created for the other server and the initially established pool to that server to be closed.
There could be a statement running on the initially established pool. The statement may have already been executed on Scylla side, but the driver didn't get a response yet.
The pool is closed before response arrives. This causes driver to retry the statement on the new pool, leading to double execution.
In our tests, we observe this by "CREATE KEYSPACE" statement failing with "already exists" error message (scylladb/scylladb#17654)
Reproducer:
I added a tactical sleep there:
this is just coroutinization of
create_keyspace_statement::execute
, then sleep + logging added:Test (included in the above branch):
I run it like this:
in
/home/kbr/dev/python-driver
I have the above Python driver branch checked out.Logs from example run:
scylla-10.log
scylla-9.log
scylla-3.log
topology_custom.test_double_execution.3.log
Here are the relevant excerpts cut out from the test log (those are messages I added):
What happened is that
finalize_add
for host127.58.145.10:9042
(scylla-10) causedupdate_created_pools
call, which calledadd_or_renew_pool
for host127.58.145.9:9042
(scylla-9). But pool for scylla-9 was already established. We start running "create keyspace" on scylla-9. In the meantime,add_or_renew_pool
establishes a new pool and drops the old one, causing "create keyspace" to be retried on the new pool, leading to double execution:The text was updated successfully, but these errors were encountered: