A test harness for Percona Server with Group Replication, verifying isolation levels up to strict serializable.
For example, try:
lein run test --time-limit 300 --isolation repeatable-read --expected-consistency-model repeatable-read --rate 1000 --concurrency 3n --select-for share --nemesis partition,recover
-w list-append runs an Elle list-append workload, where each list is stored
in a row like (primary-key, secondary-key, value). Primary and secondary key
are identical; access by secondary key tests index consistency. Values are
strings, and appending elements to them is done with SQL CONCAT. See
jepsen.percona-gr.list-append for details.
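As a rough illustration of the row layout described above, here is a minimal in-memory sketch in Python. It is not the harness's code: the comma separator, the `append` and `read` helpers, and the dictionary standing in for a table are all assumptions made for the example. The real encoding lives in jepsen.percona-gr.list-append.

```python
# Hypothetical in-memory model of a list-append table; not the harness's code.
SEP = ","  # assumed separator; the real encoding may differ


def append(table, key, element):
    """Mimic appending to a row's string value, as SQL CONCAT would."""
    # Primary and secondary key are identical, so a new row is (key, key, "").
    pk, sk, value = table.get(key, (key, key, ""))
    new_value = f"{value}{SEP}{element}" if value else str(element)
    table[key] = (pk, sk, new_value)


def read(table, key):
    """Decode the stored string back into a list of integers."""
    row = table.get(key)
    if row is None or row[2] == "":
        return []
    return [int(x) for x in row[2].split(SEP)]


table = {}
for element in (1, 2, 3):
    append(table, 5, element)

print(table[5])        # (5, 5, '1,2,3')
print(read(table, 5))  # [1, 2, 3]
```

Because every append extends the stored string, a read reveals the order in which appends took effect, which is what makes list-append inference stronger than a plain register.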
-w rw-register is a read-write register. In most respects it's just like
list-append, but its value is an integer which is overwritten by blind writes,
rather than appended to using string concat. Inference here is weaker than
with list-append, but it may expose different behavior.
The usual pause, kill, partition, and clock faults introduce process pauses,
crashes, network partitions, and clock skew, respectively. recover
periodically checks to see if the cluster has fallen over, and tries to recover
it by restarting group replication from scratch. You'll probably have to mix
recover with most fault modes, because GR is fragile. Recovery is also
incredibly slow: 300+ seconds is common. Make sure to choose a time limit long
enough to cover it.
See lein run test --help for full options.
abort-probability is the probability that we choose to spuriously abort a
transaction while executing it. We do this to detect aborted reads.
--concurrency 3n means "run three worker threads for each node".
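To make the arithmetic concrete, here is a small Python sketch of how a value like 3n could be turned into a worker count. The `parse_concurrency` function is hypothetical and written only for this example; treating a plain integer as an absolute thread count is an assumption about the CLI rather than something stated here.

```python
def parse_concurrency(spec: str, node_count: int) -> int:
    """Turn a concurrency spec such as '3n' into a number of worker threads."""
    spec = spec.strip()
    if spec.endswith("n"):
        # A trailing 'n' multiplies the number by the node count.
        return int(spec[:-1]) * node_count
    # Assumption: a bare integer is an absolute thread count.
    return int(spec)


print(parse_concurrency("3n", 5))  # 15
print(parse_concurrency("10", 5))  # 10
```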
There are two strategies for writing values. --[no-]on-dup-key enables or
disables writing using INSERT ... ON DUPLICATE KEY. --[no-]update-insert
enables or disables writing using an UPDATE first, falling back to an INSERT
if that fails, and if THAT fails, trying an UPDATE again.
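The update-insert fallback can be sketched as follows. This is an illustration only: it uses Python's built-in sqlite3 module as a stand-in for Percona, the `registers` table and the `update_insert` function are hypothetical names, and it assumes that an UPDATE "fails" when it matches zero rows and that an INSERT "fails" with a duplicate-key error.

```python
import sqlite3


def update_insert(conn, key, value):
    """Try UPDATE, fall back to INSERT, then fall back to UPDATE once more."""
    cur = conn.execute(
        "UPDATE registers SET value = ? WHERE id = ?", (value, key))
    if cur.rowcount > 0:
        return "update"
    try:
        # No row matched, so try to create it.
        conn.execute(
            "INSERT INTO registers (id, value) VALUES (?, ?)", (key, value))
        return "insert"
    except sqlite3.IntegrityError:
        # Someone else created the row between our UPDATE and INSERT.
        conn.execute(
            "UPDATE registers SET value = ? WHERE id = ?", (value, key))
        return "update-retry"


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE registers (id INTEGER PRIMARY KEY, value INTEGER)")

first = update_insert(conn, 1, 10)   # row does not exist yet
second = update_insert(conn, 1, 20)  # row exists now
stored = conn.execute(
    "SELECT value FROM registers WHERE id = 1").fetchone()[0]
print(first, second, stored)  # insert update 20
```

The final UPDATE matters only under concurrency: it covers the window in which another transaction inserts the row after our first UPDATE found nothing.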
--expected-consistency-model controls the level of consistency we check for.
By default this is strict-serializable, but you could also choose any Elle
consistency model, like strong-session-snapshot-isolation or read-committed.
--isolation controls the isolation level we request from Percona during
transactions. This should be read-uncommitted, read-committed,
repeatable-read, or serializable.
--innodb-flush-method METHOD allows changing how InnoDB flushes data to disk.
--lazyfs mounts the Percona data directory in a lazyfs filesystem, and changes
the kill nemesis fault so that it also loses un-fsynced writes.
--max-txn-length LEN and --max-writes-per-key COUNT control, respectively, the
upper bound on micro-ops per transaction and how many writes we attempt on a
given key before choosing a new key.
--nemesis FAULTS takes a comma-separated list of faults to inject, like
kill,partition,recover. --nemesis-interval SECONDS controls roughly how
long it takes between nemesis operations (in a given fault class).
--no-db skips database teardown and setup. This will almost certainly produce
invalid results, such as false positives. However, Percona is so slow to set up
that this option is helpful for debugging workloads.
--[no-]predicate-reads enables or disables accessing rows by secondary key,
rather than primary key.
--table-count NUM controls the number of tables we split rows across.
--rate HZ is the upper bound on how many ops/sec Jepsen tries to perform.
--recover-interval is how often the recovery nemesis checks to see if it needs
to recover the cluster.
--recovery-time controls how long we wait for the cluster to heal at the
end of the test.
--select-for MODE, if set, uses SELECT ... FOR UPDATE or FOR SHARE on
selects. share or update applies that lock mode to every select operation.
share+update does a FOR UPDATE on rows that the transaction will write
later, and FOR SHARE otherwise.
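The three modes can be sketched in Python as below. This is an illustrative model rather than the harness's implementation: `lock_clause` and `select_clauses` are hypothetical helpers, and a transaction is represented as a list of (function, key) pairs where "r" is a read and "w" is a write.

```python
def lock_clause(mode, key, keys_written_later):
    """Pick the locking clause for a select on `key` under a --select-for mode."""
    if mode is None:
        return ""  # option not set: plain SELECT
    if mode == "share":
        return "FOR SHARE"
    if mode == "update":
        return "FOR UPDATE"
    if mode == "share+update":
        # Lock exclusively only if this transaction writes the key later.
        return "FOR UPDATE" if key in keys_written_later else "FOR SHARE"
    raise ValueError(f"unknown select-for mode: {mode!r}")


def select_clauses(mode, txn):
    """Return (key, clause) for every read in a transaction of (f, key) pairs."""
    clauses = []
    for i, (f, key) in enumerate(txn):
        if f != "r":
            continue
        later = {k for (g, k) in txn[i + 1:] if g == "w"}
        clauses.append((key, lock_clause(mode, key, later)))
    return clauses


txn = [("r", 1), ("r", 2), ("w", 2)]
print(select_clauses("share+update", txn))
# [(1, 'FOR SHARE'), (2, 'FOR UPDATE')]
```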
--single-node runs just one Percona node, and skips all the group replication
setup.
--time-limit SECONDS controls how long the test runs for.
--inter-mop-delay MS introduces randomized delays (exponentially distributed)
between each micro-op in a transaction. Helpful for increasing the window
during which we can observe isolation violations.
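For intuition, exponentially distributed delays between micro-ops could be drawn as in the Python sketch below. Treating MS as the mean of the distribution is an assumption made for this example, and `inter_mop_delays` is a hypothetical helper, not part of the harness.

```python
import random


def inter_mop_delays(mean_ms, mop_count, rng):
    """Draw one exponentially distributed delay (in ms) per gap between micro-ops."""
    if mean_ms <= 0 or mop_count < 2:
        return []
    # expovariate takes the rate (1 / mean), not the mean itself.
    return [rng.expovariate(1.0 / mean_ms) for _ in range(mop_count - 1)]


rng = random.Random(42)  # seeded so runs are repeatable
delays = inter_mop_delays(mean_ms=10, mop_count=4, rng=rng)
print(len(delays))  # 3
```

Most draws are shorter than the mean, with an occasional much longer pause, so transactions stay open for varying lengths of time.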
--workload NAME chooses which workload to run.
Copyright © 2022 Jepsen, LLC
This program and the accompanying materials are made available under the terms of the Eclipse Public License 2.0 which is available at http://www.eclipse.org/legal/epl-2.0.
This Source Code may also be made available under the following Secondary Licenses when the conditions for such availability set forth in the Eclipse Public License, v. 2.0 are satisfied: GNU General Public License as published by the Free Software Foundation, either version 2 of the License, or (at your option) any later version, with the GNU Classpath Exception which is available at https://www.gnu.org/software/classpath/license.html.