Kresten Krab Thorup edited this page Jan 5, 2015 · 1 revision

Keyspace Sync

Assumptions:

  1. For all buckets b, N_C1(b) == N_C2(b). That is, all buckets have the same N-value across the two clusters to be synced.
  2. A new configuration option in the riak_repl section, aae_sync_ringsizes, lists the ring sizes for which tags will be generated, and thus the ring sizes with which repl_aae will be able to fullsync.
  3. The aae_sync_ringsizes config option should not include sizes that are smaller than the actual ring size (any such sizes produce a warning at startup and are ignored).
  4. The actual ring size of a cluster is always included in aae_sync_ringsizes even if it is not listed, so if the config option is missing, its default value is exactly [<ringsize>].
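
Assumptions 2 to 4 can be sketched as a small normalization step. This is only an illustration in Python, not the riak_repl implementation; the function name and its arguments are hypothetical.

```python
import warnings


def effective_sync_ring_sizes(actual_ring_size, configured=None):
    """Return the sorted ring sizes a cluster would generate tags for.

    Hypothetical helper: `configured` stands for the aae_sync_ringsizes
    option, or None when the option is missing.
    """
    # Assumption 4: the actual ring size is always included.
    sizes = {actual_ring_size}
    for size in configured or []:
        if size < actual_ring_size:
            # Assumption 3: smaller sizes are warned about and ignored.
            warnings.warn(
                f"ignoring aae_sync_ringsizes entry {size}: "
                f"smaller than actual ring size {actual_ring_size}"
            )
            continue
        sizes.add(size)
    return sorted(sizes)
```

With a ring size of 64 and no option set, this returns [64]; with the option set to [32, 128] it warns about 32 and returns [64, 128].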

For two clusters to be able to sync, they must have at least one value in common in their aae_sync_ringsizes sets.
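
A minimal sketch of that compatibility check, in Python for illustration only. The function names are hypothetical, and the inputs are assumed to be each cluster's effective aae_sync_ringsizes set (including its actual ring size).

```python
def common_sync_ring_sizes(sizes_a, sizes_b):
    """Ring sizes that both clusters generate tags for, in ascending order."""
    return sorted(set(sizes_a) & set(sizes_b))


def can_sync(sizes_a, sizes_b):
    """Two clusters can sync only if they share at least one ring size."""
    return bool(common_sync_ring_sizes(sizes_a, sizes_b))
```

For example, clusters configured with [64, 128] and [128, 256] share only 128, while [64] and [128] share nothing and so could not sync.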

Assumption 1 above (being only Q-independent, not N-independent) is undesirable, but avoiding it would require all nodes to be up during fullsync, since sync would then effectively have to happen only between "responsible_vnodes" (the first vnode in each preflist).

It is still an open question whether we should sync separately for each replica, or just some replica across clusters. For now we will sync all replicas.

Notes

  • A keyspace KS is identified by a pair (VP/VQ): a virtual partition # (VP) and a virtual Q (VQ). VQ is the smallest value in the union of the two aae_sync_ringsizes. These values need to be exchanged during the sync handshake, which requires a new protocol.
  • A concrete replicaset is further identified with an R/N value (replica # R of N), where R is the position in the preflist.
  • Thus, an exact "keyspace replica" is identified using ({VP,VQ}, {R,N}).
  • Given a real ring size Q, a VQ, and an NSet, we can compute a mapping from vnode id to such concrete keyspace replicas.
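
One way such a mapping could look is sketched below in Python. This is not the utility code mentioned under Status, and it rests on several assumptions that the notes above do not state: vnodes are numbered by partition index 0..Q-1 (not by Riak's 160-bit ring index), VQ is a multiple of Q so that each real partition splits into VQ/Q consecutive virtual partitions, R is 1-based, the vnode at preflist position R sits R-1 partitions after the first one, and NSet is the set of N-values in use. The function name is hypothetical.

```python
def keyspace_replicas(vnode, q, vq, n_set):
    """List the keyspace replicas ((VP, VQ), (R, N)) held by one vnode.

    Hypothetical sketch: `vnode` is a partition index in 0..q-1 and
    `n_set` is the set of N-values in use.
    """
    if not 0 <= vnode < q:
        raise ValueError("vnode must be a partition index in 0..q-1")
    if vq % q != 0:
        raise ValueError("VQ is assumed to be a multiple of Q")
    per_partition = vq // q  # virtual partitions per real partition
    replicas = []
    for n in sorted(n_set):
        for r in range(1, n + 1):
            # Assumed layout: this vnode is replica R for the real
            # partition that lies R-1 steps before it on the ring.
            first = (vnode - (r - 1)) % q
            start = first * per_partition
            for vp in range(start, start + per_partition):
                replicas.append(((vp, vq), (r, n)))
    return replicas
```

Under these assumptions, with Q = 4, VQ = 8 and NSet = {2}, vnode 0 holds replica 1 of 2 for virtual partitions 0 and 1, and replica 2 of 2 for virtual partitions 6 and 7.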

Status

  • Tagging hash trees is implemented.
  • Utility code to determine the (keyspace x vnode) mapping is implemented.

Todo:

  • Integration into riak_kv and riak_repl
  • New keyspace scheduler
  • Tests