added CDC to explore section, changed tunable reads to follower reads and fixed minor issues (yugabyte#3376)
schoudhury authored Jan 17, 2020
1 parent f9adc59 commit 349b53c
Showing 30 changed files with 275 additions and 193 deletions.
4 changes: 2 additions & 2 deletions architecture/YSQL-Features-Supported.md
@@ -6,7 +6,7 @@ YSQL uses the query layer from PostgreSQL v11.2, and intends to support most Pos
## PostgreSQL Feature Support

Here are the features currently supported as of YugaByte DB v2.0, Jan 15 2020. This list also indicates what is planned for YugaByte DB v2.1 coming out around the beginning of February.
Here are the features currently supported as of YugabyteDB v2.0, Jan 15 2020. This list also indicates what is planned for YugabyteDB v2.1 coming out around the beginning of February.

- [ ] All data types
- [x] Basic types
@@ -143,4 +143,4 @@ Here are the features currently supported as of YugaByte DB v2.0, Jan 15 2020. T
- [ ] Framework to publish TPCC performance numbers with each release
- [ ] Available as rpm/deb/container/yum/brew

[![Analytics](https://yugabyte.appspot.com/UA-104956980-4/architecture/YSQL-Features-Supported.md?pixel&useReferer)](https://github.com/YugaByte/ga-beacon)
[![Analytics](https://yugabyte.appspot.com/UA-104956980-4/architecture/YSQL-Features-Supported.md?pixel&useReferer)](https://github.com/yugabyte/ga-beacon)
14 changes: 7 additions & 7 deletions architecture/benchmarks/yb-perf-v1.0.7.md
@@ -1,24 +1,24 @@
# Summary

3 node, 16 vCPUs. Each write is replicated 3 ways internally. Each key-value is around 64 bytes combined. See [setup details](https://github.com/YugaByte/yugabyte-db/blob/master/docs/yb-perf-v1.0.7.md#setup) for more info.
3 node, 16 vCPUs. Each write is replicated 3 ways internally. Each key-value is around 64 bytes combined. See [setup details](https://github.com/yugabyte/yugabyte-db/blob/master/docs/yb-perf-v1.0.7.md#setup) for more info.

* CassandraKeyValue (see [details](https://github.com/YugaByte/yugabyte-db/blob/master/docs/yb-perf-v1.0.7.md#cassandrakeyvalue))
* CassandraKeyValue (see [details](https://github.com/yugabyte/yugabyte-db/blob/master/docs/yb-perf-v1.0.7.md#cassandrakeyvalue))
* 97K writes/sec at 2.6ms (256 writers)
* 220K reads/sec at 1.2ms (256 readers)
* CassandraSecondaryIndex (see [details](https://github.com/YugaByte/yugabyte-db/blob/master/docs/yb-perf-v1.0.7.md#cassandrasecondaryindex))
* CassandraSecondaryIndex (see [details](https://github.com/yugabyte/yugabyte-db/blob/master/docs/yb-perf-v1.0.7.md#cassandrasecondaryindex))
* 5.9K writes/sec at 10.7ms (64 writers)
* 200K reads/sec at 1.3ms (256 readers)
* CassandraBatchKeyValue (see [details](https://github.com/YugaByte/yugabyte-db/blob/master/docs/yb-perf-v1.0.7.md#cassandrabatchkeyvalue))
* CassandraBatchKeyValue (see [details](https://github.com/yugabyte/yugabyte-db/blob/master/docs/yb-perf-v1.0.7.md#cassandrabatchkeyvalue))
* 220K writes/sec at 14ms (32 writers)
* 258K writes/sec at 24ms (64 writers)
* RedisKeyValue (see [details](https://github.com/YugaByte/yugabyte-db/blob/master/docs/yb-perf-v1.0.7.md#rediskeyvalue))
* RedisKeyValue (see [details](https://github.com/yugabyte/yugabyte-db/blob/master/docs/yb-perf-v1.0.7.md#rediskeyvalue))
* 89K writes/sec at 2.9ms (256 writers)
* 170K reads/sec at 1.5ms (256 readers)
* RedisPipelinedKeyValue (see [details](https://github.com/YugaByte/yugabyte-db/blob/master/docs/yb-perf-v1.0.7.md#redispipelinedkeyvalue))
* RedisPipelinedKeyValue (see [details](https://github.com/yugabyte/yugabyte-db/blob/master/docs/yb-perf-v1.0.7.md#redispipelinedkeyvalue))
* 536K writes/sec at 21ms (24 writers)
* 538K reads/sec at 14ms (16 readers)
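
As a rough sanity check on the non-batched numbers above, throughput in a closed-loop benchmark should be close to the number of clients divided by the per-operation latency (Little's law). The sketch below assumes each writer or reader keeps exactly one request in flight and that the quoted latency is a mean; neither assumption is stated in the report, and the helper name `expected_ops_per_sec` is made up for illustration. The batched and pipelined workloads are left out because each request there carries several operations.

```python
def expected_ops_per_sec(clients: int, latency_ms: float) -> float:
    """Little's law for a closed loop: ops/sec = clients in flight / latency."""
    return clients / (latency_ms / 1000.0)


# (workload, clients, latency in ms, reported ops/sec) copied from the summary.
REPORTED = [
    ("CassandraKeyValue writes", 256, 2.6, 97_000),
    ("CassandraKeyValue reads", 256, 1.2, 220_000),
    ("CassandraSecondaryIndex writes", 64, 10.7, 5_900),
    ("CassandraSecondaryIndex reads", 256, 1.3, 200_000),
    ("RedisKeyValue writes", 256, 2.9, 89_000),
    ("RedisKeyValue reads", 256, 1.5, 170_000),
]


def ratios():
    """Return (workload, reported / expected) pairs for each summary line."""
    return [
        (name, reported / expected_ops_per_sec(clients, latency_ms))
        for name, clients, latency_ms, reported in REPORTED
    ]


if __name__ == "__main__":
    for name, ratio in ratios():
        print(f"{name}: reported/expected = {ratio:.3f}")
```

A ratio near 1.0 suggests the reported throughput and latency are mutually consistent under these assumptions; a large deviation would point to batching, rounding, or a different latency statistic.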

See [YCSB results](https://github.com/YugaByte/yugabyte-db/blob/master/docs/yb-perf-v1.0.7.md#ycsb-run-info).
See [YCSB results](https://github.com/yugabyte/yugabyte-db/blob/master/docs/yb-perf-v1.0.7.md#ycsb-run-info).

# Setup:

6 changes: 3 additions & 3 deletions architecture/cloud-machine-types-analysis.md
@@ -1,13 +1,13 @@
# AWS

This is an analysis of the most cost-effective AWS machines to run YugaByte DB.
This is an analysis of the most cost-effective AWS machines to run YugabyteDB.

- 3 nodes of 16 cores each with replication factor 3, fully cached workload can handle 100K reads and 20K writes.
See [this performance report](https://github.com/YugaByte/yugabyte-db/blob/master/docs/yb-perf-0.9.5rc-Feb-13.md) for details.
See [this performance report](https://github.com/yugabyte/yugabyte-db/blob/master/docs/yb-perf-0.9.5rc-Feb-13.md) for details.
- 3 nodes of 8 cores each with replication factor 3, fully uncached workload with 1.4TB total can handle 77K read ops/sec.
See [this post](https://blog.yugabyte.com/achieving-sub-ms-latencies-on-large-data-sets-in-public-clouds-bf38d13ac42d) for more details.
- The app needs between 1TB and 2TB of storage per node.
- Analyzed many other machine types, concluded they are not as effective to run YugaByte DB
- Analyzed many other machine types, concluded they are not as effective to run YugabyteDB

```
machine-type vCPUs memory raw-cost storage-cost Total (1TB) Total (2TB)
```
2 changes: 1 addition & 1 deletion architecture/design/README.md
@@ -1,3 +1,3 @@
This directory contains design documents with details of how various features work internally. The intended audience for these documents is committers to the codebase and users wanting a deep understanding of what happens under the hood for various features.

You can find the user-facing [YugaByte DB docs here](https://docs.yugabyte.com/). If you want to understand the architecture of YugaByte DB, the [architecture section in the docs](https://docs.yugabyte.com/latest/architecture/) is a good place to start.
You can find the user-facing [YugabyteDB docs here](https://docs.yugabyte.com/). If you want to understand the architecture of YugabyteDB, the [architecture section in the docs](https://docs.yugabyte.com/latest/architecture/) is a good place to start.
2 changes: 1 addition & 1 deletion architecture/design/docdb-automatic-tablet-splitting.md
@@ -143,4 +143,4 @@ of the nodes.
Bulk load tool relies on partitioning to be fixed during the load process, so we decided to pre-split and disable
dynamic splitting during bulk load.

[![Analytics](https://yugabyte.appspot.com/UA-104956980-4/architecture/design/docdb-automatic-tablet-splitting.md?pixel&useReferer)](https://github.com/YugaByte/ga-beacon)
[![Analytics](https://yugabyte.appspot.com/UA-104956980-4/architecture/design/docdb-automatic-tablet-splitting.md?pixel&useReferer)](https://github.com/yugabyte/ga-beacon)
10 changes: 5 additions & 5 deletions architecture/design/docdb-change-data-capture.md
@@ -1,18 +1,18 @@
# Change Data Capture in YugaByte DB
# Change Data Capture in YugabyteDB

**Change data capture** (or **CDC** for short) enables capturing changes performed to the data stored in YugaByte DB. This document provides an overview of the approach YugaByte DB uses for providing change capture stream on tables that can be consumed by third party applications. This feature is useful in a number of scenarios such as:
**Change data capture** (or **CDC** for short) enables capturing changes performed to the data stored in YugabyteDB. This document provides an overview of the approach YugabyteDB uses for providing change capture stream on tables that can be consumed by third party applications. This feature is useful in a number of scenarios such as:

### Microservice-oriented architectures

There are some microservices that require a stream of changes to the data. For example, a search system powered by a service such as Elasticsearch may be used in conjunction with the database that stores the transactions. The search system requires a stream of changes made to the data in YugaByte DB.
There are some microservices that require a stream of changes to the data. For example, a search system powered by a service such as Elasticsearch may be used in conjunction with the database that stores the transactions. The search system requires a stream of changes made to the data in YugabyteDB.

### Asynchronous replication to remote systems

Remote systems such as caches and analytics pipelines may subscribe to the stream of changes, transform them and consume these changes.

### Two data center deployments

Two datacenter deployments in YugaByte DB leverage change data capture at the core.
Two datacenter deployments in YugabyteDB leverage change data capture at the core.

> Note that in this design, the terms "data center", "cluster" and "universe" will be used interchangeably. We assume here that each YB universe is deployed in a single data-center.
@@ -159,4 +159,4 @@ Note that once you have received a change for a row for some timestamp t, you wi



[![Analytics](https://yugabyte.appspot.com/UA-104956980-4/architecture/design/docdb-change-data-capture.md?pixel&useReferer)](https://github.com/YugaByte/ga-beacon)
[![Analytics](https://yugabyte.appspot.com/UA-104956980-4/architecture/design/docdb-change-data-capture.md?pixel&useReferer)](https://github.com/yugabyte/ga-beacon)
18 changes: 9 additions & 9 deletions architecture/design/docdb-encryption-at-rest.md
@@ -1,6 +1,6 @@
# Encryption At Rest in YugaByte DB
# Encryption At Rest in YugabyteDB

Data at rest within a YugaByte DB cluster should be protected from unauthorized users by encrypting it. This document outlines how this is achieved internally, along with what features we support.
Data at rest within a YugabyteDB cluster should be protected from unauthorized users by encrypting it. This document outlines how this is achieved internally, along with what features we support.

## Features

@@ -23,7 +23,7 @@ This feature makes the following assumptions:

## Basic Concepts

There are two types of keys to encrypt data in YugaByte DB:
There are two types of keys to encrypt data in YugabyteDB:
* **Universe key**: Top level symmetric key used to encrypt other keys (see data keys below), these are common to the cluster.
* **Data key**: Symmetric key used to encrypt the data. There is one data key generated per flushed file.
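
The two-level key hierarchy above can be sketched as follows. This is only an illustration under stated assumptions, not YugabyteDB's implementation: the `FileHeader` fields, the function names, and the dictionary of universe keys indexed by version are all hypothetical, and the cipher is a toy SHA-256 counter-mode keystream standing in for a real symmetric cipher (it is not secure). What it shows is the structure: a fresh data key per file, the data key wrapped by the universe key, and only the universe key version kept alongside the file.

```python
import hashlib
import os
from dataclasses import dataclass
from typing import Dict, Tuple


def keystream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Toy stream cipher (SHA-256 in counter mode). NOT secure; placeholder only."""
    out = bytearray()
    for block in range((len(data) + 31) // 32):
        pad = hashlib.sha256(key + nonce + block.to_bytes(8, "big")).digest()
        chunk = data[block * 32:(block + 1) * 32]
        out.extend(b ^ p for b, p in zip(chunk, pad))
    return bytes(out)


@dataclass
class FileHeader:
    """Hypothetical per-file header: holds a key *version*, never the universe key."""
    universe_key_version: str
    nonce: bytes
    wrapped_data_key: bytes


def encrypt_file(universe_keys: Dict[str, bytes], version: str,
                 plaintext: bytes) -> Tuple[FileHeader, bytes]:
    """Generate a fresh data key for this file and wrap it with the universe key."""
    data_key = os.urandom(32)
    nonce = os.urandom(16)
    wrapped = keystream_xor(universe_keys[version], nonce, data_key)
    return FileHeader(version, nonce, wrapped), keystream_xor(data_key, nonce, plaintext)


def decrypt_file(universe_keys: Dict[str, bytes], header: FileHeader,
                 ciphertext: bytes) -> bytes:
    """Look up the universe key by version, unwrap the data key, decrypt the data."""
    universe_key = universe_keys[header.universe_key_version]
    data_key = keystream_xor(universe_key, header.nonce, header.wrapped_data_key)
    return keystream_xor(data_key, header.nonce, ciphertext)
```

With this layout, adding a new universe key version affects only files written afterwards; older files remain readable as long as the version they reference is still available.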

@@ -121,13 +121,13 @@ we would make appropriate API calls to create a new Universe Key and use that ke
taking with some of the KMS system that we would support via Platform.

## Equinix [SmartKey](https://www.equinix.com/services/edge-services/smartkey/). Integration
SmartKey is a KMS offering from Equinix that provides an SDK and an API to manage keys in their platform. YugaByte platform would integrate with SmartKey via the REST API route and authenticate
SmartKey is a KMS offering from Equinix that provides an SDK and an API to manage keys in their platform. Yugabyte platform would integrate with SmartKey via the REST API route and authenticate
using their API key in order to manage the Keys. We would use the name attribute on the Key to link the universe that the key is generated for. Once the key is generated we would make appropriate RPC
calls to YugaByte to enable encryption. We would call their rekey api when the user wants to rekey the universe and update the YugaByte nodes in a rolling fashion.
calls to YugabyteDB to enable encryption. We would call their rekey api when the user wants to rekey the universe and update the YugabyteDB nodes in a rolling fashion.

## AWS [Key Management Service](https://docs.aws.amazon.com/kms/latest/developerguide/overview.html)
Amazon offers its own KMS solution, and we will use its KMS API to manage the keys. AWS KMS has the concept of aliases, which we would use to build a relationship between the key and the universe.
When the key needs to be rotated we would create a new key and update the alias accordingly. And do the update on YugaByte nodes in a rolling fashion.
When the key needs to be rotated we would create a new key and update the alias accordingly. And do the update on YugabyteDB nodes in a rolling fashion.

# Implementation Internals

@@ -172,7 +172,7 @@ message EncryptionHeaderPB {

We only store the universe key version in the file header, so the actual universe key data will never be persisted with the tablet data!

An `Env` object sits between YugaByte and the filesystem and is responsible for creating new files. YugaByte has the notion of an encrypted and plaintext Env, corresponding to the type of file it creates. The encrypted Env owns an object that listens on heartbeat and is responsible for fetching universe keys. The encrypted Env creates new files as follows:
An `Env` object sits between YugabyteDB and the filesystem and is responsible for creating new files. YugabyteDB has the notion of an encrypted and plaintext Env, corresponding to the type of file it creates. The encrypted Env owns an object that listens on heartbeat and is responsible for fetching universe keys. The encrypted Env creates new files as follows:

### Writable File Creation

@@ -201,6 +201,6 @@ Since the nth byte of tablet data in an encrypted file corresponds to offset n +
# Future Work

* Enable using a KMIP server for the universe key
* Make YugaByte platform act like a KMIP server which would encapsulate different KMS systems and give one common interface for YugaByte to interact.
* Make Yugabyte platform act like a KMIP server which would encapsulate different KMS systems and give one common interface for YugabyteDB to interact.

[![Analytics](https://yugabyte.appspot.com/UA-104956980-4/architecture/design/docdb-encryption-at-rest.md?pixel&useReferer)](https://github.com/YugaByte/ga-beacon)
[![Analytics](https://yugabyte.appspot.com/UA-104956980-4/architecture/design/docdb-encryption-at-rest.md?pixel&useReferer)](https://github.com/yugabyte/ga-beacon)
2 changes: 1 addition & 1 deletion architecture/design/docdb-index-backfill.md
@@ -3,4 +3,4 @@
> **Note:** This design doc is still being worked on.

[![Analytics](https://yugabyte.appspot.com/UA-104956980-4/architecture/design/docdb-online-index-backfill.md?pixel&useReferer)](https://github.com/YugaByte/ga-beacon)
[![Analytics](https://yugabyte.appspot.com/UA-104956980-4/architecture/design/docdb-online-index-backfill.md?pixel&useReferer)](https://github.com/yugabyte/ga-beacon)