Storage v0.2 Overview

Paul Lorenz edited this page Apr 17, 2023 · 9 revisions

Motivation

Context Threading

Raft Index

The primary goal for v0.2 was to be able to thread the raft index of whatever raft command is executing through a bbolt transaction, so that we can store the current raft index in the DB. This will let us know what state we're in at startup and will let us skip commands that we've already applied. If we can do that, we don't have to start with a blank DB every time, and we can apply DB updates at startup, as we've always done in the past, instead of trying to figure out where we were in the raft command stream.
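As a rough illustration of the skip-on-replay idea, here is a minimal sketch that uses an in-memory stand-in rather than bbolt. The fakeDB and command types and the apply method are hypothetical names made up for this example; they are not part of the storage library.

```go
package main

import "fmt"

// fakeDB stands in for the bbolt database. lastIndex models the raft index
// that would be persisted in the same transaction as the data.
type fakeDB struct {
	lastIndex uint64
	data      map[string]string
}

// command is a hypothetical raft command carrying its log index.
type command struct {
	index uint64
	key   string
	value string
}

// apply writes the command and records its index together. Commands at or
// below the stored index are skipped, since they were applied before restart.
func (db *fakeDB) apply(c command) bool {
	if c.index <= db.lastIndex {
		return false
	}
	db.data[c.key] = c.value
	db.lastIndex = c.index
	return true
}

func main() {
	// Simulate a restart: the DB already reflects commands up to index 2.
	db := &fakeDB{lastIndex: 2, data: map[string]string{"a": "2"}}
	replay := []command{
		{index: 1, key: "a", value: "1"},
		{index: 2, key: "a", value: "2"},
		{index: 3, key: "b", value: "3"},
	}
	for _, c := range replay {
		fmt.Printf("index %d applied=%v\n", c.index, db.apply(c))
	}
}
```

Because the index is written alongside the data, a crash between the two cannot leave them out of sync, which is the reason for threading the index through the transaction rather than storing it separately.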

If we're trying to thread some context through the raft transaction, we might as well try to solve some other problems while we're at it.

Change Attribution

We need to generate entity change events and as part of that we want to be able to attribute changes to their authors. To do that we need to inject some state into the transaction. Ideally we could use the same mechanism that we use for the raft index.

Aggregate Commands

There's been some discussion of adding some form of manifest, where you could post a set of commands or target state. If we can update the API now to where we could facilitate that, we should consider it.

Opportunistic Updates

Generics

The store API was written pre-generics. If we can make the API safer or more expressive using generics, let's do so. Specifically, we should be able to find/load/list typed entities instead of using boltz.Entity. The listener API, which uses a third-party non-generic library, is also a good candidate for generics. This would simplify the consumer APIs quite a bit.

General API tidying

If we can remove unused bits or reorganize things to make them clearer or simpler, let's do so.

Implementation

Context Threading

The v0.1 API for update transactions looks like:

type Db interface {
	Update(fn func(tx *bbolt.Tx) error) error
	...
}

which mirrors the Update method on bbolt.DB. The create/update/delete operations on a store do take a MutateContext, which looks like:

type MutateContext interface {
	Tx() *bbolt.Tx
	AddEvent(em events.EventEmmiter, name events.EventName, entity Entity)
	IsSystemContext() bool
	GetSystemContext() MutateContext
}

This has some state, in that you can register events to be fired on transaction commit. However, other state cannot be passed through. To achieve our goals, both APIs were updated. MutateContext now looks like:

type MutateContext interface {
	Tx() *bbolt.Tx
	AddPreCommitAction(func(ctx MutateContext) error)
	runPreCommitActions() error
	AddCommitAction(func())
	setTx(tx *bbolt.Tx) MutateContext
	IsSystemContext() bool
	GetSystemContext() MutateContext
	Context() context.Context
	UpdateContext(func(ctx context.Context) context.Context) MutateContext
}

The MutateContext now contains a context.Context, which allows us to pass through arbitrary state. The fabric and edge projects care about the specifics of the state, but in storage we just want some mechanism to store the state and aren't going to dictate the format of it. We've also generalized the eventing so you can add a pre-commit or post-commit action, which could be event related or not.
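To make the pre-commit/post-commit split concrete, here is a minimal sketch with a simplified stand-in for MutateContext that has no bbolt transaction. The mutateCtx type, the commit method and errRejected are hypothetical. The sketch assumes that a failing pre-commit action aborts the commit, and that commit actions run only after a successful commit.

```go
package main

import (
	"errors"
	"fmt"
)

// errRejected is a sample error returned by a failing pre-commit action.
var errRejected = errors.New("pre-commit check rejected the change")

// mutateCtx is a simplified stand-in for MutateContext.
type mutateCtx struct {
	preCommit []func(ctx *mutateCtx) error
	onCommit  []func()
}

func (m *mutateCtx) AddPreCommitAction(f func(ctx *mutateCtx) error) {
	m.preCommit = append(m.preCommit, f)
}

func (m *mutateCtx) AddCommitAction(f func()) {
	m.onCommit = append(m.onCommit, f)
}

// commit runs the pre-commit actions first. Any error aborts, so commit
// actions only fire for changes that were actually persisted.
func (m *mutateCtx) commit() error {
	for _, f := range m.preCommit {
		if err := f(m); err != nil {
			return err
		}
	}
	// The real transaction commit would happen here.
	for _, f := range m.onCommit {
		f()
	}
	return nil
}

func main() {
	good := &mutateCtx{}
	good.AddPreCommitAction(func(*mutateCtx) error { return nil })
	good.AddCommitAction(func() { fmt.Println("change event fired") })
	fmt.Println("good:", good.commit())

	bad := &mutateCtx{}
	bad.AddPreCommitAction(func(*mutateCtx) error { return errRejected })
	bad.AddCommitAction(func() { fmt.Println("not reached") })
	fmt.Println("bad:", bad.commit())
}
```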

NOTE: Replacing MutateContext with context.Context was considered, but would have been very unergonomic. context.Context is entirely untyped, so every time we wanted to grab the Tx we would have to call some helper function to do it. Every use of the context would have involved casting and helper functions. Here we've got the context, and if need be we could turn things inside-out and store the MutateContext inside the context.Context if we needed to pass through to some context-oriented function.

The Update method has been changed to take a MutateContext, as does the func called inside the transaction.

type Db interface {
	Update(ctx MutateContext, fn func(ctx MutateContext) error) error
	...
}

This allows us to pass state into the transaction inside a MutateContext. The Update method is also smart and won't start an update transaction if we're already inside of one. This will let us call multiple commands in the same transaction, without having to change the APIs.
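The re-entrant behavior can be sketched as follows. This is a simplified model, not the library's implementation: tx, mutateCtx and db are hypothetical stand-ins for *bbolt.Tx, MutateContext and the Db wrapper, and the begun counter exists only to make the behavior observable.

```go
package main

import "fmt"

// tx is a stand-in for *bbolt.Tx.
type tx struct{ id int }

// mutateCtx is a simplified stand-in for MutateContext.
type mutateCtx struct{ tx *tx }

// db is a stand-in for the Db wrapper. begun counts transactions started.
type db struct{ begun int }

// Update starts a transaction only when ctx is not already inside one.
func (d *db) Update(ctx *mutateCtx, fn func(ctx *mutateCtx) error) error {
	if ctx.tx != nil {
		return fn(ctx) // nested call: join the enclosing transaction
	}
	d.begun++
	ctx.tx = &tx{id: d.begun}
	// A real implementation would commit or roll back here.
	defer func() { ctx.tx = nil }()
	return fn(ctx)
}

func main() {
	d := &db{}
	err := d.Update(&mutateCtx{}, func(ctx *mutateCtx) error {
		outer := ctx.tx
		// A second command invoked from inside the first one.
		return d.Update(ctx, func(ctx *mutateCtx) error {
			fmt.Println("same tx:", ctx.tx == outer, "begun:", d.begun)
			return nil
		})
	})
	fmt.Println("err:", err)
}
```

With this shape, an aggregate command can call existing single-entity commands unchanged, and they all commit or fail together.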

change.Context

The storage library doesn't care about the data being threaded through. That will be managed by the fabric and edge components. Rather than storing each piece of data independently in the context.Context we're going to use a data structure to store things together.

type Context struct {
	Attributes map[string]string
	RaftIndex  uint64
}

We have some utility methods to make using a change context more ergonomic. Here is how a change context is created for edge REST requests.

func (rc *RequestContext) NewChangeContext() *change.Context {
	src := fmt.Sprintf("rest[auth=edge/host=%v/method=%v/remote=%v]", rc.GetRequest().Host, rc.GetRequest().Method, rc.GetRequest().RemoteAddr)
	changeCtx := change.New().SetSource(src)

	if rc.Identity != nil {
		changeCtx.SetChangeAuthorId(rc.Identity.Id).SetChangeAuthorName(rc.Identity.Name).SetChangeAuthorType("identity")
	}

	if rc.Request.Form.Has("traceId") {
		changeCtx.SetTraceId(rc.Request.Form.Get("traceId"))
	}
	return changeCtx
}

We also have some ways to easily move to and from context.Context or MutateContext. We also need to be able to thread contexts through raft commands. This is why there are methods to translate *change.Context instances to and from protobuf.

func (self *Context) GetContext() context.Context { ... }
func (self *Context) NewMutateContext() boltz.MutateContext { ... }
func (self *Context) AddToContext(ctx context.Context) context.Context { ... }
func (self *Context) ToProtoBuf() *cmd_pb.ChangeContext { ... }

func FromContext(ctx context.Context) *Context { ... }
func FromProtoBuf(ctx *cmd_pb.ChangeContext) *Context { ... }
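One plausible way to implement the context.Context round trip is with context.WithValue and an unexported key type. The sketch below is an assumption about the approach, not the library's code: ctxKey and roundTrip are hypothetical, and returning nil when nothing is stored is a choice made for this example.

```go
package main

import (
	"context"
	"fmt"
)

// Context mirrors the change.Context struct shown above.
type Context struct {
	Attributes map[string]string
	RaftIndex  uint64
}

// ctxKey is an unexported key type, so no other package can collide with it.
type ctxKey struct{}

// AddToContext returns a child context that carries the change context.
func (self *Context) AddToContext(ctx context.Context) context.Context {
	return context.WithValue(ctx, ctxKey{}, self)
}

// FromContext returns the stored change context, or nil if none is present.
func FromContext(ctx context.Context) *Context {
	c, _ := ctx.Value(ctxKey{}).(*Context)
	return c
}

// roundTrip stores c in a fresh context and reads it back.
func roundTrip(c *Context) *Context {
	return FromContext(c.AddToContext(context.Background()))
}

func main() {
	c := &Context{Attributes: map[string]string{"src": "rest"}, RaftIndex: 42}
	fmt.Println(roundTrip(c).RaftIndex)
	fmt.Println(FromContext(context.Background()) == nil)
}
```

Keeping all fields in one struct means a single lookup and a single type assertion, instead of one context key per attribute.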

Generics/Store API Changes

In 0.1 we had ListStore and CrudStore. ListStore was focused on store configuration and querying operations while CrudStore extended ListStore with CRUD operations.

This separation was a function of how the library developed and wasn't a useful distinction. In 0.2 we have the following primary interfaces:

  • Store - All the non-generic store-user functions
  • ConfigurableStore - Extends Store with methods needed to configure a store. Add*Symbol, etc
  • storeInternal - Operations only needed internally. Probably don't need to be split off, but this keeps them grouped
  • EntityStore - Extends Store with generic operations on the entity type managed by the store

The reason we need both Store and EntityStore is that Go's generic types are invariant: there is no covariance between instantiations. So if you have a method which takes Store[any], you can't pass in a Store[*Service]. This means that when we don't know the generic type, we need a non-generic interface that gives us access to the non-generic methods.
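The following sketch shows the invariance problem and the non-generic workaround. The interfaces are heavily reduced, and the method names GetEntityType and LoadById are used here only for illustration.

```go
package main

import "fmt"

type Entity interface{ GetId() string }

type Service struct{ Id string }

func (s *Service) GetId() string { return s.Id }

// Store holds the methods that don't mention the entity type.
type Store interface{ GetEntityType() string }

// EntityStore adds the typed operations on top of Store.
type EntityStore[E Entity] interface {
	Store
	LoadById(id string) (E, error)
}

type serviceStore struct{}

func (serviceStore) GetEntityType() string { return "services" }

func (serviceStore) LoadById(id string) (*Service, error) {
	return &Service{Id: id}, nil
}

// describe accepts any store, whatever entity type it manages.
func describe(s Store) string { return s.GetEntityType() }

// typed only accepts the exact instantiation EntityStore[Entity].
func typed(s EntityStore[Entity]) string { return s.GetEntityType() }

func main() {
	var svc EntityStore[*Service] = serviceStore{}
	fmt.Println(describe(svc)) // fine: every EntityStore[E] is also a Store

	// typed(svc) would not compile: EntityStore[*Service] is not an
	// EntityStore[Entity], because LoadById returns *Service, not Entity.
}
```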

Child Store Rules

The 0.2 release formalizes the rules for how parent/child stores interact. This is important to make sure that events are fired where appropriate and in the right order, and to make sure that logic is consistent. Where it gets confusing is that if you have an entity which is managed by a child store (for example EdgeService, which is a child of Service), the entity can be updated or deleted using either the parent or child store.

  • Create - Create always happens from the correct store. For example, a *Service can only be created using the parent store and an *EdgeService can only be created using the child store. This ensures that everything happens consistently.
  • Update - Updates for child entities should always happen using the Update logic of the child store. So if Update is called using the parent store, it should in some way delegate to the child store.
  • Delete - Deletes should always be run by the parent store, since it will delete the root bucket. However, it must call into the child store to allow additional cleanup to happen.

To enable these rules, there's a new interface:

type ChildStoreStrategy[E Entity] interface {
	HandleUpdate(ctx MutateContext, entity E, checker FieldChecker) (bool, error)
	HandleDelete(ctx MutateContext, entity E) error
	GetStore() Store
}

When Update is called on the parent store, it will check with all registered child stores using HandleUpdate to see if any one of them claims the entity and wants to take over the update. If not, it will do the update itself.

Likewise, when Delete is called on a child store, it will delegate to the parent store. However, the parent will also call HandleDelete on all child stores to allow them to run any additional cleanup logic. Indexes of child stores will be cleaned up regardless and don't need to be handled explicitly in HandleDelete.
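The dispatch described above can be sketched with a reduced, non-generic form of ChildStoreStrategy. All types here are hypothetical stand-ins. The kind field is an invented way for a child to recognize its entities, and running child cleanup before the parent delete is an assumption about ordering.

```go
package main

import "fmt"

// entity is a simplified stand-in. kind models how a child store
// recognizes the entities it manages.
type entity struct{ id, kind string }

// childStrategy is a reduced, non-generic form of ChildStoreStrategy.
type childStrategy interface {
	HandleUpdate(e entity) (bool, error)
	HandleDelete(e entity) error
}

type edgeChild struct{ log *[]string }

func (c edgeChild) HandleUpdate(e entity) (bool, error) {
	if e.kind != "edge" {
		return false, nil // not ours, let the parent update it
	}
	*c.log = append(*c.log, "child update "+e.id)
	return true, nil
}

func (c edgeChild) HandleDelete(e entity) error {
	if e.kind == "edge" {
		*c.log = append(*c.log, "child cleanup "+e.id)
	}
	return nil
}

type parentStore struct {
	children []childStrategy
	log      *[]string
}

// Update offers the entity to each child and falls back to its own logic.
func (p parentStore) Update(e entity) error {
	for _, c := range p.children {
		handled, err := c.HandleUpdate(e)
		if err != nil || handled {
			return err
		}
	}
	*p.log = append(*p.log, "parent update "+e.id)
	return nil
}

// Delete lets every child clean up, then removes the root bucket itself.
func (p parentStore) Delete(e entity) error {
	for _, c := range p.children {
		if err := c.HandleDelete(e); err != nil {
			return err
		}
	}
	*p.log = append(*p.log, "parent delete "+e.id)
	return nil
}

func main() {
	var log []string
	p := parentStore{children: []childStrategy{edgeChild{log: &log}}, log: &log}
	_ = p.Update(entity{id: "s1", kind: "plain"})
	_ = p.Update(entity{id: "s2", kind: "edge"})
	_ = p.Delete(entity{id: "s2", kind: "edge"})
	for _, step := range log {
		fmt.Println(step)
	}
}
```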

EntityStrategy

Pre-generics, loading and persisting entities was handled by the entities themselves. Now, it's delegated to a strategy type.

type EntityStrategy[E Entity] interface {
	NewEntity() E
	FillEntity(entity E, bucket *TypedBucket)
	PersistEntity(entity E, ctx *PersistContext)
}

In most cases, the EntityStrategy will likely be the store itself.
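Here is a small sketch of a store acting as its own strategy. To keep it self-contained, a map-based bucket type replaces *TypedBucket and *PersistContext; the bucket type, serviceStore and the load helper are hypothetical.

```go
package main

import "fmt"

// bucket is a stand-in for the TypedBucket and PersistContext types.
type bucket map[string]string

type Entity interface{ GetId() string }

// EntityStrategy is a reduced form of the interface above, using bucket
// in place of *TypedBucket and *PersistContext.
type EntityStrategy[E Entity] interface {
	NewEntity() E
	FillEntity(entity E, b bucket)
	PersistEntity(entity E, b bucket)
}

type Service struct{ Id, Name string }

func (s *Service) GetId() string { return s.Id }

// serviceStore doubles as the strategy for the entities it manages.
type serviceStore struct{}

func (serviceStore) NewEntity() *Service { return &Service{} }

func (serviceStore) FillEntity(s *Service, b bucket) {
	s.Id, s.Name = b["id"], b["name"]
}

func (serviceStore) PersistEntity(s *Service, b bucket) {
	b["id"], b["name"] = s.Id, s.Name
}

// load is generic store code: it never touches entity fields itself.
func load[E Entity](strategy EntityStrategy[E], b bucket) E {
	e := strategy.NewEntity()
	strategy.FillEntity(e, b)
	return e
}

func main() {
	b := bucket{}
	var strategy EntityStrategy[*Service] = serviceStore{}
	strategy.PersistEntity(&Service{Id: "s1", Name: "echo"}, b)
	fmt.Println(load(strategy, b).Name)
}
```

Because NewEntity lives on the strategy, the generic code can construct an E without reflection or a factory stored on the entity itself.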

This also removed one of the main uses of the impl EntityStore[E], which we had been using to instantiate new entities of the appropriate store type. Even though it's no longer strictly necessary, we're still keeping it in place, so that when we internally call store methods that may have been overridden by the implementation, we get the right method.

In a future revision, we should probably remove this and require that overrides be done by implementing methods on the entity strategy or child store strategy.

EntityConstraint

To implement generic listeners there's a new EntityConstraint interface:

type EntityConstraint[E Entity] interface {
	ProcessPreCommit(state *EntityChangeState[E]) error
	ProcessPostCommit(state *EntityChangeState[E])
}

type EntityChangeState[E Entity] struct {
	Id           string
	Ctx          MutateContext
	InitialState E
	FinalState   E
	store        *BaseStore[E]
	ChangeType   EntityEventType
}

This enables listeners to be registered that receive the full pre- and post-change states, and that are notified both pre- and post-commit.
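A constraint implementation might look something like the sketch below. EntityChangeState is trimmed to three fields, and nameConstraint, errNameRequired and the commit driver are hypothetical. The driver only models the ordering of the two callbacks and assumes a pre-commit error aborts the change.

```go
package main

import (
	"errors"
	"fmt"
)

type Entity interface{ GetId() string }

// EntityChangeState is a trimmed copy of the struct above.
type EntityChangeState[E Entity] struct {
	Id           string
	InitialState E
	FinalState   E
}

type EntityConstraint[E Entity] interface {
	ProcessPreCommit(state *EntityChangeState[E]) error
	ProcessPostCommit(state *EntityChangeState[E])
}

type Service struct{ Id, Name string }

func (s *Service) GetId() string { return s.Id }

var errNameRequired = errors.New("service name is required")

// nameConstraint vetoes empty names before commit and counts renames after.
type nameConstraint struct{ renames int }

func (c *nameConstraint) ProcessPreCommit(s *EntityChangeState[*Service]) error {
	if s.FinalState != nil && s.FinalState.Name == "" {
		return errNameRequired
	}
	return nil
}

func (c *nameConstraint) ProcessPostCommit(s *EntityChangeState[*Service]) {
	if s.InitialState != nil && s.FinalState != nil &&
		s.InitialState.Name != s.FinalState.Name {
		c.renames++
	}
}

// commit is a hypothetical driver: a pre-commit error aborts the change,
// otherwise the post-commit hook runs.
func commit[E Entity](c EntityConstraint[E], s *EntityChangeState[E]) error {
	if err := c.ProcessPreCommit(s); err != nil {
		return err
	}
	c.ProcessPostCommit(s)
	return nil
}

func main() {
	c := &nameConstraint{}
	rename := &EntityChangeState[*Service]{
		Id:           "s1",
		InitialState: &Service{Id: "s1", Name: "a"},
		FinalState:   &Service{Id: "s1", Name: "b"},
	}
	fmt.Println(commit[*Service](c, rename), c.renames)
}
```

Since both states are typed as E, the constraint can compare fields directly without type assertions.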

Store Definition

Storage v0.2 also adds a new configuration struct for defining stores.

type StoreDefinition[E Entity] struct {
	EntityType      string
	EntityStrategy  EntityStrategy[E]
	BasePath        []string
	Parent          Store
	ParentMapper    func(Entity) Entity
	EntityNotFoundF func(id string) error
}

This helped simplify store setup a bit.