
Use RWLock to prevent races in inferno-vc-server #36

Merged
siddharth-krishna merged 8 commits into main from sidk-fix-races on Mar 1, 2023

Conversation

siddharth-krishna
Collaborator

This PR uses a read-write lock to protect the concurrent file IO performed by inferno-vc-server. This should prevent the concurrency bugs caused by two people trying to save/update/delete a script at the same time (#19).

The writeup below describes the current implementation, its issues, and some proposed solutions. This PR implements Option 0.
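
As a rough illustration of the approach, here is a minimal sketch (this assumes Control.Concurrent.ReadWriteLock from the concurrent-extra package; the wrapper names are made up, and the actual helpers in inferno-vc-server may differ):

import qualified Control.Concurrent.ReadWriteLock as RWL

-- Hypothetical wrapper: one lock protects the whole store. Readers may run
-- concurrently; save/update/delete take the write side for exclusive access.
data VCStore = VCStore { storeLock :: RWL.RWLock, storePath :: FilePath }

newVCStore :: FilePath -> IO VCStore
newVCStore path = VCStore <$> RWL.new <*> pure path

withStoreRead, withStoreWrite :: VCStore -> IO a -> IO a
withStoreRead  st = RWL.withRead  (storeLock st)   -- e.g. fetch, fetchHist
withStoreWrite st = RWL.withWrite (storeLock st)   -- e.g. store, del

Every handler that touches the file system wraps its IO in one of these helpers, which is exactly the "lock around all operations" of Option 0 below.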


Inferno Version Control Store

A simple version-control store for scripts.

store(object, predecessor) -> hash
fetch(hash) -> object
fetchHist(hash) -> [object]
del(hash)

Current Implementation

Data structure: the file system itself. Objects are stored in the root directory with their hashes as filenames; the heads/ directory contains one file per script history, named by the hash of that history's head; and the to_head/ directory maps every object to the head of the history it belongs to:

/vc_store/
├── 2ZDR-_h3LhqkDsk3qIOx7gobNrAvwbSA67aakIQFabI=
├── Bu1zONJHnlYCqqACpMeJ4pac_V2jmMOYISqyAvaSoBs=
├── FqBr0k0PQ33qskv9c6z9IsKE8jbTkgMyHx_iqyJFXLc=
├── IlRYIOwovxwbHWor54PHrYWQL_pZVWCyFGw_lmvAGjI=
├── heads
│   ├── Bu1zONJHnlYCqqACpMeJ4pac_V2jmMOYISqyAvaSoBs=
│   └── IlRYIOwovxwbHWor54PHrYWQL_pZVWCyFGw_lmvAGjI=
└── to_head
    ├── 2ZDR-_h3LhqkDsk3qIOx7gobNrAvwbSA67aakIQFabI=
    ├── Bu1zONJHnlYCqqACpMeJ4pac_V2jmMOYISqyAvaSoBs=
    ├── FqBr0k0PQ33qskv9c6z9IsKE8jbTkgMyHx_iqyJFXLc=
    └── IlRYIOwovxwbHWor54PHrYWQL_pZVWCyFGw_lmvAGjI=

Store method:

def store(o, h_p):
    # Write the object to the store root; its hash becomes its filename
    h_o = writeToStore(o)
    f_p = f'vc_store/heads/{h_p}'
    if fileExists(f_p):
        # The predecessor was a head: its history file becomes the new head's file
        f_o = f'vc_store/heads/{h_o}'
        renameFile(f_p, f_o)
        appendFile(f_o, h_p)
        # Update to_head pointers of every object in the history:
        for h in readFile(f_o):
            writeFile(f'vc_store/to_head/{h}', h_o)

The to_head mapping is used by the fetchHist(h) operation: it looks up the head of the given object and then reads the object's history from the corresponding file in the heads/ directory.
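
A hedged sketch of that lookup path in Haskell (hash and object types simplified to String; the real server code is different):

import System.FilePath ((</>))

-- Rough sketch of fetchHist over the layout above: follow to_head to the
-- head's history file, then read each listed object from the store root.
fetchHist :: FilePath -> String -> IO [String]
fetchHist root h = do
  -- to_head/<h> holds the hash of the head of h's history
  headHash <- filter (/= '\n') <$> readFile (root </> "to_head" </> h)
  -- heads/<headHash> lists the hashes in that history, one per line
  hashes <- lines <$> readFile (root </> "heads" </> headHash)
  -- read each object file from the store root
  mapM (\hash -> readFile (root </> hash)) hashes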

Issues with the current implementation:

  1. The renameFile and appendFile calls are not in an atomic block. By the time an operation tries to append to the head file, the next store operation may already have renamed it, so the first operation's predecessor is lost from the history.

  2. The update of to_head pointers by one operation store(o2, o1) can race with a subsequent operation store(o3, o2). If the latter overtakes the former, some objects in the history end up incorrectly pointing to o2 as their head instead of o3.

  3. Crash safety: a crash, for example between renameFile and appendFile, leaves the store in an inconsistent state.

Option 0: slap a lock around all operations

  • Pros: Easy fix, no need to migrate current store
  • Cons: inefficient

Option 1: fix the current implementation

  • Idea: all versions of a script share a unique histID. Instead of to_head, keep a meta field to_hist in the object file, since this mapping is stable.

  • Maintain a hist_to_head map in memory that maps each histID to the hash of its head, and use Control.Concurrent.Lock or an MVar to update this map atomically.

  • store(o, p) writes a new head file for o, but retries until it can successfully update hist_to_head.

  • fetchHist(h) looks up the histID from the meta file, finds the head from hist_to_head, and returns the history saved in the appropriate head file.

  • hist_to_head can be periodically saved to file and on startup the last snapshot can be loaded and updated if necessary (as it is easy to reconstruct it from all object metas).

  • Pros: small migration from the current store; most operations don't need the lock, so it is relatively efficient.

  • Cons: need to carefully check/prove concurrent correctness, needs migration

  • What happens if there is a crash and two concurrent new heads? Recovery can pick an arbitrary one, or discard both.

  • Use a checksum to detect a crash. Remember that a crash can happen while snapshotting! (How does VPDB deal with this?)

def store(o, h_p):
    # Every version carries the histID of its history
    hist_id = readStore(h_p)['hist_id']
    h_o = writeToStore({**o, 'hist_id': hist_id})
    if hist_to_head[hist_id] == h_p:
        # New head file = predecessor's history plus the predecessor itself
        f_p = f'vc_store/heads/{h_p}'
        writeFile(f'vc_store/heads/{h_o}', readFile(f_p) + h_p)
        # Atomically swing the in-memory head pointer from h_p to h_o
        if CAS(hist_to_head[hist_id], h_p, h_o):
            return Success
    return TriedToAppendToNonHeadError
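
In Haskell terms, the CAS on hist_to_head could be an atomic update of an in-memory map guarded by an MVar, as mentioned above. A minimal sketch (types simplified to String, names made up):

import Control.Concurrent.MVar (MVar, modifyMVar, newMVar)
import qualified Data.Map.Strict as Map

-- histID -> hash of that history's current head
type HistToHead = MVar (Map.Map String String)

newHistToHead :: IO HistToHead
newHistToHead = newMVar Map.empty

-- Compare-and-swap: move histId's head from expected to new only if nobody
-- else updated it in the meantime; return whether the swap happened.
casHead :: HistToHead -> String -> String -> String -> IO Bool
casHead ref histId expected new = modifyMVar ref $ \m ->
  case Map.lookup histId m of
    Just h | h == expected -> pure (Map.insert histId new m, True)
    _                      -> pure (m, False)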

Option 2: use existing file-backed DB

  • For example: have a DB table with columns: hash, pred, isHead, object, histID
  • Store adds a row to the table and unsets isHead of the predecessor, atomically (see the sketch after this list)
  • Index can be hash, so fetch is fast
  • All versions of a script share a unique histID, so fetchHist(h) looks up histID of h and fetches all rows with that histID.
  • DB is periodically backed up to file
  • Pro: concurrent correctness and efficiency guaranteed by DB
  • Con: making all operations atomic might be slower than a correct implementation of Option 1
  • Con: needs migration of store
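
For illustration, a hedged sketch of what the atomic store could look like with sqlite-simple (the scripts table and all names here are assumptions based on the bullets above, not anything in the codebase):

{-# LANGUAGE OverloadedStrings #-}
import Data.ByteString (ByteString)
import Data.Text (Text)
import Database.SQLite.Simple (Connection, Only (..), execute, withTransaction)

-- Insert the new object as head of its history and unset the predecessor's
-- isHead flag inside a single transaction.
storeDB :: Connection -> Text -> Text -> ByteString -> Text -> IO ()
storeDB conn hash predHash obj histId =
  withTransaction conn $ do
    execute conn
      "INSERT INTO scripts (hash, pred, isHead, object, histID) VALUES (?, ?, 1, ?, ?)"
      (hash, predHash, obj, histId)
    execute conn
      "UPDATE scripts SET isHead = 0 WHERE hash = ?"
      (Only predHash)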

Option 3: STM

  • Can use an in-memory STM data structure and make each operation atomic (see the sketch after this list)
  • Like VPDB, have a thread save a snapshot of the structure to file periodically
  • Pro: STM ensures no concurrency bugs
  • Con: entire store needs to fit in memory
  • Con: needs migration of store
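
A hedged sketch of the Option 3 shape (types simplified to String; the periodic snapshot thread is omitted):

import Control.Concurrent.STM (TVar, atomically, newTVarIO, readTVar, writeTVar)
import qualified Data.Map.Strict as Map

-- The whole store lives in one TVar; each operation is a single STM
-- transaction, so concurrent store/fetch/del calls cannot interleave
-- and corrupt the map.
type Store = TVar (Map.Map String String)  -- hash -> serialized object

newStore :: IO Store
newStore = newTVarIO Map.empty

storeObj :: Store -> String -> String -> IO ()
storeObj st h o = atomically $ do
  m <- readTVar st
  writeTVar st (Map.insert h o m)

fetchObj :: Store -> String -> IO (Maybe String)
fetchObj st h = atomically $ Map.lookup h <$> readTVar st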

Testing: are there libraries to test for such concurrency bugs?

For example, something that runs a random sequence of operations both single-threaded and concurrently and compares the outputs.

either (throwError . CouldNotDecodeObject h) pure =<< liftIO (eitherDecode <$> BL.readFile fp)

-- | Fetch an object WITHOUT holding any locks. This is used by the cached client, which
-- is safe since the cache is read only.
Collaborator Author


@goodlyrottenapple is this correct?

@@ -387,18 +427,30 @@ getAllHeads = do
[]
(map takeFileName headsRaw)

fetchFunctionsForGroups :: (VCStoreLogM env m, VCStoreErrM err m, VCStoreEnvM env m, Ord g, FromJSON a, FromJSON g) => Set.Set g -> m [VCMeta a g VCObjectHash]
-- | Fetch all objects that are public or that belong to the given set of groups.
-- Note this is a potentially long operation so no locks are held while traversing the
Contributor


I agree with this (at least for now). We call this function multiple times in Onping's selector, and a lock would increase the response time of each query.

@shulhi
Contributor

shulhi commented Feb 28, 2023

I'm wondering whether we can use a different lock for each head. Currently, the lock covers the whole store.

The idea is that scripts mostly form disjoint sets: each set consists of a head and all of its predecessors. Because the lock currently covers the whole store, writing to one set of scripts also blocks reads and writes of other sets, even though those sets are by design distinct and should be safe to update concurrently. If we could use a lock per script set, it should help with performance.
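
A hedged sketch of what a per-history lock table could look like (assuming concurrent-extra's RWLock; all names here are made up):

import Control.Concurrent.MVar (MVar, modifyMVar, newMVar)
import qualified Control.Concurrent.ReadWriteLock as RWL
import qualified Data.Map.Strict as Map

-- One RWLock per script history, created on demand. The MVar guards only the
-- small map of locks, never the file IO itself.
type LockTable = MVar (Map.Map String RWL.RWLock)

newLockTable :: IO LockTable
newLockTable = newMVar Map.empty

lockFor :: LockTable -> String -> IO RWL.RWLock
lockFor table histId = modifyMVar table $ \locks ->
  case Map.lookup histId locks of
    Just l  -> pure (locks, l)
    Nothing -> do
      l <- RWL.new
      pure (Map.insert histId l locks, l)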

Anyway, I think it looks good, although we'd probably need to monitor whether this affects performance and adjust accordingly afterwards. Since this PR needs no migration, it should not be a problem to change the implementation in the future. At least for the selector used in Onping, it won't be an issue as there's no lock there.

@siddharth-krishna siddharth-krishna merged commit 981be9a into main Mar 1, 2023
@siddharth-krishna siddharth-krishna deleted the sidk-fix-races branch March 1, 2023 04:10