Implement/use a byte slice pool #254

Open
wants to merge 3 commits into base: master
Conversation

@kcalvinalvin (Member) commented Feb 17, 2021

There are functions in these packages that allocate and de-allocate byte
slices numerous times per function call. Using a pool for byte slices
helps reduce these allocations and speeds up the code.

Also adds helper functions for serializing and de-serializing ints to and from the byte slices returned by the pool, since functions such as binary.BigEndian.PutUint64 and binary.BigEndian.Uint64 allocate bytes inside the functions.

Also adds comments for types/functions that didn't have any.

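For reference, a minimal sketch of the general idea (the identifiers common, freeBytes, PutUint64, and PutBytes below are illustrative and may not match the names used in this PR):

package common

import (
	"encoding/binary"
	"sync"
)

// freeBytes pools 8-byte slices so hot paths can reuse them instead of
// allocating a new slice on every call.
var freeBytes = sync.Pool{
	New: func() interface{} { return make([]byte, 8) },
}

// PutUint64 grabs a pooled slice, writes v into it big-endian, and returns it.
// The caller should hand the slice back with PutBytes when done with it.
func PutUint64(v uint64) []byte {
	b := freeBytes.Get().([]byte)
	binary.BigEndian.PutUint64(b, v)
	return b
}

// PutBytes returns a slice to the pool for reuse.
func PutBytes(b []byte) {
	freeBytes.Put(b)
}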
@kcalvinalvin changed the title from "accumulator/btcacc: Use sync.Pool for byte slices in package common" to "Implement/use a byte slice pool" on Feb 17, 2021
Adds a new pool for bytes to help with GC slowdowns. The common package
is needed to avoid import cycles when putting freeBytes into the util
package.
@kcalvinalvin marked this pull request as draft February 18, 2021 04:49
@kcalvinalvin marked this pull request as ready for review February 18, 2021 06:02
@kcalvinalvin marked this pull request as draft February 18, 2021 06:26
@kcalvinalvin marked this pull request as ready for review February 18, 2021 07:06
@adiabat (Contributor) commented Feb 22, 2021

Hey - thanks for working on this, but as we talked about on the call, I'm pretty hesitant to use this. It's not actually that complex, but it is kind of weird, and people coming to the code for the first time would be like "why is it doing this weird thing". Or people working on / modifying the code later may not use it, as it's not obvious that they need to.

It seems like most of the code that gets sped-up by this is just dumb code that we should actually fix. ParentHash is one that allocates too much - it seems like we could switch to using sha512.New512_256() instead of sha512.Sum512_256 to avoid the append and new allocation there. Like this:

func parentHash(l, r Hash) (h Hash) {
	// Stream both child hashes into the hasher directly instead of
	// appending them into a freshly allocated slice first.
	s52 := sha512.New512_256()
	s52.Write(l[:])
	s52.Write(r[:])
	copy(h[:], s52.Sum(nil))
	return
}

That probably eliminates the extra allocation. Though we can probably do even better...

(I still think some kind of hash worker goroutine that gets .Reset() between uses would help, to avoid creating / destroying the hasher, which is probably a bigger allocation issue.)

Another problem mentioned was SerializeSize() for UData, which... yeah, is horrible because it just serializes the whole thing and then checks how big it was. The pool would help with this, but really the whole function needs to be fixed.
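As a purely hypothetical illustration of that fix (these are made-up types, not the real UData/LeafData definitions), the size can be summed up field by field instead of encoding the whole struct into a buffer just to measure it:

// Hypothetical types: the serialized size is computed arithmetically
// rather than by serializing into a buffer first.
type exampleLeaf struct {
	hash [32]byte
	amt  int64
}

func (l *exampleLeaf) SerializeSize() int {
	return 32 + 8 // hash + amount
}

type exampleData struct {
	height int32
	leaves []exampleLeaf
}

func (d *exampleData) SerializeSize() int {
	size := 4 // int32 height
	for i := range d.leaves {
		size += d.leaves[i].SerializeSize()
	}
	return size
}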

Similarly, leaves don't track their own hash, but they could, just like btcutil txs do.
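A rough sketch of that caching pattern (the field and method names here are made up, loosely following how btcutil caches a tx's hash):

import "crypto/sha512"

// Illustrative only: a leaf that computes its hash once and caches it.
type cachedLeaf struct {
	data     []byte
	leafHash *[32]byte // nil until the hash is first computed
}

func (l *cachedLeaf) Hash() [32]byte {
	if l.leafHash != nil {
		return *l.leafHash // reuse the cached value
	}
	h := sha512.Sum512_256(l.data)
	l.leafHash = &h
	return h
}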

So it makes sense that a slice pool will help because there's a bunch of dumb / slow code here, but I think the first thing we should do is fix the dumb / slow code, which might end up faster than the pool.

Once we fix these things, if it turns out the pool still helps significantly, then I'd be OK using one.

If there are other slow / GC- or allocation-heavy areas, please let me know / write them here and I'll take a look.

@adiabat (Contributor) commented Feb 22, 2021

Hm yeah, just thinking... even without threads, keeping everything blocking, the ParentHash function could become a method on forest or pollard, and both those structs would just have their own hash.Hash that gets used for that method. Creating a new SHA-512/256 hasher every time is probably slower than calling Reset() on one that's persistent.

Also there's all sorts of hardware-optimized hashing that I don't think we're using right now.
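A rough sketch of that idea (the struct and method names below are illustrative, and it assumes a single blocking caller as described):

import (
	"crypto/sha512"
	"hash"
)

// Illustrative only: the struct keeps one hasher and reuses it via Reset()
// instead of constructing a new SHA-512/256 state for every parent hash.
// Not safe for concurrent use.
type examplePollard struct {
	hasher hash.Hash
}

func newExamplePollard() *examplePollard {
	return &examplePollard{hasher: sha512.New512_256()}
}

func (p *examplePollard) parentHash(l, r [32]byte) (h [32]byte) {
	p.hasher.Reset() // clear state left over from the previous call
	p.hasher.Write(l[:])
	p.hasher.Write(r[:])
	copy(h[:], p.hasher.Sum(nil))
	return
}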
