Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

experimental: configure custom memory allocator #2177

Merged
merged 11 commits into from
Apr 10, 2024
Merged

experimental: configure custom memory allocator #2177

merged 11 commits into from
Apr 10, 2024

Conversation

ncruces
Copy link
Collaborator

@ncruces ncruces commented Apr 6, 2024

Still draft, missing tests, but you can see it put to good use in ncruces/go-sqlite3#71.
It passes extensive tests on Linux (amd64 native, arm64 emulated) and macOS (amd64/arm64 native), and solves a long standing issue with SQLite (on those platforms).

Happy to redesign the API, putting the PR up precisely to collect early feedback!

experimental/memory.go Outdated Show resolved Hide resolved
internal/wasm/memory.go Outdated Show resolved Hide resolved
ncruces added 5 commits April 6, 2024 13:12
Signed-off-by: Nuno Cruces <[email protected]>
Signed-off-by: Nuno Cruces <[email protected]>
Signed-off-by: Nuno Cruces <[email protected]>
Signed-off-by: Nuno Cruces <[email protected]>
@ncruces ncruces marked this pull request as ready for review April 6, 2024 12:15
@ncruces ncruces requested a review from evacchi as a code owner April 6, 2024 12:15
@ncruces ncruces marked this pull request as draft April 6, 2024 12:15
experimental/memory.go Outdated Show resolved Hide resolved
ncruces added 3 commits April 9, 2024 12:16
Signed-off-by: Nuno Cruces <[email protected]>
Signed-off-by: Nuno Cruces <[email protected]>
Signed-off-by: Nuno Cruces <[email protected]>
@ncruces
Copy link
Collaborator Author

ncruces commented Apr 9, 2024

Example implementation of the API:
https://github.com/ncruces/go-sqlite3/blob/wal/internal/util/alloc.go

internal/wasm/memory.go Outdated Show resolved Hide resolved
@ncruces ncruces marked this pull request as ready for review April 9, 2024 19:39
internal/wasm/memory.go Outdated Show resolved Hide resolved
internal/wasm/memory.go Outdated Show resolved Hide resolved
internal/wasm/memory.go Outdated Show resolved Hide resolved
Comment on lines +366 to +369
var allocator experimental.MemoryAllocator
if ctx != nil {
allocator, _ = ctx.Value(ctxkey.MemoryAllocatorKey{}).(experimental.MemoryAllocator)
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't check ctx != nil elsewhere, so please simplify as

Suggested change
var allocator experimental.MemoryAllocator
if ctx != nil {
allocator, _ = ctx.Value(ctxkey.MemoryAllocatorKey{}).(experimental.MemoryAllocator)
}
allocator := ctx.Value(ctxkey.MemoryAllocatorKey{}).(experimental.MemoryAllocator)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, there's a specific test case covering this:

for _, ctx := range []context.Context{nil, testCtx} { // Ensure it doesn't crash on nil!
moduleName := t.Name()
m, err := s.Instantiate(ctx, &Module{}, moduleName, nil, nil)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nil contexts should not be common in Go. There's even a lint to use context.Background() instead of passing nil because a lot of places assume a valid context. It may be reasonable to change this test.

@mathetake
Copy link
Member

left some comments, but I think generally this looks like I was expecting to see. Only thing left it naming nits and unit tests.

ps I think this closes #1331

Signed-off-by: Nuno Cruces <[email protected]>
Copy link
Contributor

@anuraaga anuraaga left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Besides other comments this LGTM. Thanks!

experimental/memory.go Show resolved Hide resolved
ncruces added 2 commits April 10, 2024 10:34
Signed-off-by: Nuno Cruces <[email protected]>
Signed-off-by: Nuno Cruces <[email protected]>
@ncruces
Copy link
Collaborator Author

ncruces commented Apr 10, 2024

ps I think this closes #1331

It also closes #1150, IMO.

@mathetake mathetake merged commit a0fbb18 into tetratelabs:main Apr 10, 2024
64 checks passed
This was referenced Apr 10, 2024
Comment on lines +14 to +26
type MemoryAllocator func(min, cap, max uint64) MemoryBuffer

// MemoryBuffer is a memory buffer that backs a Wasm memory.
type MemoryBuffer interface {
// Return the backing []byte for the memory buffer.
Buffer() []byte
// Grow the backing memory buffer to size bytes in length.
// To back a shared memory, Grow can't change the address
// of the backing []byte (only its length/capacity may change).
Grow(size uint64) []byte
// Free the backing memory buffer.
Free()
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apologies because I'm a little late to the party, but since this is still in the experimental package it's maybe still OK to leave feedback; a few suggestions:

  • Turn the memory allocator into an interface since it's intended to be an abstraction; single-method interfaces are a very common idiom in Go that developers are familiar with:
type MemoryAllocator interface {
  Allocate(min, cap, max uint64) MemoryBuffer
}

// We can always add a helper to implement the interface on function values.
type MemoryAllocatorFunc func(min, cap, max uint64) MemoryBuffer
  • The use of 3 uint64 as argument makes the signature error prone to programs that would mistakenly swap the position of these arguments. For example, my inuition would be for min/max to be back-to-back, I will for sure make this mistake at some point and the Go compiler cannot detect those kind of errors. One way to reduce this risk would be to introduce a configuration struct, which forces named arguments:
type MemoryConfig struct {
  _ struct{} // forces named arguments
  Min uint64
  Max uint64
  Cap uint64
}
// Allocation must always explicilty name parameters:
alloc(MemoryConfig{
  Max: 64 * 1024 * 1024,
})
  • The MemoryBuffer type sound more like an region or linear memory, we have an opportunity to improve the name to be less generic I think ("memory buffer" can describe about anything in a program), for example:
type LinearMemory interface {
  ...
}
type MemoryArena interface {
   ...
}

etc...

Copy link
Collaborator Author

@ncruces ncruces Apr 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I'm:
🤷 on the single method interface
👎 on the args struct (this is called in one place inside wazero, users implement it, don't call it)
👍 on renaming MemoryBuffer

But I'll do whatever. I'm just ecstatic that I was able to mmap files and do WAL after one year. 🥳

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah OK, your argument on the arg struct makes sense, if users won't be calling it directly the risk is low 👍

Comment on lines +20 to +22
// Grow the backing memory buffer to size bytes in length.
// To back a shared memory, Grow can't change the address
// of the backing []byte (only its length/capacity may change).
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this requirement make it impossible to implement a version of this interface backed by the Go heap?

It seems like the sliceBuffer implementation using append would violate this constraint.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No:

  1. this requirement is only for shared memories;
  2. you're allowed to pre-allocate the max to avoid relocations if you need such behavior.

Note that my mmap implementation also reserves the max address space, because it depends on memory never moving.

PS: I can make this into a package (and, I guess, support Windows) if there's interest. It has interesting benefits. Makes startup/teardown slower, but can make growing memory faster, as it avoids copies. And with the trick of allocating 8GB, we could probably disable bounds checks altogether.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The API as-is is quite easy to misuse, the interface implementation has no way of knowing if it's being called to allocated a shared memory segment or not, and the user may also not know if the underlying memory is shared-ready or not.

In my experience, this calls for separating the concepts, maybe we can even get compile-time checks to ensure that the right kind of memory is being used:

type MemoryAllocator interface {
  AllocateLinearMemory(...) LinearMemory
  AllocateSharedMemory(...) SharedMemory
}

// It could be useful for these two interfaces to share a common set of methods,
// but also differentiate so the compiler can guarantee that they're never misused.

type LinearMemory interface {
  ...
}

type SharedMemory interface {
  ...
}

Copy link
Collaborator Author

@ncruces ncruces Apr 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, point taken on “the allocator doesn't know if the memory is shared.”

I thought of adding a parameter, but then decided on just documenting. An implementor can and should specify if memories move when grown. Mine don't regardless of sharing.

The allocator is configured at module instantiation, so you should know by then if you need shared memory. Same as I know I need a memory that I can mmap but this is not compile time checked.

Also with Go duck-typing your suggested API only differentiates the “kind” of memory if we add a dummy method.

Lastly, note that shared memories can still grow, they just can't move when grown. So the API is much the same.

With a large virtual address space it's perfectly feasible to reserve N×4GB of address space without committing any of it to memory, all on a 2GB “droplet" with no swap, and then organically decide at runtime which of the N gets 512MB while the other N-1 stay under 64MB.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought of adding a parameter, but then decided on just documenting. An implementor can and should specify if memories move when grown. Mine don't regardless of sharing.

The allocator is configured at module instantiation, so you should know by then if you need shared memory. Same as I know I need a memory that I can mmap but this is not compile time checked.

That's a bit more reliance on developers "getting it right" that I'm usually comfortable with; assumptions change over time especially in an evolving world like WebAssembly, I'd err on the side of caution and figure out how to leverage the compiler and API to be more future-proof. That being said, that's not a hill I'm willing to die on either, things working is better than things being perfect, I'm just trying to emphasize that most developers aren't Nuno and will likely get things wrong if the API allows for it ;)

Also with Go duck-typing your suggested API only differentiates the “kind” of memory if we add a dummy method.

I don't know if I'm suggesting differentiating via a dummy method, I'll have to think more about it, maybe the differences between linear and shared are more profound and could be expressed in the API.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Future Nuno will thank you for it!
I'm all for expressing stuff with the type system, as long as it doesn't become clunky.

Comment on lines +18 to +19
// Return the backing []byte for the memory buffer.
Buffer() []byte
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Food for thoughts, this method could potentially be removed in favor of Grow(0).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants