Introspection (part 3) #109

chriso · 2023-11-17T23:04:42Z

Following on from #107 and #108, this PR updates the serialization layer to store the set of functions along with associated type information.

Previously, the serialization layer would store a string identifier when a function pointer was encountered, followed by additional objects if the function was a closure. The deserialization layer relies on information from the user/coroc (via RegisterClosure) as to which functions are closures and what the layout of objects looks like in memory. For an offline process to analyze durable coroutine state we need to store this information alongside the state.

This PR updates the serialization to register functions/closures (and types) as they're encountered in the object graph. Like types in #108, functions are stored in an array where the index is used as a unique identifier. When serializing functions/closures, the serialization layer now stores the function ID (the index, not the name) as function pointers are encountered, followed by closure objects. An offline process now has the same information as the original program when it comes to function types and closure memory layouts.
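To make the "index as identifier" scheme concrete, here is a minimal sketch of a function table that hands out IDs as functions are first encountered. The names `funcInfo`, `funcTable` and `register` are hypothetical and are not taken from the coroutine codebase, and the closure layout is reduced to a descriptive string for illustration.

```go
package main

import "fmt"

// funcInfo is a hypothetical record for one registered function.
type funcInfo struct {
	Name        string // symbol name of the function
	ClosureType string // description of the closure layout; empty for plain functions
}

// funcTable stores each function once; a function's index in funcs is its ID.
type funcTable struct {
	funcs []funcInfo
	ids   map[string]int
}

func newFuncTable() *funcTable {
	return &funcTable{ids: make(map[string]int)}
}

// register returns the ID for the named function, adding it to the table
// the first time it is seen.
func (ft *funcTable) register(name, closureType string) int {
	if id, ok := ft.ids[name]; ok {
		return id
	}
	id := len(ft.funcs)
	ft.funcs = append(ft.funcs, funcInfo{Name: name, ClosureType: closureType})
	ft.ids[name] = id
	return id
}

func main() {
	ft := newFuncTable()
	a := ft.register("main.fib", "")
	b := ft.register("main.fib.func1", "struct { F uintptr; X0 int }")
	c := ft.register("main.fib", "") // already registered, so the ID is reused
	fmt.Println(a, b, c, len(ft.funcs)) // prints "0 1 0 2"
}
```

Because the table travels with the serialized state, a reader only needs the ID found in the object graph to look up the name and closure layout.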


achille-roussel

👏

Following on from #109, this PR starts to break up the object graph into a form that's easier to work with. The serialization layer scans pointers to find the set of disjoint memory regions (the underlying structs/arrays). Previously, it encoded these regions inline when serializing the object graph. This PR updates the layer to store regions in separate protobuf messages, along with their type. As with the previous PRs (#108, #109), regions are stored in an array where the index is used as a unique identifier. When a pointer is encountered in the graph, we store an ID + offset pair that references a region and a particular offset within that region. We also store the "root" region, which is the memory backing whatever object was passed to `types.Serialize`. In the case of durable coroutine state, it'll be a [`*serializedCoroutine`](https://github.com/stealthrocket/coroutine/blob/main/coroutine_durable.go#L101) i.e. just a pointer to another region.
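As a rough illustration of the ID + offset scheme, the sketch below resolves a serialized pointer against an array of regions. The types `region`, `ref` and `state` and the method `resolve` are hypothetical stand-ins, not the protobuf messages or APIs from this PR.

```go
package main

import "fmt"

// region is a hypothetical stand-in for one disjoint block of memory.
type region struct {
	typ  string // type of the value stored in the region
	data []byte // raw contents of the region
}

// ref is a serialized pointer: which region, and how far into it.
type ref struct {
	regionID int
	offset   int
}

// state holds the region array; a region's index is its ID.
type state struct {
	regions []region
}

// resolve returns the bytes from the referenced offset to the end of the region.
func (s *state) resolve(p ref) ([]byte, error) {
	if p.regionID < 0 || p.regionID >= len(s.regions) {
		return nil, fmt.Errorf("unknown region %d", p.regionID)
	}
	r := s.regions[p.regionID]
	if p.offset < 0 || p.offset > len(r.data) {
		return nil, fmt.Errorf("offset %d out of range for region %d", p.offset, p.regionID)
	}
	return r.data[p.offset:], nil
}

func main() {
	s := &state{regions: []region{
		{typ: "[4]uint8", data: []byte{1, 2, 3, 4}},
		{typ: "[5]uint8", data: []byte("hello")},
	}}
	b, err := s.resolve(ref{regionID: 1, offset: 2})
	if err != nil {
		panic(err)
	}
	fmt.Println(string(b)) // prints "llo"
}
```

Storing an offset alongside the ID is what lets a pointer refer to the interior of a struct or array rather than only to its start.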

This is the final PR in the introspection series (#107, #108, #109, #111, #112, #113), allowing users to scan objects within regions.

Each region encodes a primitive value, a compound array or struct, or a map. The scanner scans the tree using a preorder traversal, providing type information, primitive values (where applicable), access to functions/regions when references are encountered, and various other type-specific helpers to access information about an object.

The scanner does not recurse when references to other regions are encountered. The user is responsible for following pointers (if desired) and keeping track of regions visited to prevent infinite loops when cycles are present in the object graph.

I fixed two inconsistencies in order to build the scanner:

* the root region now has a type of `interface{}`, since that's what `types.Serialize` actually serializes. Encoding this type (rather than the unboxed type) means we don't need a special case for the root region when scanning
* we now avoid double indirection when there's a reference to a map

<details>
<summary>Here's an example:</summary>

```go
c, err := types.Inspect(b)
if err != nil {
	panic(err)
}

// Collect the root region followed by every other region.
regions := []*types.Region{c.Root()}
for i := 0; i < c.NumRegion(); i++ {
	regions = append(regions, c.Region(i))
}

for i, r := range regions {
	fmt.Println("Reading region", i, "with type", r.Type(), "and size", r.Size())

	s := r.Scan()
	for s.Next() {
		fmt.Println("=> reading from offset", s.Pos())

		if s.Custom() {
			fmt.Println("=> read an object serialized by a custom serializer")
		}

		switch s.Kind() {
		case reflect.Bool:
			fmt.Println("=> read bool", s.Bool())
		case reflect.Int:
			fmt.Println("=> read int", s.Int())
		case reflect.Int8:
			fmt.Println("=> read int8", s.Int8())
		case reflect.Int16:
			fmt.Println("=> read int16", s.Int16())
		case reflect.Int32:
			fmt.Println("=> read int32", s.Int32())
		case reflect.Int64:
			fmt.Println("=> read int64", s.Int64())
		case reflect.Uint:
			fmt.Println("=> read uint", s.Uint())
		case reflect.Uint8:
			fmt.Println("=> read uint8", s.Uint8())
		case reflect.Uint16:
			fmt.Println("=> read uint16", s.Uint16())
		case reflect.Uint32:
			fmt.Println("=> read uint32", s.Uint32())
		case reflect.Uint64:
			fmt.Println("=> read uint64", s.Uint64())
		case reflect.Uintptr:
			fmt.Println("=> read uintptr", s.Uintptr())
		case reflect.Float32:
			fmt.Println("=> read float32", s.Float32())
		case reflect.Float64:
			fmt.Println("=> read float64", s.Float64())
		case reflect.Complex64:
			fmt.Println("=> read complex64", s.Complex64())
		case reflect.Complex128:
			fmt.Println("=> read complex128", s.Complex128())

		case reflect.Array:
			fmt.Println("=> read array of type", s.Type(), "with length", s.Len())

		case reflect.Interface:
			if s.Nil() {
				if t := s.Type(); t != nil {
					fmt.Println("=> read interface of type", t, "with nil value")
				} else {
					fmt.Println("=> read nil interface")
				}
			} else if r, off := s.Region(); r != nil {
				fmt.Println("=> read interface of type", s.Type(), "pointing to region", r, "offset", off)
			} else {
				fmt.Println("=> read interface of type", s.Type(), "with static data", s.Uint64())
			}

		case reflect.String:
			if s.Len() == 0 {
				fmt.Println("=> read string with length 0")
			} else {
				r, off := s.Region()
				fmt.Println("=> read string with length", s.Len(), "pointing to region", r, "offset", off)
			}

		case reflect.Slice:
			if r, off := s.Region(); r != nil {
				fmt.Println("=> read slice of type", s.Type(), "with length", s.Len(), "and cap", s.Cap(), "pointing to region", r, "offset", off)
			} else {
				fmt.Println("=> read nil slice of type", s.Type(), "with length", s.Len(), "and cap", s.Cap())
			}

		case reflect.Func:
			if s.Nil() {
				fmt.Println("=> read nil function with type", s.Type())
			} else if ct := s.Function().ClosureType(); ct != nil {
				fmt.Println("=> read function with type", s.Type(), "and closure layout", ct)
			} else {
				fmt.Println("=> read function with type", s.Type())
			}

		case reflect.Struct:
			fmt.Println("=> read struct of type", s.Type())

		case reflect.Pointer:
			if r, off := s.Region(); r != nil {
				fmt.Println("=> read pointer of type", s.Type(), "pointing to region", r, "offset", off)
			} else {
				fmt.Println("=> read nil pointer of type", s.Type())
			}

		case reflect.UnsafePointer:
			if r, off := s.Region(); r != nil {
				fmt.Println("=> read unsafe pointer pointing to region", r, "offset", off)
			} else {
				fmt.Println("=> read nil unsafe pointer")
			}

		default:
			panic(fmt.Sprintf("not implemented: %s", s.Kind()))
		}
	}
	if err := s.Close(); err != nil {
		panic(err)
	}
}
```

</details>
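Since the scanner leaves pointer-following to the caller, one way to avoid looping forever on a cyclic object graph is to keep a visited set keyed by region ID. The sketch below shows that idea on a simplified model; `node` and `walk` are hypothetical names, and each region is reduced to the list of region IDs it references rather than using the real scanner API.

```go
package main

import "fmt"

// node is a hypothetical region, reduced to the IDs of the regions it references.
type node struct {
	refs []int
}

// walk returns the IDs of every region reachable from root, each exactly once,
// in depth-first order. The visited set is what makes cycles safe.
func walk(regions []node, root int) []int {
	visited := make(map[int]bool)
	var order []int
	stack := []int{root}
	for len(stack) > 0 {
		id := stack[len(stack)-1]
		stack = stack[:len(stack)-1]
		if visited[id] {
			continue
		}
		visited[id] = true
		order = append(order, id)
		// Push references in reverse so that earlier references are visited first.
		for i := len(regions[id].refs) - 1; i >= 0; i-- {
			if next := regions[id].refs[i]; !visited[next] {
				stack = append(stack, next)
			}
		}
	}
	return order
}

func main() {
	regions := []node{
		{refs: []int{1, 2}}, // region 0 references regions 1 and 2
		{refs: []int{0}},    // region 1 points back at region 0 (a cycle)
		{refs: []int{2, 1}}, // region 2 references itself and region 1
		{},                  // region 3 is not reachable from region 0
	}
	fmt.Println(walk(regions, 0)) // prints "[0 1 2]"
}
```

An explicit stack is used instead of recursion so that deeply nested graphs do not grow the call stack.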

chriso added 3 commits November 18, 2023 08:26

Add a Function message

51fddcb

Store the set of functions, along with type info, separate from object graph

9776dd2

Make the casting/indirection clearer

b819413

chriso changed the title ~~Proto3~~ Introspection (part 3) Nov 17, 2023

achille-roussel approved these changes Nov 18, 2023


chriso merged commit 5760ada into main Nov 18, 2023
2 checks passed

chriso deleted the proto3 branch November 18, 2023 00:52

chriso mentioned this pull request Nov 18, 2023

Introspection (part 4) #111

Merged

chriso mentioned this pull request Nov 23, 2023

Introspection (part 7) #116

Merged
