Introspection (part 3) #109
Merged
achille-roussel approved these changes on Nov 18, 2023
👏
Merged
chriso added a commit that referenced this pull request on Nov 19, 2023
Following on from #109, this PR starts to break up the object graph into a form that's easier to work with.

The serialization layer scans pointers in order to find the set of disjoint memory regions (the underlying structs/arrays). Previously, when serializing the object graph, the serialization layer would encode these regions inline. This PR updates the layer to store regions in separate protobuf messages, along with their type.

As with the previous PRs (#108, #109), regions are stored in an array where the index is used as a unique identifier. When a pointer is encountered in the graph, we store an ID + offset pair that references a region and a particular offset within that region.

We also store the "root" region, which is the memory backing whatever object was passed to `types.Serialize`. In the case of durable coroutine state, it'll be a [`*serializedCoroutine`](https://github.com/stealthrocket/coroutine/blob/main/coroutine_durable.go#L101), i.e. just a pointer to another region.
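The ID + offset scheme described above can be illustrated with a minimal, self-contained sketch. Everything here (`Region`, `Ref`, `State`, `AddRegion`, `Resolve`) is a hypothetical stand-in invented for illustration; it is not the library's actual protobuf schema or API, and the root region is omitted for brevity.

```go
package main

import "fmt"

// Region is a hypothetical stand-in for one disjoint memory region,
// stored together with a description of its type.
type Region struct {
	Type string
	Data []byte
}

// Ref is a hypothetical encoded pointer: a region ID plus a byte offset
// within that region.
type Ref struct {
	RegionID int
	Offset   int
}

// State holds all regions. The index of a region in the slice is its ID.
type State struct {
	Regions []Region
}

// AddRegion appends a region and returns its ID (its index in the slice).
func (s *State) AddRegion(typ string, data []byte) int {
	s.Regions = append(s.Regions, Region{Type: typ, Data: data})
	return len(s.Regions) - 1
}

// Resolve returns the bytes of the referenced region starting at the
// referenced offset, or an error if the reference is out of range.
func (s *State) Resolve(r Ref) ([]byte, error) {
	if r.RegionID < 0 || r.RegionID >= len(s.Regions) {
		return nil, fmt.Errorf("unknown region %d", r.RegionID)
	}
	data := s.Regions[r.RegionID].Data
	if r.Offset < 0 || r.Offset > len(data) {
		return nil, fmt.Errorf("offset %d out of range for region %d", r.Offset, r.RegionID)
	}
	return data[r.Offset:], nil
}

func main() {
	var s State
	id := s.AddRegion("[4]uint8", []byte{10, 20, 30, 40})

	// A pointer to the third element is stored as (region ID, offset 2).
	b, err := s.Resolve(Ref{RegionID: id, Offset: 2})
	if err != nil {
		panic(err)
	}
	fmt.Println("region", id, "offset 2 holds", b[0])
}
```

Because a reference carries an offset as well as an ID, interior pointers (for example, a pointer to one element of an array) can be represented without splitting the backing region.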
Merged
chriso added a commit that referenced this pull request on Nov 23, 2023
This is the final PR in the introspection series (#107, #108, #109, #111, #112, #113), allowing users to scan objects within regions.

Each region encodes a primitive value, a compound array or struct, or a map. The scanner scans the tree using a preorder traversal, providing type information, primitive values (where applicable), access to functions/regions when references are encountered, and various other type-specific helpers to access information about an object.

The scanner does not recurse when references to other regions are encountered. The user is responsible for following pointers (if desired) and keeping track of regions visited to prevent infinite loops when cycles are present in the object graph.

I fixed two inconsistencies in order to build the scanner:

* the root region now has a type of `interface{}`, since that's what `types.Serialize` actually serializes. Encoding this type (rather than the unboxed type) means we don't need a special case for the root region when scanning
* we now avoid double indirection when there's a reference to a map

<details>
<summary>Here's an example:</summary>

```go
// Fragment: assumes b holds previously serialized state, and that
// fmt, reflect and the types package are imported.
c, err := types.Inspect(b)
if err != nil {
	panic(err)
}

// Collect the root region followed by every other region.
regions := []*types.Region{c.Root()}
for i := 0; i < c.NumRegion(); i++ {
	regions = append(regions, c.Region(i))
}

for i, r := range regions {
	fmt.Println("Reading region", i, "with type", r.Type(), "and size", r.Size())

	s := r.Scan()
	for s.Next() {
		fmt.Println("=> reading from offset", s.Pos())
		if s.Custom() {
			fmt.Println("=> read an object serialized by a custom serializer")
		}
		switch s.Kind() {
		case reflect.Bool:
			fmt.Println("=> read bool", s.Bool())
		case reflect.Int:
			fmt.Println("=> read int", s.Int())
		case reflect.Int8:
			fmt.Println("=> read int8", s.Int8())
		case reflect.Int16:
			fmt.Println("=> read int16", s.Int16())
		case reflect.Int32:
			fmt.Println("=> read int32", s.Int32())
		case reflect.Int64:
			fmt.Println("=> read int64", s.Int64())
		case reflect.Uint:
			fmt.Println("=> read uint", s.Uint())
		case reflect.Uint8:
			fmt.Println("=> read uint8", s.Uint8())
		case reflect.Uint16:
			fmt.Println("=> read uint16", s.Uint16())
		case reflect.Uint32:
			fmt.Println("=> read uint32", s.Uint32())
		case reflect.Uint64:
			fmt.Println("=> read uint64", s.Uint64())
		case reflect.Uintptr:
			fmt.Println("=> read uintptr", s.Uintptr())
		case reflect.Float32:
			fmt.Println("=> read float32", s.Float32())
		case reflect.Float64:
			fmt.Println("=> read float64", s.Float64())
		case reflect.Complex64:
			fmt.Println("=> read complex64", s.Complex64())
		case reflect.Complex128:
			fmt.Println("=> read complex128", s.Complex128())
		case reflect.Array:
			fmt.Println("=> read array of type", s.Type(), "with length", s.Len())
		case reflect.Interface:
			if s.Nil() {
				if t := s.Type(); t != nil {
					fmt.Println("=> read interface of type", t, "with nil value")
				} else {
					fmt.Println("=> read nil interface")
				}
			} else if r, off := s.Region(); r != nil {
				fmt.Println("=> read interface of type", s.Type(), "pointing to region", r, "offset", off)
			} else {
				fmt.Println("=> read interface of type", s.Type(), "with static data", s.Uint64())
			}
		case reflect.String:
			if s.Len() == 0 {
				fmt.Println("=> read string with length 0")
			} else {
				r, off := s.Region()
				fmt.Println("=> read string with length", s.Len(), "pointing to region", r, "offset", off)
			}
		case reflect.Slice:
			if r, off := s.Region(); r != nil {
				fmt.Println("=> read slice of type", s.Type(), "with length", s.Len(), "and cap", s.Cap(), "pointing to region", r, "offset", off)
			} else {
				fmt.Println("=> read nil slice of type", s.Type(), "with length", s.Len(), "and cap", s.Cap())
			}
		case reflect.Func:
			if s.Nil() {
				fmt.Println("=> read nil function with type", s.Type())
			} else if ct := s.Function().ClosureType(); ct != nil {
				fmt.Println("=> read function with type", s.Type(), "and closure layout", ct)
			} else {
				fmt.Println("=> read function with type", s.Type())
			}
		case reflect.Struct:
			fmt.Println("=> read struct of type", s.Type())
		case reflect.Pointer:
			if r, off := s.Region(); r != nil {
				fmt.Println("=> read pointer of type", s.Type(), "pointing to region", r, "offset", off)
			} else {
				fmt.Println("=> read nil pointer of type", s.Type())
			}
		case reflect.UnsafePointer:
			if r, off := s.Region(); r != nil {
				fmt.Println("=> read unsafe pointer pointing to region", r, "offset", off)
			} else {
				fmt.Println("=> read nil unsafe pointer")
			}
		default:
			panic(fmt.Sprintf("not implemented: %s", s.Kind()))
		}
	}
	if err := s.Close(); err != nil {
		panic(err)
	}
}
```

</details>
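Since the scanner leaves it to the user to follow references and guard against cycles, a caller needs some form of visited set. The following is a minimal sketch of that bookkeeping using a hypothetical `region` type that only records which region IDs it references; it does not use the real scanner API, and `walk` is an illustrative helper rather than anything the library provides.

```go
package main

import "fmt"

// region is a hypothetical stand-in: it records only the IDs of the
// regions that it references.
type region struct {
	name string
	refs []int
}

// walk visits every region reachable from start exactly once, in
// preorder, using a visited set to terminate on cyclic graphs.
func walk(regions []region, start int, visit func(id int)) {
	visited := make(map[int]bool)

	var follow func(id int)
	follow = func(id int) {
		if id < 0 || id >= len(regions) || visited[id] {
			return // unknown region, or already seen
		}
		visited[id] = true
		visit(id)
		for _, next := range regions[id].refs {
			follow(next)
		}
	}
	follow(start)
}

func main() {
	regions := []region{
		{name: "a", refs: []int{1, 2}},
		{name: "b", refs: []int{0}}, // points back at "a": a cycle
		{name: "c", refs: []int{1}},
	}
	walk(regions, 0, func(id int) {
		fmt.Println("visiting region", id, regions[id].name)
	})
}
```

Keying the visited set by region ID works here because regions are identified by their array index; a caller that cares about distinct offsets within a region would need a finer-grained key.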
Following on from #107 and #108, this PR updates the serialization layer to store the set of functions along with associated type information.
Previously, the serialization layer would store a string identifier when a function pointer was encountered, followed by additional objects if the function was a closure. The deserialization layer relies on information from the user/coroc (via `RegisterClosure`) as to which functions are closures and what the layout of objects looks like in memory. For an offline process to analyze durable coroutine state we need to store this information alongside the state.

This PR updates the serialization to register functions/closures (and types) as they're encountered in the object graph. Like types in #108, functions are stored in an array where the index is used as a unique identifier. When serializing functions/closures, the serialization layer now stores the function ID (the index, not the name) as function pointers are encountered, followed by closure objects. An offline process now has the same information as the original program when it comes to function types and closure memory layouts.