Introspection (part 3) #109

Merged
merged 3 commits into from
Nov 18, 2023
Contributor

@chriso chriso commented Nov 17, 2023

Following on from #107 and #108, this PR updates the serialization layer to store the set of functions along with associated type information.

Previously, the serialization layer stored a string identifier when it encountered a function pointer, followed by additional objects if the function was a closure. The deserialization layer relies on information from the user/coroc (via `RegisterClosure`) to know which functions are closures and how their objects are laid out in memory. For an offline process to analyze durable coroutine state, this information needs to be stored alongside the state.

This PR updates the serialization to register functions/closures (and types) as they're encountered in the object graph. Like types in #108, functions are stored in an array where the index is used as a unique identifier. When serializing functions/closures, the serialization layer now stores the function ID (the index, not the name) as function pointers are encountered, followed by closure objects. An offline process now has the same information as the original program when it comes to function types and closure memory layouts.

@chriso chriso changed the title from "Proto3" to "Introspection (part 3)" Nov 17, 2023
Contributor

@achille-roussel achille-roussel left a comment

👏

@chriso chriso merged commit 5760ada into main Nov 18, 2023
2 checks passed
@chriso chriso deleted the proto3 branch November 18, 2023 00:52
@chriso chriso mentioned this pull request Nov 18, 2023
chriso added a commit that referenced this pull request Nov 19, 2023
Following on from #109,
this PR starts to break up the object graph into a form that's easier to
work with.

The serialization layer scans pointers in order to find the set of
disjoint memory regions (the underlying structs/arrays). When
serializing the object graph, the serialization layer would encode these
regions inline. This PR updates the layer to store regions in separate
protobuf messages, along with their type. As with the previous PRs
(#108, #109), regions are stored in an array where the index is used as
a unique identifier. When a pointer is encountered in the graph, we
store an ID + offset pair to reference a region and a particular offset
within that region. We also store the "root" region which is the memory
backing whatever object was passed to `types.Serialize`. In the case of
durable coroutine state, it'll be a
[`*serializedCoroutine`](https://github.com/stealthrocket/coroutine/blob/main/coroutine_durable.go#L101)
i.e. just a pointer to another region.
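
The ID + offset encoding described above can be sketched as follows. This is an illustration under stated assumptions, not the protobuf schema from the commit: `region`, `ref`, `state` and `resolve` are hypothetical names, and region contents are modelled as raw bytes.

```go
package main

import "fmt"

// region is a hypothetical stand-in for one disjoint memory region
// (the backing storage of a struct or array) plus its type.
type region struct {
	Type string
	Data []byte
}

// ref is how a pointer is encoded in this sketch: the index of the
// target region and a byte offset within it.
type ref struct {
	RegionID int
	Offset   int
}

// state holds the regions; a region's index is its unique ID.
type state struct {
	Regions []region
}

// resolve returns the bytes that p points at, or an error if the
// reference does not land inside a known region.
func (s *state) resolve(p ref) ([]byte, error) {
	if p.RegionID < 0 || p.RegionID >= len(s.Regions) {
		return nil, fmt.Errorf("unknown region %d", p.RegionID)
	}
	data := s.Regions[p.RegionID].Data
	if p.Offset < 0 || p.Offset > len(data) {
		return nil, fmt.Errorf("offset %d out of range for region %d", p.Offset, p.RegionID)
	}
	return data[p.Offset:], nil
}

func main() {
	s := &state{Regions: []region{
		{Type: "[4]uint8", Data: []byte{10, 20, 30, 40}},
	}}
	// A pointer to the third element of the array in region 0.
	b, err := s.resolve(ref{RegionID: 0, Offset: 2})
	if err != nil {
		panic(err)
	}
	fmt.Println(b) // prints [30 40]
}
```

Storing a reference rather than inlining the target means that two pointers into the same struct or array resolve to the same region, so aliasing survives a round trip.
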
@chriso chriso mentioned this pull request Nov 23, 2023
chriso added a commit that referenced this pull request Nov 23, 2023
This is the final PR in the introspection series (#107, #108, #109,
#111, #112, #113), allowing users to scan objects within regions.

Each region encodes a primitive value, a compound array or struct, or a
map. The scanner scans the tree using a preorder traversal, providing
type information, primitive values (where applicable), access to
functions/regions when references are encountered, and various other
type-specific helpers to access information about an object. The scanner
does not recurse when references to other regions are encountered. The
user is responsible for following pointers (if desired) and keeping
track of regions visited to prevent infinite loops when cycles are
present in the object graph.

I fixed two inconsistencies in order to build the scanner:
* the root region now has a type of `interface{}`, since that's what
`types.Serialize` actually serializes. Encoding this type (rather than
the unboxed type) means we don't need a special case for the root region
when scanning
* we now avoid double indirection when there's a reference to a map

<details>
<summary>Here's an example:</summary>

```go
c, err := types.Inspect(b)
if err != nil {
	panic(err)
}

regions := []*types.Region{c.Root()}
for i := 0; i < c.NumRegion(); i++ {
	regions = append(regions, c.Region(i))
}

for i, r := range regions {
	fmt.Println("Reading region", i, "with type", r.Type(), "and size", r.Size())

	s := r.Scan()
	for s.Next() {
		fmt.Println("=> reading from offset", s.Pos())
		if s.Custom() {
			fmt.Println("=> read an object serialized by a custom serializer")
		}

		switch s.Kind() {
		case reflect.Bool:
			fmt.Println("=> read bool", s.Bool())
		case reflect.Int:
			fmt.Println("=> read int", s.Int())
		case reflect.Int8:
			fmt.Println("=> read int8", s.Int8())
		case reflect.Int16:
			fmt.Println("=> read int16", s.Int16())
		case reflect.Int32:
			fmt.Println("=> read int32", s.Int32())
		case reflect.Int64:
			fmt.Println("=> read int64", s.Int64())
		case reflect.Uint:
			fmt.Println("=> read uint", s.Uint())
		case reflect.Uint8:
			fmt.Println("=> read uint8", s.Uint8())
		case reflect.Uint16:
			fmt.Println("=> read uint16", s.Uint16())
		case reflect.Uint32:
			fmt.Println("=> read uint32", s.Uint32())
		case reflect.Uint64:
			fmt.Println("=> read uint64", s.Uint64())
		case reflect.Uintptr:
			fmt.Println("=> read uintptr", s.Uintptr())
		case reflect.Float32:
			fmt.Println("=> read float32", s.Float32())
		case reflect.Float64:
			fmt.Println("=> read float64", s.Float64())
		case reflect.Complex64:
			fmt.Println("=> read complex64", s.Complex64())
		case reflect.Complex128:
			fmt.Println("=> read complex128", s.Complex128())

		case reflect.Array:
			fmt.Println("=> read array of type", s.Type(), "with length", s.Len())

		case reflect.Interface:
			if s.Nil() {
				if t := s.Type(); t != nil {
					fmt.Println("=> read interface of type", t, "with nil value")
				} else {
					fmt.Println("=> read nil interface")
				}
			} else if r, off := s.Region(); r != nil {
				fmt.Println("=> read interface of type", s.Type(), "pointing to region", r, "offset", off)
			} else {
				fmt.Println("=> read interface of type", s.Type(), "with static data", s.Uint64())
			}

		case reflect.String:
			if s.Len() == 0 {
				fmt.Println("=> read string with length 0")
			} else {
				r, off := s.Region()
				fmt.Println("=> read string with length", s.Len(), "pointing to region", r, "offset", off)
			}

		case reflect.Slice:
			if r, off := s.Region(); r != nil {
				fmt.Println("=> read slice of type", s.Type(), "with length", s.Len(), "and cap", s.Cap(), "pointing to region", r, "offset", off)
			} else {
				fmt.Println("=> read nil slice of type", s.Type(), "with length", s.Len(), "and cap", s.Cap())
			}

		case reflect.Func:
			if s.Nil() {
				fmt.Println("=> read nil function with type", s.Type())
			} else if ct := s.Function().ClosureType(); ct != nil {
				fmt.Println("=> read function with type", s.Type(), "and closure layout", ct)
			} else {
				fmt.Println("=> read function with type", s.Type())
			}

		case reflect.Struct:
			fmt.Println("=> read struct of type", s.Type())

		case reflect.Pointer:
			if r, off := s.Region(); r != nil {
				fmt.Println("=> read pointer of type", s.Type(), "pointing to region", r, "offset", off)
			} else {
				fmt.Println("=> read nil pointer of type", s.Type())
			}

		case reflect.UnsafePointer:
			if r, off := s.Region(); r != nil {
				fmt.Println("=> read unsafe pointer pointing to region", r, "offset", off)
			} else {
				fmt.Println("=> read nil unsafe pointer")
			}

		default:
			panic(fmt.Sprintf("not implemented: %s", s.Kind()))
		}
	}
	if err := s.Close(); err != nil {
		panic(err)
	}
}
```

</details>
2 participants