Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add FieldLiteral proto message #2752

Merged
merged 23 commits into from
Feb 13, 2024
Merged

Add FieldLiteral proto message #2752

merged 23 commits into from
Feb 13, 2024

Conversation

gilwong00
Copy link
Contributor

@gilwong00 gilwong00 commented Feb 5, 2024

Adds new FieldLiteral message in the doc.proto file.
These will allows will enable custom options(extensions) to be consumed from a proto file and return in the doc service used in core.

@gilwong00 gilwong00 requested review from jhump and bufdev and removed request for bufdev February 5, 2024 21:53
Copy link
Member

@jhump jhump left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the protosource changes should probably be in their own PR. That way you can go ahead and merge that without impacting the BSR's API, and then pull those changes into core.

That would allow you to then open a PR in core that has the implementation of the new service, concurrently with the service definition PR. Maybe that's not how we usually do things, but it seems like a nice property to be able to merge the service API change and the implementation as close together as possible. Otherwise, you may want a comment on the new API options fields that they are new and may not yet be populated in any API responses. 🤷

private/pkg/protosource/field.go Outdated Show resolved Hide resolved
private/pkg/protosource/protosource.go Outdated Show resolved Hide resolved
proto/buf/alpha/registry/v1alpha1/doc.proto Outdated Show resolved Hide resolved
proto/buf/alpha/registry/v1alpha1/doc.proto Outdated Show resolved Hide resolved
@@ -281,6 +285,8 @@ message EnumValue {
// Note that any leading and trailing `//` or spaces within a `/* */` block will be stripped.
string description = 3;
EnumValueOptions enum_value_options = 4;

repeated Option options = 5;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also need a field like this above in Enum (around line 270).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aside: I just realized that we don't have a place to show file options in the current UI (which is yet another place where options show up).

I know this PR's focus is for being able to show custom options, but this same arc of work will allow us to also show "features" of an element (part of upcoming "Protobuf Editions"). And with those features, file options are relevant, since features can be defined at the file level and then apply to all elements defined in the file.

The best way to address this (once we get to a point where we want to render features for editions) will probably be some extra code in the server handler, to merge in file-level features for rendering on each element.

No action need... just thinking out loud (or "through the keyboard" perhaps)...

@@ -320,6 +326,7 @@ message Message {
MessageOptions message_options = 10;
// implicitly_deprecated is true if its enclosing file or parent element is deprecated.
bool implicitly_deprecated = 11;
repeated Option options = 12;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Curious that this contains no information about reserved ranges, reserved names, or extension ranges. Perhaps that was intentional? Or perhaps it was simply "out of scope" when the auto-generated docs were first implemented. (I wonder if it's appropriate to add a TODO about them?)

I mainly bring it up because an extension range is another thing that has options.

For now, no action needed.

proto/buf/alpha/registry/v1alpha1/doc.proto Outdated Show resolved Hide resolved
private/pkg/protosource/option_extension_descriptor.go Outdated Show resolved Hide resolved
private/pkg/protosource/option_extension_descriptor.go Outdated Show resolved Hide resolved
Comment on lines 201 to 205
ServiceOptions service_options = 8;
// implicitly_deprecated is true if its enclosing file is deprecated.
bool implicitly_deprecated = 9;

repeated Option options = 10;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see that many of the other fields also don't have doc comments, including the service_options field. But it might be nice to add something to each of these to clarify their purpose, and also to move them so they're adjacent. Something like so:

Suggested change
ServiceOptions service_options = 8;
// implicitly_deprecated is true if its enclosing file is deprecated.
bool implicitly_deprecated = 9;
repeated Option options = 10;
// implicitly_deprecated is true if its enclosing file is deprecated.
bool implicitly_deprecated = 9;
// A select sub-set of service options.
ServiceOptions service_options = 8;
// All options that are present on the service. This is a super-set of
// service_options and uses a dynamic representation so it can also
// accommodate custom options with arbitrary types.
repeated Option options = 10;

proto/buf/alpha/registry/v1alpha1/doc.proto Outdated Show resolved Hide resolved
proto/buf/alpha/registry/v1alpha1/doc.proto Outdated Show resolved Hide resolved
proto/buf/alpha/registry/v1alpha1/doc.proto Outdated Show resolved Hide resolved
@gilwong00
Copy link
Contributor Author

I think the protosource changes should probably be in their own PR. That way you can go ahead and merge that without impacting the BSR's API, and then pull those changes into core.

That would allow you to then open a PR in core that has the implementation of the new service, concurrently with the service definition PR. Maybe that's not how we usually do things, but it seems like a nice property to be able to merge the service API change and the implementation as close together as possible. Otherwise, you may want a comment on the new API options fields that they are new and may not yet be populated in any API responses. 🤷

@jhump Good call out. I removed the usage of the FieldLiteral field in the messages I initially had them. This way we can still merge all the changes to the doc.proto without altering the service. I'll make a follow up PR to reimplement the usage of FieldLiteral.

@gilwong00 gilwong00 requested a review from jhump February 5, 2024 23:51
Copy link
Member

@jhump jhump left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice. Just a few more tiny things.

proto/buf/alpha/registry/v1alpha1/doc.proto Outdated Show resolved Hide resolved
proto/buf/alpha/registry/v1alpha1/doc.proto Outdated Show resolved Hide resolved
proto/buf/alpha/registry/v1alpha1/doc.proto Outdated Show resolved Hide resolved
@gilwong00 gilwong00 requested a review from jhump February 6, 2024 18:12
@@ -377,6 +378,47 @@ message FieldOptions {
int32 jstype = 4;
}

// FieldLiteral represents a field and its value. It can be used to model descriptor options.
message FieldLiteral {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't actually see this used anywhere, where it is used?

Also how was the API created/designed?

Copy link
Member

@jhump jhump Feb 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea is to supplement the fields that look like FieldOptions field_options = 10 (across all of the descriptor types) with a field that instead looks like repeated FieldLiteral options = 11. This representation can model any option and is specifically intended for modeling custom options in a way that the frontend can render into the docs.

It's not used anywhere yet at my suggestion. As it is, this can be pulled into the other repo and the transformation from the protosource.OptionExtensionDescriptor model to the []*FieldLiteral model can be implemented and tested there w/out (yet) impacting the BSR API.

But I'm not super-familiar with the typical process of implementing BSR APIs. Is it common to update the API first and then leave this stuff in the API, but unimplemented, for however long it takes to actually implement? It seemed better to me to try to implement the server-side stuff alongside the API change and have them reviewed simultaneously (or concurrently?), to eliminate useless churn in the APIs (or introducing compatibility issues) if implementation reveals that changes are required.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm mostly interested how the specific messages were designed - are these net-new, or were they pulled from somewhere? If new-new, I need to review them more carefully. If not net-new, where did they derive from?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are net new

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jhump is there not some pre-existing art that would suffice? Thoughts:

  • One of the well-known types.
  • Just using FieldOptions straight up.

@timostamm's work on extension and proto2 support with protobuf-es might change the answer as well.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using FieldOptions (or other option types) makes consuming the value much more complicated since it will have extensions. So a consumer would need to have the schemata for all custom options in order to interpret the data.

Basically same goes for well-known types -- the closest would be google.protobuf.Value, but it represents JSON data (so does not distinguish between any of the numeric types, does not support enums except as string or int values, and supports null which isn't appropriate here). Aside from that, the only thing that can dynamically represent any kind of message is google.protobuf.Any and that has the same drawbacks as extensions: the consumer must have the schemata for everything in order to interpret this.

This is kind of like google.protobuf.Value, except it is for modeling arbitrary Protobuf values instead of JSON values. It's intended to provide a very easy way to work with the data without the schema.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While we could use the google.protobuf.*Options messages here, we would also have to ship the descriptors for the extensions and types they use. This only makes sense to me if we use descriptors for the entire package docs, replacing the current layer. This is feasible and has advantages, but it requires extra steps to split descriptor sets by package... and would obviously be a major pivot.

google.protobuf.Value cannot store the values NaN and Infinity for DOUBLE. It's unlikely that they appear in option values, but they can. It also cannot represent BYTES. This means that the data will contain strings with special meaning - base64 encoded BYTES, "NaN", "Infinity", and "-Infinity" - that the client cannot distinguish from other strings without knowing the schema, which leads to the same problem as with google.protobuf.*Options.

I agree with Josh here.

@gilwong00 gilwong00 requested a review from bufdev February 7, 2024 19:44
@bufdev
Copy link
Member

bufdev commented Feb 7, 2024

To shortchange the back and forth, I pushed up what I think this should be in 89c0eec. I updated the function name to ForEachPresentOption, and updated the docs to more closely match the style of PresentExtensionNumbers.

However, one issue - I think there's a bit of a discrepancy here in how we are referencing options vs extensions. The docs for PresentExtensionNumbers:

	// PresentExtensionNumbers returns field numbers for all options that
	// have a set value on this descriptor.

The (now updated) docs for ForEachPresentOption (which match the substance of what was there pre-my-edits):

	// ForEachPresentOption iterates through all options that have a set value on this
	// descriptor, invoking fn for each present option.

The discrepency being in one case, we're saying "extension numbers", and in the other, we're saying "options", for what (doc-wise) is obstensibly the same item. What should we do here? I think the answer is that the docs for PresentExtensionNumbers are incorrect, and need to be scoped (which matches what is going on in the function, namely a fieldDescriptor.IsExtension() call), but I'm not entirely sure.

A note: when this is merged, main will have to be merged into bufmod, and protosource has moved to bufprotosource. @gilwong00 you'll be on hook for this. https://github.com/bufbuild/buf/tree/bufmod/private/bufpkg/bufprotosource

@gilwong00 gilwong00 requested a review from jhump February 7, 2024 21:21
@jhump
Copy link
Member

jhump commented Feb 8, 2024

@bufdev, the discrepancy is intentional and kind of unavoidable because they do different things. PresentExtensionNumbers is just for extensions/custom options. This was used in combination with the other two methods I think, for getting source code info for custom options (like to do linting on protovalidate stuff I think?).

But the new method gets all options, not just extensions. This is because I think it will be best to treat all options uniformly in the generated docs, instead of having to special-case normal options vs custom options. (Especially with editions: special-casing the handling of the FeatureSet message doesn't make sense, and leaning on the same code that renders custom options makes more sense.)

@bufdev
Copy link
Member

bufdev commented Feb 8, 2024

That makes sense - can we update the comments on PresentExtensionNumbers to reflect this?

message LiteralValue {
oneof value {
string string_value = 2;
int64 int_value = 3;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does this not contain all the primitive types? This API shouldn't be frontend-specific.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated this to include all the primitive types

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is for representing values, not the schema.

Perhaps there might be a client that would want to distinguish int32 from int64 or double from float (though I can't think of any). But there's surely no need for fixed, sfixed, or sint in here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bufdev, is this API really not frontend specific? The comment on the service says otherwise:

// DocService defines a set of APIs that are intended for use by bufwebd only.

Either way, the API is undoubtedly specific to docs, and we ultimately want to present the option value as text. I cannot think of a use case where I would want to distinguish even DOUBLE from FLOAT in the context of docs, and I'd prefer if we use the widest types only. I don't think it's an issue to include all types here, just that I don't see the need.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general, APIs like this (and APIs in general) really shouldn't be designed with a specific caller in mind, but in this specific instance I don't know how much the field difference matters.

gilwong00 added a commit that referenced this pull request Feb 9, 2024
Pulling changes to `protosource` from
#2752 into its own PR.
Comment on lines 425 to 426
ArrayLiteral array_value = 16;
MessageLiteral message_value = 17;
Copy link
Member

@timostamm timostamm Feb 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Repeated fields can be specified separately or joined. The following two options are identical:

option (google.api.http) = {
  get: "/v1/{name=messages/*}"
  additional_bindings {
    get: "xxx"
  }
  additional_bindings {
    get: "yyy"
  }
};
option (google.api.http) = {
  get: "/v1/{name=messages/*}"
  additional_bindings: [
    {
      get: "xxx"
    },
    {
      get: "yyy"
    }
  ]
};

I think it would be confusing to users if they use one form, but the docs show a different form. If I'm not mistaken, we can retain the form with this model. Just commenting here as a 👍.

(I'm assuming we expect clients to render options as proto source.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it will be okay to render them uniformly instead of trying to re-create the source. And I'd suggest always doing the second form (array-literal style syntax, which will be more intuitive for most readers).

For one, I think it will look nicer if the APIs are presented consistently in documentation, instead of trying to preserve possibly idiosyncratic syntax in some sources. And two, re-creating the original source from just the descriptor (i.e. trying to re-construct formatting and such from just the source code info) is not really feasible and trying to merge actual sources with the descriptor into something that is more faithful to the original source would also be a real bear to implement.

proto/buf/alpha/registry/v1alpha1/doc.proto Outdated Show resolved Hide resolved
Comment on lines +396 to +398
// Map of field names to value. Extension field names will be in the form:
// [fully.qualified.extension.Name].
repeated FieldLiteral fields = 1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose Any will have a "synthetic" field with the type URL here?

https://protobuf.com/docs/language-spec#any-messages

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great catch!

message LiteralValue {
oneof value {
string string_value = 2;
int64 int_value = 3;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bufdev, is this API really not frontend specific? The comment on the service says otherwise:

// DocService defines a set of APIs that are intended for use by bufwebd only.

Either way, the API is undoubtedly specific to docs, and we ultimately want to present the option value as text. I cannot think of a use case where I would want to distinguish even DOUBLE from FLOAT in the context of docs, and I'd prefer if we use the widest types only. I don't think it's an issue to include all types here, just that I don't see the need.

Copy link
Member

@timostamm timostamm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO FieldLiteral is appropriate, although we should be aware that we cannot retain some details of the original form regarding formatting, numeric notation, and comments inside the option value (practical example).

I also want to point out that by modeling option values agnostically, clients that want to render them in the protobuf text format have to implement formatting for non-infinite float values and proper escaping of string and bytes. An opinionated alternative would be to provide all values as already formatted text.

Comment on lines +383 to +387
// When is_extension is true, name is the fully-qualified name of the
// extension. Otherwise, it is the simple name of the field.
string name = 1;
// The unique field number associated with the field.
int32 tag = 2;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that the fully qualified name of the extension is sufficient here.

For a practical example:

syntax = "proto3";
import "buf/validate/validate.proto";
message User {
  string username = 1 [(buf.validate.field).string = {min_len: 3}];
}

I assume we'd show the message in the docs as it appears in the source (although probably with additional line breaks), and link the option (at least the part buf.validate.field) to https://buf.build/bufbuild/protovalidate/docs/main:buf.validate#extensions.

For that to work across modules, we'll need a ImportModuleRef (assuming we want to follow the existing pattern).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, good point. For hyper-linking to the docs for an extension (or message type in an Any), the module ref will be needed.

uint64 uint_value = 4;
double double_value = 5;
bool bool_value = 6;
string enum_value = 7;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this string?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We going to be showing this in docs so the name should be more useful than the number value to users.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jhump should this semantically be ie enum_value_name_value, if that makes sense? It's not an enum, its an enum value, and its the name of the enum value...I don't know.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That seems like a lot of ceremony. It represents an enum value, so seems like a doc comment clarifying that this is the enum value name feels like it suffices. Another option is to extract out another type -- like maybe an EnumLiteral that has both a name and a number. But, honestly, that felts a little overkill 🤷

Copy link
Member

@bufdev bufdev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, for long-term maintainability here, I wish we could lean less into this as a frontend-specific API, and put more of the computation for docs into the FE going forward, i.e. I wish all that the generated documentation needed to do was query the reflection endpoint, get a FileDescriptorSet, and the frontend uses protobuf-es to do the rest. I recognize that's a major change and not in scope, so approving, but I do wonder if we're adding more pain for us down the road.

uint64 uint_value = 4;
double double_value = 5;
bool bool_value = 6;
string enum_value = 7;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jhump should this semantically be ie enum_value_name_value, if that makes sense? It's not an enum, its an enum value, and its the name of the enum value...I don't know.

@gilwong00 gilwong00 merged commit e3c2810 into main Feb 13, 2024
11 checks passed
@gilwong00 gilwong00 deleted the gwong/custom-options branch February 13, 2024 00:01
@gilwong00 gilwong00 changed the title Add options to doc service Add FieldLiteral proto message Feb 13, 2024
gilwong00 added a commit that referenced this pull request Feb 16, 2024
Part 2 of #2752. 

Adding `FieldLiteral` to `Message`, `Enum`, `Service`, `Method` and
`EnumValue` responses
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants