-
Notifications
You must be signed in to change notification settings - Fork 889
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Send raw history blobs from history service to frontend #7179
base: main
Are you sure you want to change the base?
Conversation
…istory API from history service
66cd80b
to
138b056
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know it's draft but I just started looking and had a few comments
switch data.EncodingType { | ||
case enumspb.ENCODING_TYPE_PROTO3: | ||
err = proto.UnmarshalOptions{ | ||
DiscardUnknown: true, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe add a comment here that this option is key to the performance gains
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
@@ -65,3 +66,28 @@ message TaskRange { | |||
TaskKey inclusive_min_task_key = 1; | |||
TaskKey exclusive_max_task_key = 2; | |||
} | |||
|
|||
message StrippedHistoryEvent { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a comment that this is a subset of HistoryEvent
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
message GetWorkflowExecutionHistoryResponse { | ||
repeated bytes history = 1; | ||
// Raw history is an alternate representation of history that may be returned if configured on | ||
// the frontend. This is not supported by all SDKs. Either this or `history` will be set. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove the comment about supported by SDKs? that's not relevant here, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah right. Removed it.
@@ -981,6 +981,10 @@ message GetWorkflowExecutionHistoryResponse { | |||
temporal.api.workflowservice.v1.GetWorkflowExecutionHistoryResponse response = 1; | |||
} | |||
|
|||
message GetWorkflowExecutionHistoryResponseWithRaw { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add a comment that this is wire-compatible with GetWorkflowExecutionHistoryResponse?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
What changed?
Change to send raw history blobs from history service to frontend service. History service returns a new proto message that has a repeated bytes history field.
This response is wire compatible with the original response which has temporal.api.history.v1.History type for this field. This allows history service to not deserialize events from this data blob. This considerably reduces CPU usage.
History service still needs event_id and version decoded from history events. For this we use a new proto message StrippedHistoryEvent which has these two fields only. It takes considerably less CPU to decode events to this struct.
Why?
We have seen incidents of high history CPU usage when large number of GetWorkflowExecutionHistory calls are made to workflows which has large history. With this change we can reduce the CPU burden on history service during this API call.
How did you test it?
Existing unit tests and manual test to run workflows.
Potential risks
Documentation
Is hotfix candidate?
No