Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix hashing step ids in loops #72

Merged
merged 4 commits into from
Sep 13, 2024
Merged

Conversation

albertchae
Copy link
Collaborator

@albertchae albertchae commented Sep 6, 2024

Summary

Per https://github.com/inngest/inngest/blob/main/docs/SDK_SPEC.md#512-ids-and-hashing,
add :n starting with :1 for repeated instances of a step id

Refactored to be a combination of golang and inngest-js implementation to handle edge case of user defined stepId colliding

The inngest-js SDK currently optionally warns of parallel indexing, but
this isn't in scope for beta so I left it out.

Checklist

  • Update documentation N/A documented by sdk spec
  • Added unit/integration tests

Related

Copy link

linear bot commented Sep 6, 2024

@albertchae albertchae force-pushed the albert/INN-3329-hash-steps-loop branch from 10f39be to b4b2f5b Compare September 6, 2024 06:18
Comment on lines 35 to 33
while (true) {
possibleStepId = "$id:$stepNumber"
if (possibleStepId !in stepIds) {
break
}
stepNumber++
}
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure if there's a better way to find the next available stepNumber but looks like this hasn't been a problem in the inngest-js SDK yet so leaving it as is for now

Copy link
Contributor

@darwin67 darwin67 Sep 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

doing a loop every time to find the next possible one might not be the most efficient. probably having a map or something of the kind to track state temporarily as it goes through the existing state would be ideal

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cool, that's the algorithm I was envisioning anyway. updated to store count in a map

int runningCount = 10;
for (int i = 0; i < 5; i++) {
int effectivelyFinalVariableForLambda = runningCount;
runningCount = step.run("add-ten", () -> effectivelyFinalVariableForLambda + 10, Integer.class);
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this would not compile if I tried to use runningCount directly in the lambda

RunEntry<Object> loopRun = devServer.runsByEvent(loopEvent).first();
assertEquals("Completed", loopRun.getStatus());

assertEquals(60, loopRun.getOutput());
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

was returning 20 before fix to the SDK

@albertchae albertchae requested a review from KiKoS0 September 6, 2024 06:21
@albertchae albertchae marked this pull request as ready for review September 6, 2024 06:21

// use the seen count so far for current step, increment for next time
val stepNumber = stepIdsToSeenCount[id]
stepIdsToSeenCount[id] = stepIdsToSeenCount.getValue(id) + 1
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

stepIdsToSeenCount[id]++ does not compile because the return value on [] map access is T?, even with the !in check above 😭

https://blog.danlew.net/2017/06/14/convincing-the-kotlin-compiler-that-code-is-safe/

@albertchae albertchae force-pushed the albert/INN-3329-hash-steps-loop branch from f8e69f7 to 7bf72a1 Compare September 6, 2024 14:31
val stepNumber = stepIdsToSeenCount[id]
stepIdsToSeenCount[id] = stepIdsToSeenCount.getValue(id) + 1

return "$id:$stepNumber"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We may also need a check here to see is this step ID already exists.

A user could do something like:

  • Run "my-step"
  • Run "my-step:1" (user explicitly specifying this string)
  • Run a loop of "my-step" steps

I think here this would result in us redeclaring "my-step:1" in the first iteration of the loop even though we'd like to skip it and go straight to "my-step:2".

These IDs just being strings means a user can accidentally stumble into our [ID]:[count] format and break some stuff. 😄

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK pushed a commit to combine both a hash for O(1) in most cases and a loop afterwards just in case a user used that stepId already. Correct me if I'm wrong but is this a bug in the Go SDK then https://github.com/inngest/inngestgo/blob/0a00daba0b2db68ff0f080f787cf63f0a63b44d8/internal/sdkrequest/manager.go#L123-L131 @darwin67 ?

This seems like it could potentially be worth reserving some delimiter characters for metadata if Inngest has other cases where it would want to modify the user provided stepId.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That works! Thank you.

Mm there's definitely some silly edge case that would be unlikely to hit where users are utilizing *:n step IDs explicitly, but that exists everywhere.

Looks like that'd be a bug in Go too, aye. Long-term we can start to shift this over to something safer; it'd be great to not be directly influencing the ID internally for the hash, but requires a versioned change across SDKs.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought of another edge case where the user could reuse a my-step:n name after a loop that used it, so I updated the test and logic to handle that too

Per https://github.com/inngest/inngest/blob/main/docs/SDK_SPEC.md#512-ids-and-hashing,
add `:n` starting with `:1` for repeated instances of a step id

Mostly similar to inngest-js implementation
https://github.com/inngest/inngest-js/blob/79069e1a3d700624ce49b323922c113fc952bcc6/packages/inngest/src/components/execution/v1.ts#L819-L831

The inngest-js SDK currently optionally warns of parallel indexing, but
this isn't in scope for beta so I left it out
The hash means we don't have to loop from 0 every time and for most
cases will just correctly return us the next unused stepId, but looping
afterwards guarantees we don't collide with a user defined stepId.

So this will be O(1) in most cases and potentially O(n) for pathological
functions that have many steps manually named with the `:n` suffix
@albertchae albertchae force-pushed the albert/INN-3329-hash-steps-loop branch from 7bf72a1 to 43373c8 Compare September 12, 2024 04:25
val stepNumber = stepIdsToSeenCount[id]
stepIdsToSeenCount[id] = stepIdsToSeenCount.getValue(id) + 1

return "$id:$stepNumber"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That works! Thank you.

Mm there's definitely some silly edge case that would be unlikely to hit where users are utilizing *:n step IDs explicitly, but that exists everywhere.

Looks like that'd be a bug in Go too, aye. Long-term we can start to shift this over to something safer; it'd be great to not be directly influencing the ID internally for the hash, but requires a versioned change across SDKs.

@albertchae albertchae merged commit 5faeb64 into main Sep 13, 2024
9 checks passed
@albertchae albertchae deleted the albert/INN-3329-hash-steps-loop branch September 13, 2024 00:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants