-
Notifications
You must be signed in to change notification settings - Fork 108
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Limit API calls for aws_sfn_state_machine_execution_history
and ignore ExecutionDoesNotExist
errors
#1684
Limit API calls for aws_sfn_state_machine_execution_history
and ignore ExecutionDoesNotExist
errors
#1684
Conversation
16316c4
to
c312371
Compare
This PR needs more work before being complete. |
2a79a07
to
18fb38e
Compare
aws_sfn_state_machine_execution_history
execution_arn
a required key columnaws_sfn_state_machine_execution_history
and ignore ExecutionDoesNotExist
errors
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pdecat Thanks for opening this PR with such thoughtful changes and considerations! I've left a few questions and suggestions, can you please have a look?
if !strings.Contains(fmt.Sprint(getListValues(equalQuals["state_machine_arn"].GetListValue())), *stateMachineArn) { | ||
return nil, nil | ||
} | ||
stateMachineArnQuals := getQualsValueByColumn(d.Quals, "state_machine_arn", "string") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Were there any functional differences with this change, or more stylistic changes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As it uses d.Quals
instead of d.EqualsQuals
, I believe it also makes queries not using (edit: on second thought, =
but in
more performantin
queries are probably processed as several =
queries):
select * from aws_sfn_state_machine_execution
where state_machine_arn in (
select state_machine_arn
from aws_sfn_state_machine
where name = 'mystatemachine'
)
See
- https://github.com/turbot/steampipe-plugin-sdk/blob/2b7903e9b3d0aee3b456819a8896d1fd4d4c73ce/plugin/query_data.go#L393
- https://github.com/turbot/steampipe-plugin-sdk/blob/49556d376edd992a9ff0cb60485bd50c098604ac/plugin/key_column_qual_map.go#L22
- https://github.com/turbot/steampipe-plugin-sdk/blob/d4126181dfc96bcfc3014262326304ed9c2a1401/plugin/key_column_qual.go#L55
- https://github.com/turbot/steampipe-plugin-sdk/blob/49556d376edd992a9ff0cb60485bd50c098604ac/plugin/quals/qual.go#L58
On second thought, this may cause issues if other operators than =
and in
are used, e.g. like
as the string value wouldn't match the checks done afterward. This is because getQualsValueByColumndoes not consider the operator at all for
string` type.
I will do more testing on that.
@@ -287,8 +293,22 @@ func listStepFunctionsStateMachineExecutionHistories(ctx context.Context, d *plu | |||
executionCh := make(chan []historyInfo, len(executions)) | |||
errorCh := make(chan error, len(executions)) | |||
|
|||
// Iterating all the available executions | |||
// Iterating all the available executions matching the query quals, if any |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we're given an execution_arn
key qual, I wonder if this function even needs to call the ListExecutions
API method?
Since we'd have a specific execution_arn
when we enter this function, we could instead just call GetExecutionHistory
once, passing in the given execution_arn
and then returning the history data (or 0 rows if there is no history data).
If we don't get an execution_arn
, then we would rely on listing all executions and then getting the histories for each of them (like the function does today).
This isn't normally how we'd want to do it, but due to the lack of nested parent hydrates, this may be a way to reduce the number of API calls we do.
@pdecat Does this approach seem like something that would work and cut down on total number of requests when execution_arn
is passed in?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point! I've implemented what you suggested, PTAL.
baf90f8
to
f685fb4
Compare
Hey @pdecat , I see a lot of commits have been pushed last week, just wanted to check in if you feel the PR is ready for review again? Or are there additional changes you're looking to make? Thanks! |
Hi @cbruno10, I indeed pushed some fixes to make it usable, but did not find time to fully test it. One controversial change that would be worth reviewing anyway is my last commit where I disable parallelization of requests as the AWS API for retrieving history execution events heavily rate limits requests otherwise. |
…by filtering on execution_arn
Co-authored-by: cbruno10 <[email protected]>
…rt more than 1000 history events with pagination
…s just unfit for concurrent requests, resulting in never ending rate limiting
427f002
to
d62ae20
Compare
@pdecat is this PR ready for final review? Excited to get the changes to the hub 👍. |
Hi @misraved ,
Sadly not 😕 Querying Step Functions execution history somehow works a bit better with this PR for me, but I still haven't managed execute my complex queries against our large dataset without hitting 429 rate limiting errors. Feel free to test or hack with it if you desire. Things I figured that could help IMO are addressing turbot/steampipe-plugin-sdk#194 and turbot/steampipe-plugin-sdk#394 |
Hey @pdecat, I was testing through this table and found a small issue with I believe this could be because the API getting called multiple times using the same |
Hey @pdecat , we're still interested in improving the If/when you're ready for another review, feel free to re-open it, thanks! |
Hi @cbruno10, I'll probably revisit this once turbot/steampipe-plugin-sdk#618 is merged and released. |
In its current state, the aws_sfn_state_machine_execution_history table is unusable for us not matter what criteria are used in queries because it tries to load all execution events from all executions of all state machines, including expired executions, which results in errors, e.g.:
This PR:
aws_sfn_state_machine_execution_history
by filtering onexecution_arn
Ideally, the list hydrate function of the
aws_sfn_state_machine_execution_history
table should depend on the the list hydrate function of theaws_sfn_state_machine_execution
table instead of the one of theaws_sfn_state_machine
table, butParentHydrate
functions cannot be chained across more than two tables right now.Slack thread: https://steampipe.slack.com/archives/C044P668806/p1680791390660499?thread_ts=1680791226.554809&cid=C044P668806
TODO:
aws_sfn_state_machine_*
#1686 is mergedexecution_arn
used inin
queries withgetListValues()
, e.g.:Integration test logs
Logs
Example query results
Results