Limit API calls for aws_sfn_state_machine_execution_history and ignore ExecutionDoesNotExist errors #1684

Closed


pdecat
Contributor

@pdecat pdecat commented Apr 6, 2023

In its current state, the aws_sfn_state_machine_execution_history table is unusable for us, no matter what criteria are used in queries, because it tries to load all execution events from all executions of all state machines, including expired executions, which results in errors, e.g.:

2023-04-06 10:21:11.816 UTC [ERROR] steampipe-plugin-aws.plugin: [ERROR] 1680776471843: aws_sfn_state_machine_execution_history.getRowDataForExecutionHistory: api_error="operation error SFN: GetExecutionHistory, https response error StatusCode: 400, RequestID: d118ab6d-033c-453c-abdc-6e52142cac74, ExecutionDoesNotExist: Execution Does Not Exist: 'arn:aws:states:eu-west-3:****:execution:mysfn:eb3a80d6-7efe-4900-8ae8-fce10c75e52d'"

This PR:

  • ignores expired executions for which history is no longer available
  • reduces AWS API calls made by aws_sfn_state_machine_execution_history by filtering on execution_arn
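
A minimal sketch of the first point, assuming the history call surfaces a typed "execution does not exist" error that can be matched with errors.As. The executionDoesNotExist type, fetchHistory, and historyOrEmpty are hypothetical stand-ins so the sketch compiles without the AWS SDK; they are not the plugin's actual code.

```go
package main

import (
	"errors"
	"fmt"
)

// executionDoesNotExist is a local stand-in for the SDK error type
// (an assumption, used so the sketch has no external dependencies).
type executionDoesNotExist struct{ arn string }

func (e *executionDoesNotExist) Error() string {
	return "ExecutionDoesNotExist: Execution Does Not Exist: '" + e.arn + "'"
}

// fetchHistory is a hypothetical fetcher: it returns event names for an
// execution, or a wrapped executionDoesNotExist error for expired ones.
func fetchHistory(arn string, expired map[string]bool) ([]string, error) {
	if expired[arn] {
		return nil, fmt.Errorf("operation error SFN: GetExecutionHistory: %w",
			&executionDoesNotExist{arn: arn})
	}
	return []string{"ExecutionStarted", "ExecutionSucceeded"}, nil
}

// historyOrEmpty treats "execution does not exist" as zero rows and
// passes every other error through unchanged.
func historyOrEmpty(arn string, expired map[string]bool) ([]string, error) {
	events, err := fetchHistory(arn, expired)
	var notFound *executionDoesNotExist
	if errors.As(err, &notFound) {
		return nil, nil
	}
	return events, err
}

func main() {
	expired := map[string]bool{"arn:expired": true}
	for _, arn := range []string{"arn:live", "arn:expired"} {
		events, err := historyOrEmpty(arn, expired)
		fmt.Println(arn, len(events), err)
	}
}
```

Matching on the error type rather than on the message text keeps unrelated 400 responses visible to the caller.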

Ideally, the list hydrate function of the aws_sfn_state_machine_execution_history table should depend on the list hydrate function of the aws_sfn_state_machine_execution table rather than on that of the aws_sfn_state_machine table, but ParentHydrate functions cannot be chained across more than two tables right now.

Slack thread: https://steampipe.slack.com/archives/C044P668806/p1680791390660499?thread_ts=1680791226.554809&cid=C044P668806

TODO:

with constants (r, s_arn, e_arn) as (
    values (
            'eu-west-3',
            'arn:aws:states:eu-west-3:123456789012:stateMachine:mysfn',
            'arn:aws:states:eu-west-3:123456789012:execution:mysfn:map-%'
        )
)
select id,
    execution_arn,
    execution_failed_event_details
from aws_sfn_state_machine_execution_history,
    constants
where region = r
    and execution_arn in (
        select execution_arn
        from aws_sfn_state_machine_execution,
            constants
        where region = r
            and state_machine_arn = s_arn
            and execution_arn like e_arn
    )
    and execution_failed_event_details is not null;



@pdecat pdecat force-pushed the fix_sfn_state_machine_execution_history branch from 16316c4 to c312371 Compare April 7, 2023 07:58
@pdecat
Contributor Author

pdecat commented Apr 7, 2023

This PR needs more work before being complete.
Moved the first two commits into #1686 which should be mergeable faster.

@pdecat pdecat force-pushed the fix_sfn_state_machine_execution_history branch 2 times, most recently from 2a79a07 to 18fb38e Compare April 7, 2023 16:19
@pdecat pdecat marked this pull request as ready for review April 7, 2023 16:22
@pdecat pdecat changed the title from "Make aws_sfn_state_machine_execution_history execution_arn a required key column" to "Limit API calls for aws_sfn_state_machine_execution_history and ignore ExecutionDoesNotExist errors" Apr 7, 2023
Contributor

@cbruno10 cbruno10 left a comment

@pdecat Thanks for opening this PR with such thoughtful changes and considerations! I've left a few questions and suggestions, can you please have a look?

- if !strings.Contains(fmt.Sprint(getListValues(equalQuals["state_machine_arn"].GetListValue())), *stateMachineArn) {
- return nil, nil
- }
+ stateMachineArnQuals := getQualsValueByColumn(d.Quals, "state_machine_arn", "string")
Contributor

Were there any functional differences with this change, or more stylistic changes?

Contributor Author

@pdecat pdecat Apr 8, 2023

As it uses d.Quals instead of d.EqualsQuals, I believe it also makes queries that use in rather than = more performant (edit: on second thought, in queries are probably processed as several = queries):

select * from aws_sfn_state_machine_execution
where state_machine_arn in (
  select state_machine_arn
  from aws_sfn_state_machine
  where name = 'mystatemachine'
)

See

On second thought, this may cause issues if operators other than = and in are used, e.g. like, as the string value wouldn't match the checks done afterward. This is because getQualsValueByColumn does not consider the operator at all for the string type.

I will do more testing on that.
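
One way to avoid that mismatch, sketched under assumptions: only values from = quals are used for the exact-match check, and any other operator means nothing is skipped. The qual struct, the operator strings, equalityValues, and shouldQuery are hypothetical simplifications and not the SDK's types or the plugin's helpers.

```go
package main

import "fmt"

// qual is a hypothetical, simplified stand-in for a query qualifier.
type qual struct {
	Column   string
	Operator string // illustrative operator labels such as "=" or "like"
	Value    string
}

// equalityValues returns the values of the quals on column that use "=".
// Values attached to other operators (for example a like pattern) are
// ignored because they cannot be compared as literal strings.
func equalityValues(quals []qual, column string) []string {
	var values []string
	for _, q := range quals {
		if q.Column == column && q.Operator == "=" {
			values = append(values, q.Value)
		}
	}
	return values
}

// shouldQuery reports whether arn must be queried: always when there is
// no equality filter, otherwise only when arn is one of the listed values.
func shouldQuery(quals []qual, column, arn string) bool {
	values := equalityValues(quals, column)
	if len(values) == 0 {
		return true // no usable filter, so nothing can safely be skipped
	}
	for _, v := range values {
		if v == arn {
			return true
		}
	}
	return false
}

func main() {
	likeOnly := []qual{{Column: "state_machine_arn", Operator: "like", Value: "arn:sm:my%"}}
	fmt.Println(shouldQuery(likeOnly, "state_machine_arn", "arn:sm:mysfn"))
}
```

With this shape, a like pattern degrades to "query everything and let the database filter" instead of silently returning no rows.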

@@ -287,8 +293,22 @@ func listStepFunctionsStateMachineExecutionHistories(ctx context.Context, d *plu
executionCh := make(chan []historyInfo, len(executions))
errorCh := make(chan error, len(executions))

- // Iterating all the available executions
+ // Iterating all the available executions matching the query quals, if any
Contributor

If we're given an execution_arn key qual, I wonder if this function even needs to call the ListExecutions API method?

Since we'd have a specific execution_arn when we enter this function, we could instead just call GetExecutionHistory once, passing in the given execution_arn and then returning the history data (or 0 rows if there is no history data).

If we don't get an execution_arn, then we would rely on listing all executions and then getting the histories for each of them (like the function does today).

This isn't normally how we'd want to do it, but due to the lack of nested parent hydrates, this may be a way to reduce the number of API calls we do.

@pdecat Does this approach seem like something that would work and cut down on total number of requests when execution_arn is passed in?
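
The suggested control flow can be sketched as follows, assuming a small interface in front of the two API calls. historyAPI, fakeAPI, and listHistories are hypothetical names for illustration; the real calls take contexts and input structs, which are omitted here.

```go
package main

import "fmt"

// historyAPI is a hypothetical interface standing in for the two
// Step Functions calls discussed above.
type historyAPI interface {
	ListExecutions(stateMachineArn string) []string
	GetExecutionHistory(executionArn string) []string
}

// fakeAPI is an in-memory implementation that counts calls.
type fakeAPI struct {
	executions   map[string][]string // state machine ARN -> execution ARNs
	events       map[string][]string // execution ARN -> event names
	listCalls    int
	historyCalls int
}

func (f *fakeAPI) ListExecutions(sm string) []string {
	f.listCalls++
	return f.executions[sm]
}

func (f *fakeAPI) GetExecutionHistory(e string) []string {
	f.historyCalls++
	return f.events[e]
}

// listHistories skips ListExecutions entirely when an execution ARN is
// given, and otherwise falls back to listing every execution.
func listHistories(api historyAPI, stateMachineArn, executionArn string) []string {
	if executionArn != "" {
		return api.GetExecutionHistory(executionArn)
	}
	var all []string
	for _, e := range api.ListExecutions(stateMachineArn) {
		all = append(all, api.GetExecutionHistory(e)...)
	}
	return all
}

func main() {
	api := &fakeAPI{
		executions: map[string][]string{"sm": {"e1", "e2"}},
		events:     map[string][]string{"e1": {"a"}, "e2": {"b", "c"}},
	}
	rows := listHistories(api, "sm", "e2")
	fmt.Println(len(rows), api.listCalls, api.historyCalls)
}
```

In this sketch the request count drops from one list call plus one history call per execution to a single history call when execution_arn is supplied.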

Contributor Author

Good point! I've implemented what you suggested, PTAL.

@pdecat pdecat force-pushed the fix_sfn_state_machine_execution_history branch 2 times, most recently from baf90f8 to f685fb4 Compare April 11, 2023 14:09
@cbruno10
Contributor

Hey @pdecat , I see a lot of commits have been pushed last week, just wanted to check in if you feel the PR is ready for review again? Or are there additional changes you're looking to make? Thanks!

@pdecat
Contributor Author

pdecat commented Apr 17, 2023

Hi @cbruno10, I indeed pushed some fixes to make it usable, but did not find time to fully test it.

One controversial change that would be worth reviewing anyway is my last commit, where I disable parallelization of requests, as the AWS API for retrieving execution history events otherwise heavily rate limits requests.
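
For discussion, a sketch of a middle ground between full fan-out and strictly sequential requests: a configurable cap on in-flight calls, where a cap of 1 behaves like the sequential version. fetchAll and its parameters are hypothetical and not what the commit does.

```go
package main

import (
	"fmt"
	"sync"
)

// fetchAll calls fetch for every ARN with at most maxParallel calls in
// flight. A value of 1 (or less) makes the requests strictly sequential.
func fetchAll(arns []string, maxParallel int, fetch func(arn string) []string) map[string][]string {
	if maxParallel < 1 {
		maxParallel = 1
	}
	sem := make(chan struct{}, maxParallel) // counting semaphore
	var mu sync.Mutex
	var wg sync.WaitGroup
	results := make(map[string][]string, len(arns))

	for _, arn := range arns {
		wg.Add(1)
		sem <- struct{}{} // blocks while maxParallel calls are running
		go func(arn string) {
			defer wg.Done()
			defer func() { <-sem }()
			events := fetch(arn)
			mu.Lock()
			results[arn] = events
			mu.Unlock()
		}(arn)
	}
	wg.Wait()
	return results
}

func main() {
	fetch := func(arn string) []string { return []string{arn + ":ExecutionStarted"} }
	results := fetchAll([]string{"e1", "e2", "e3"}, 1, fetch)
	fmt.Println(len(results))
}
```

A cap alone does not guarantee staying under the API's rate limit; it only bounds how many requests can be outstanding at once.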

@pdecat pdecat force-pushed the fix_sfn_state_machine_execution_history branch from 427f002 to d62ae20 Compare May 17, 2023 08:55
@pdecat pdecat marked this pull request as draft May 17, 2023 13:46
@misraved
Contributor

@pdecat is this PR ready for final review?

Excited to get the changes to the hub 👍.

@pdecat
Contributor Author

pdecat commented May 26, 2023

Hi @misraved ,

@pdecat is this PR ready for final review?

Sadly not 😕

Querying Step Functions execution history somehow works a bit better with this PR for me, but I still haven't managed to execute my complex queries against our large dataset without hitting 429 rate limiting errors.
I've set aside work on this, so I marked it as draft for now.

Feel free to test or hack with it if you desire.

Two things that I think could help are addressing turbot/steampipe-plugin-sdk#194 and turbot/steampipe-plugin-sdk#394.

@misraved misraved requested a review from ParthaI July 3, 2023 10:55
@karanpopat
Contributor

karanpopat commented Jul 10, 2023

Hey @pdecat, I was testing through this table and found a small issue with ListConfig.
When passing a value to the key column execution_arn here, it returns duplicate rows. The number of duplicate rows I observed was equal to the count of aws_sfn_state_machine_execution resources.

I believe this could be because the API is getting called multiple times using the same execution_arn while looping through each parent hydrate item.
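
One possible guard, sketched under assumptions: derive the owning state machine ARN from the execution ARN and fetch the history only for the matching parent item. It covers only the standard ARN shape shown in the PR description (arn:aws:states:region:account:execution:machine:name); stateMachineArnFor and rowsForParents are hypothetical helpers.

```go
package main

import (
	"fmt"
	"strings"
)

// stateMachineArnFor derives the parent state machine ARN from an
// execution ARN of the form
// arn:aws:states:<region>:<account>:execution:<machine>:<name>.
// It returns false for any other shape.
func stateMachineArnFor(executionArn string) (string, bool) {
	parts := strings.SplitN(executionArn, ":", 8)
	if len(parts) != 8 || parts[5] != "execution" {
		return "", false
	}
	parts[5] = "stateMachine"
	return strings.Join(parts[:7], ":"), true
}

// rowsForParents fetches the history once, for the parent that owns the
// execution, instead of once per parent item.
func rowsForParents(parents []string, executionArn string, fetch func(string) []string) []string {
	owner, ok := stateMachineArnFor(executionArn)
	if !ok {
		return fetch(executionArn) // unknown ARN shape: still fetch only once
	}
	for _, parent := range parents {
		if parent == owner {
			return fetch(executionArn)
		}
	}
	return nil // the owning state machine is not among the parents
}

func main() {
	parents := []string{
		"arn:aws:states:eu-west-3:123456789012:stateMachine:other",
		"arn:aws:states:eu-west-3:123456789012:stateMachine:mysfn",
	}
	fetch := func(arn string) []string { return []string{"ExecutionStarted", "ExecutionFailed"} }
	rows := rowsForParents(parents, "arn:aws:states:eu-west-3:123456789012:execution:mysfn:map-1", fetch)
	fmt.Println(len(rows))
}
```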

@cbruno10
Contributor

Hey @pdecat , we're still interested in improving the aws_sfn_state_machine_execution_history table and discussing the approach more. This PR has been open for a while though, so we're going to close it, but happy to continue the discussion in it.

If/when you're ready for another review, feel free to re-open it, thanks!

@cbruno10 cbruno10 closed this Aug 15, 2023
@pdecat
Contributor Author

pdecat commented Aug 16, 2023

Hi @cbruno10, I'll probably revisit this once turbot/steampipe-plugin-sdk#618 is merged and released.

4 participants