Limit API calls for `aws_sfn_state_machine_execution_history` and ignore `ExecutionDoesNotExist` errors #1684

pdecat · 2023-04-06T16:01:28Z

In its current state, the aws_sfn_state_machine_execution_history table is unusable for us no matter what criteria are used in queries, because it tries to load all execution events from all executions of all state machines, including expired executions, which results in errors, e.g.:

2023-04-06 10:21:11.816 UTC [ERROR] steampipe-plugin-aws.plugin: [ERROR] 1680776471843: aws_sfn_state_machine_execution_history.getRowDataForExecutionHistory: api_error="operation error SFN: GetExecutionHistory, https response error StatusCode: 400, RequestID: d118ab6d-033c-453c-abdc-6e52142cac74, ExecutionDoesNotExist: Execution Does Not Exist: 'arn:aws:states:eu-west-3:****:execution:mysfn:eb3a80d6-7efe-4900-8ae8-fce10c75e52d'"

This PR:

ignores expired executions for which history is no longer available
reduces AWS API calls made by aws_sfn_state_machine_execution_history by filtering on execution_arn
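The first point can be illustrated with a minimal sketch. It is not the code from this PR: `apiError`, `fakeAPIError` and `getHistory` are hypothetical names declared locally so the example runs without the AWS SDK. The only assumption about the real SDK is that its API errors expose an `ErrorCode()` accessor, which is how the `ExecutionDoesNotExist` code from the log line above would be detected.

```go
package main

import (
	"errors"
	"fmt"
)

// apiError mirrors the ErrorCode() accessor that AWS SDK for Go v2 API errors
// are assumed to expose; it is declared locally to keep the sketch dependency-free.
type apiError interface {
	error
	ErrorCode() string
}

// fakeAPIError is a hypothetical stand-in for an SDK error value.
type fakeAPIError struct{ code, msg string }

func (e *fakeAPIError) Error() string     { return e.code + ": " + e.msg }
func (e *fakeAPIError) ErrorCode() string { return e.code }

// isExecutionDoesNotExist reports whether err, or any error it wraps,
// carries the ExecutionDoesNotExist error code.
func isExecutionDoesNotExist(err error) bool {
	var ae apiError
	return errors.As(err, &ae) && ae.ErrorCode() == "ExecutionDoesNotExist"
}

// getHistory calls fetch and treats an expired execution as "no rows"
// instead of failing the whole query. Any other error is returned as-is.
func getHistory(fetch func(arn string) ([]string, error), arn string) ([]string, error) {
	events, err := fetch(arn)
	if err != nil {
		if isExecutionDoesNotExist(err) {
			return nil, nil
		}
		return nil, err
	}
	return events, nil
}

func main() {
	// Simulate the SDK wrapping the service error, as in the log line above.
	expired := fmt.Errorf("operation error SFN: GetExecutionHistory: %w",
		&fakeAPIError{code: "ExecutionDoesNotExist", msg: "Execution Does Not Exist"})
	fetch := func(arn string) ([]string, error) { return nil, expired }

	events, err := getHistory(fetch, "arn:aws:states:eu-west-3:123456789012:execution:mysfn:expired")
	fmt.Println(len(events), err)
}
```

Matching on the error code rather than on the message text keeps the check working if the message wording changes.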

Ideally, the list hydrate function of the aws_sfn_state_machine_execution_history table should depend on the list hydrate function of the aws_sfn_state_machine_execution table instead of that of the aws_sfn_state_machine table, but ParentHydrate functions cannot be chained across more than two tables right now.

Slack thread: https://steampipe.slack.com/archives/C044P668806/p1680791390660499?thread_ts=1680791226.554809&cid=C044P668806

TODO:

rebase on main once typo: aws_sfn_state_machine_* #1686 is merged
support having execution_arn used in `in` queries with getListValues(), e.g.:

with constants (r, s_arn, e_arn) as (
    values (
            'eu-west-3',
            'arn:aws:states:eu-west-3:123456789012:stateMachine:mysfn',
            'arn:aws:states:eu-west-3:123456789012:execution:mysfn:map-%'
        )
)
select id,
    execution_arn,
    execution_failed_event_details
from aws_sfn_state_machine_execution_history,
    constants
where region = r
    and execution_arn in (
        select execution_arn
        from aws_sfn_state_machine_execution,
            constants
        where region = r
            and state_machine_arn = s_arn
            and execution_arn like e_arn
    )
    and execution_failed_event_details is not null;

pdecat · 2023-04-07T08:00:31Z

This PR needs more work before being complete.
Moved the first two commits into #1686 which should be mergeable faster.

cbruno10

@pdecat Thanks for opening this PR with such thoughtful changes and considerations! I've left a few questions and suggestions, can you please have a look?

cbruno10 · 2023-04-07T19:24:39Z

aws/table_aws_sfn_state_machine_execution.go

-			if !strings.Contains(fmt.Sprint(getListValues(equalQuals["state_machine_arn"].GetListValue())), *stateMachineArn) {
-				return nil, nil
-			}
+	stateMachineArnQuals := getQualsValueByColumn(d.Quals, "state_machine_arn", "string")


Were there any functional differences with this change, or more stylistic changes?

pdecat · 2023-04-08

As it uses d.Quals instead of d.EqualsQuals, ~~I believe it also makes queries using `in` rather than `=` more performant~~ (edit: on second thought, `in` queries are probably processed as several `=` queries):

select * from aws_sfn_state_machine_execution where state_machine_arn in ( select state_machine_arn from aws_sfn_state_machine where name = 'mystatemachine' )

See

https://github.com/turbot/steampipe-plugin-sdk/blob/2b7903e9b3d0aee3b456819a8896d1fd4d4c73ce/plugin/query_data.go#L393

https://github.com/turbot/steampipe-plugin-sdk/blob/49556d376edd992a9ff0cb60485bd50c098604ac/plugin/key_column_qual_map.go#L22

https://github.com/turbot/steampipe-plugin-sdk/blob/d4126181dfc96bcfc3014262326304ed9c2a1401/plugin/key_column_qual.go#L55

https://github.com/turbot/steampipe-plugin-sdk/blob/49556d376edd992a9ff0cb60485bd50c098604ac/plugin/quals/qual.go#L58

On second thought, this may cause issues if operators other than `=` and `in` are used, e.g. `like`, as the string value wouldn't match the checks done afterward. This is because `getQualsValueByColumn` does not consider the operator at all for the `string` type.

I will do more testing on that.
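To make the concern concrete, here is a minimal sketch of operator-aware filtering. It does not use the Steampipe plugin SDK: the `qual` struct, the operator strings and `equalsValues` are hypothetical simplifications for illustration. The idea is to keep only values whose operator is `=`, so that a `like` pattern is never compared as if it were an exact ARN.

```go
package main

import "fmt"

// qual is a simplified, hypothetical stand-in for a query qualifier:
// a column name, an operator and a string value.
type qual struct {
	Column   string
	Operator string
	Value    string
}

// equalsValues returns the values of the quals on column that use the "="
// operator. Quals with any other operator (for example "like") are skipped,
// because their values are patterns rather than exact matches.
func equalsValues(quals []qual, column string) []string {
	var values []string
	for _, q := range quals {
		if q.Column == column && q.Operator == "=" {
			values = append(values, q.Value)
		}
	}
	return values
}

func main() {
	quals := []qual{
		{"execution_arn", "=", "arn:aws:states:eu-west-3:123456789012:execution:mysfn:run-1"},
		{"execution_arn", "like", "arn:aws:states:eu-west-3:123456789012:execution:mysfn:map-%"},
		{"region", "=", "eu-west-3"},
	}
	// Only the exact-match ARN is returned; the like pattern is ignored.
	fmt.Println(equalsValues(quals, "execution_arn"))
}
```

With this shape, a query that only supplies a `like` pattern yields an empty list, and the caller can fall back to listing everything and letting the database apply the pattern.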

cbruno10 · 2023-04-07T19:37:45Z

aws/table_aws_sfn_state_machine_execution_history.go

@@ -287,8 +293,22 @@ func listStepFunctionsStateMachineExecutionHistories(ctx context.Context, d *plu
 	executionCh := make(chan []historyInfo, len(executions))
 	errorCh := make(chan error, len(executions))

-	// Iterating all the available executions
+	// Iterating all the available executions matching the query quals, if any


If we're given an execution_arn key qual, I wonder if this function even needs to call the ListExecutions API method?

Since we'd have a specific execution_arn when we enter this function, we could instead just call GetExecutionHistory once, passing in the given execution_arn and then returning the history data (or 0 rows if there is no history data).

If we don't get an execution_arn, then we would rely on listing all executions and then getting the histories for each of them (like the function does today).

This isn't normally how we'd want to do it, but due to the lack of nested parent hydrates, this may be a way to reduce the number of API calls we do.

@pdecat Does this approach seem like something that would work and cut down on total number of requests when execution_arn is passed in?

pdecat · 2023-04-08

Good point! I've implemented what you suggested, PTAL.
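The suggested control flow can be sketched as follows. This is an illustration under stated assumptions, not the PR's implementation: `historyAPI`, `listHistories` and `countingAPI` are hypothetical names, and the two interface methods only stand in for the ListExecutions and GetExecutionHistory API calls discussed above.

```go
package main

import "fmt"

// historyAPI is a hypothetical, minimal view of the two API calls involved.
type historyAPI interface {
	ListExecutions(stateMachineArn string) ([]string, error)
	GetExecutionHistory(executionArn string) ([]string, error)
}

// listHistories skips ListExecutions entirely when execution ARNs were given
// in the query, and only falls back to listing when none were provided.
func listHistories(api historyAPI, stateMachineArn string, executionArns []string) ([]string, error) {
	if len(executionArns) == 0 {
		var err error
		executionArns, err = api.ListExecutions(stateMachineArn)
		if err != nil {
			return nil, err
		}
	}
	var events []string
	for _, arn := range executionArns {
		h, err := api.GetExecutionHistory(arn)
		if err != nil {
			return nil, err
		}
		events = append(events, h...)
	}
	return events, nil
}

// countingAPI is a fake that records how many calls of each kind were made.
type countingAPI struct{ listCalls, historyCalls int }

func (c *countingAPI) ListExecutions(string) ([]string, error) {
	c.listCalls++
	return []string{"exec-1", "exec-2", "exec-3"}, nil
}

func (c *countingAPI) GetExecutionHistory(arn string) ([]string, error) {
	c.historyCalls++
	return []string{arn + ":ExecutionStarted"}, nil
}

func main() {
	api := &countingAPI{}
	events, err := listHistories(api, "my-state-machine", []string{"exec-2"})
	fmt.Println(events, err, "list calls:", api.listCalls, "history calls:", api.historyCalls)
}
```

The fake makes the saving visible: with one execution ARN supplied, no list call is made and exactly one history call is issued.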

cbruno10 · 2023-04-17T14:00:49Z

Hey @pdecat , I see a lot of commits have been pushed last week, just wanted to check in if you feel the PR is ready for review again? Or are there additional changes you're looking to make? Thanks!

pdecat · 2023-04-17T14:04:05Z

Hi @cbruno10, I indeed pushed some fixes to make it usable, but did not find time to fully test it.

One controversial change that would be worth reviewing anyway is my last commit, where I disable parallelization of requests because the AWS API for retrieving execution history events heavily rate limits requests otherwise.
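For illustration, a sequential, paginated fetch could look like the sketch below. It is not the commit's code: `page`, `fetchPage` and `collectHistory` are hypothetical names, and the fetch function stands in for one GetExecutionHistory request that returns a page of events plus a next-page token. Pages are requested one after another, and the limit is applied to events rather than to executions.

```go
package main

import "fmt"

// page is a hypothetical page of history events plus the token for the next
// page. An empty NextToken means there are no more pages.
type page struct {
	Events    []string
	NextToken string
}

// fetchPage is a hypothetical signature standing in for one API request.
type fetchPage func(executionArn, nextToken string) (page, error)

// collectHistory walks all pages of one execution sequentially, with no
// goroutines, and stops early once limit events have been collected.
// A limit of zero or less means "no limit".
func collectHistory(fetch fetchPage, executionArn string, limit int) ([]string, error) {
	var events []string
	token := ""
	for {
		p, err := fetch(executionArn, token)
		if err != nil {
			return nil, err
		}
		for _, e := range p.Events {
			events = append(events, e)
			if limit > 0 && len(events) >= limit {
				return events, nil
			}
		}
		if p.NextToken == "" {
			return events, nil
		}
		token = p.NextToken
	}
}

func main() {
	// Two fake pages keyed by the token used to request them.
	pages := map[string]page{
		"":   {Events: []string{"e1", "e2"}, NextToken: "t1"},
		"t1": {Events: []string{"e3"}},
	}
	fetch := func(_, token string) (page, error) { return pages[token], nil }

	all, _ := collectHistory(fetch, "exec-1", 0)
	limited, _ := collectHistory(fetch, "exec-1", 2)
	fmt.Println(len(all), len(limited))
}
```

Stopping as soon as the limit is reached avoids requesting pages whose events would be discarded, which matters when every request counts against the rate limit.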

misraved · 2023-05-26T14:15:56Z

@pdecat is this PR ready for final review?

Excited to get the changes to the hub 👍.

pdecat · 2023-05-26T14:26:54Z

Hi @misraved ,

> @pdecat is this PR ready for final review?

Sadly not 😕

Querying Step Functions execution history works somewhat better for me with this PR, but I still haven't managed to execute my complex queries against our large dataset without hitting 429 rate limiting errors.
I've set aside work on this, so I marked it as draft for now.

Feel free to test or hack with it if you desire.

Things I figured that could help IMO are addressing turbot/steampipe-plugin-sdk#194 and turbot/steampipe-plugin-sdk#394

karanpopat · 2023-07-10T14:10:45Z

Hey @pdecat, I was testing through this table and found a small issue with ListConfig.
When passing a value to the key column execution_arn here, it returns duplicate rows. The number of duplicate rows I observed was equal to the count of aws_sfn_state_machine_execution resources.

I believe this could be because the API is getting called multiple times with the same execution_arn while looping through each parent hydrate item.
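One possible way to avoid such duplicates, sketched here purely as an assumption and not as the fix adopted by the project, is to query a given execution_arn only under the parent state machine it belongs to. The hypothetical helper `executionBelongsTo` compares the two ARNs using the layouts visible earlier in this thread (`...:stateMachine:<name>` and `...:execution:<name>:<execution name>`); other ARN layouts are simply reported as not matching.

```go
package main

import (
	"fmt"
	"strings"
)

// executionBelongsTo reports whether executionArn was started by the state
// machine identified by stateMachineArn, based purely on the ARN layout:
//
//	arn:<partition>:states:<region>:<account>:stateMachine:<name>
//	arn:<partition>:states:<region>:<account>:execution:<name>:<execution name>
func executionBelongsTo(executionArn, stateMachineArn string) bool {
	sm := strings.SplitN(stateMachineArn, ":", 7)
	ex := strings.SplitN(executionArn, ":", 8)
	if len(sm) != 7 || len(ex) != 8 || sm[5] != "stateMachine" || ex[5] != "execution" {
		return false
	}
	// Partition, region, account and state machine name must all match.
	return sm[1] == ex[1] && sm[3] == ex[3] && sm[4] == ex[4] && sm[6] == ex[6]
}

func main() {
	exec := "arn:aws:states:eu-west-3:123456789012:execution:mysfn:eb3a80d6-7efe-4900-8ae8-fce10c75e52d"
	parents := []string{
		"arn:aws:states:eu-west-3:123456789012:stateMachine:mysfn",
		"arn:aws:states:eu-west-3:123456789012:stateMachine:othersfn",
	}
	// Only the matching parent would trigger a history request.
	for _, p := range parents {
		fmt.Println(p, executionBelongsTo(exec, p))
	}
}
```

With such a check in the per-parent loop, a given execution_arn would be fetched once instead of once per parent item.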

cbruno10 · 2023-08-15T19:27:43Z

Hey @pdecat , we're still interested in improving the aws_sfn_state_machine_execution_history table and discussing the approach more. This PR has been open for a while though, so we're going to close it, but happy to continue the discussion in it.

If/when you're ready for another review, feel free to re-open it, thanks!

pdecat · 2023-08-16T10:51:24Z

Hi @cbruno10, I'll probably revisit this once turbot/steampipe-plugin-sdk#618 is merged and released.

pdecat force-pushed the fix_sfn_state_machine_execution_history branch from 16316c4 to c312371 Compare April 7, 2023 07:58

pdecat force-pushed the fix_sfn_state_machine_execution_history branch 2 times, most recently from 2a79a07 to 18fb38e Compare April 7, 2023 16:19

pdecat marked this pull request as ready for review April 7, 2023 16:22

pdecat changed the title ~~Make aws_sfn_state_machine_execution_history execution_arn a required key column~~ Limit API calls for aws_sfn_state_machine_execution_history and ignore ExecutionDoesNotExist errors Apr 7, 2023

cbruno10 requested changes Apr 7, 2023

pdecat force-pushed the fix_sfn_state_machine_execution_history branch 2 times, most recently from baf90f8 to f685fb4 Compare April 11, 2023 14:09

pdecat and others added 11 commits May 17, 2023 10:55

Move getListValues from table_aws_ec2_instance.go to utils.go

6a029f0

Reduce AWS API calls made by aws_sfn_state_machine_execution_history by filtering on execution_arn

3bd2bdb

Ignore expired executions for which history is no longer available

a896bb5

Use getQualsValueByColumn() in aws_sfn_state_machine_execution too

833c9d0

Document valid values for status column of aws_sfn_state_machine_execution table

bfcf0eb

Update description style to match others

eed822c

Co-authored-by: cbruno10

Use helpers.StringSliceContains()

d8825e2

No need to call ListExecutions() if we're given one or more execution_arn

a5aae94

Apply limit to execution history events and not executions, and support more than 1000 history events with pagination

1ce53b0

Do not panic if quals does not exist for a column

9c0a7cb

Disable parallelization as the StepFunction API to retrieve history is just unfit for concurrent requests, resulting in never ending rate limiting

d62ae20

pdecat force-pushed the fix_sfn_state_machine_execution_history branch from 427f002 to d62ae20 Compare May 17, 2023 08:55

pdecat marked this pull request as draft May 17, 2023 13:46

misraved requested a review from ParthaI July 3, 2023 10:55

cbruno10 closed this Aug 15, 2023
