Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

allow s3 commands to work from non commercial (ie us-gov) environments #1718

Open
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

joelddiaz
Copy link

@joelddiaz joelddiaz commented Mar 8, 2024

Issue #, if available:

N/A

Description of changes:

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

when running under something like us-gov-east-1, the latest-binaries.sh script fails b/c it tries to access the us-west-2 bucket from the wrong endpoints.

this can be avoided by setting AWS_ENDPOINT_URL_S3 to point to
us-west-2, but you still end up trying to use the gov-cloud creds in the
request which would fail with:

$ ./hack/latest-binaries.sh 1.29

An error occurred (InvalidToken) when calling the ListObjectsV2 operation:
The provided token is malformed or otherwise invalid.

so, specify to perform an unauthenticated s3 api request b/c the
govcloud creds wouldn't work against the commercial cloud endpoints.

in other places in the install-worker.sh script, there are 'aws s3'
commands that would fail if running under something like the
us-gov-east-1 environment.

similar to the changes to the latest-binaries.sh script, update the
'aws' cli calls to ensure the requests are unsinged (to avoid trying
to use us-gov creds against a non-gov endpoint).

Testing Done

Tried using the ./hack/latest-binaries.sh when using govcloud credentials to see things not error out.

Performed a full end-to-end AMI build when running in us-gov-east-1.

@joelddiaz joelddiaz force-pushed the unauthenticated-latest-binaries-fetch branch from 64e33f6 to 6510c27 Compare March 18, 2024 15:28
@joelddiaz joelddiaz changed the title allow fetching latest binaries when using other endpoints/creds allow s3 commands to work from non commercial (ie us-gov) environments Mar 18, 2024
templates/al2/provisioners/install-worker.sh Outdated Show resolved Hide resolved
templates/al2/template.json Outdated Show resolved Hide resolved
@joelddiaz
Copy link
Author

@ndbaker1 i went ahead and updated (and tested) the al2023 install-worker.sh script, and some entries in the docs

Copy link
Member

@ndbaker1 ndbaker1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks 👍 alas, the linter strikes again

doc/usage/al2.md Outdated Show resolved Hide resolved
doc/usage/al2023.md Outdated Show resolved Hide resolved
@joelddiaz joelddiaz force-pushed the unauthenticated-latest-binaries-fetch branch from ca61075 to 3d95acb Compare March 22, 2024 12:38
@joelddiaz
Copy link
Author

hopefully the linter gods are pacified this time

@ndbaker1
Copy link
Member

/ci build
+workflow:os_distros al2,al2023
+workflow:k8s_versions 1.29

Copy link
Contributor

@ndbaker1 roger that! I've dispatched a workflow. 👍

Copy link
Contributor

@ndbaker1 the workflow that you requested has completed. 🎉

AMI variantBuildTest
1.29 / al2success ✅skipped ⏭️
1.29 / al2023success ✅skipped ⏭️

hack/latest-binaries.sh Outdated Show resolved Hide resolved
@@ -13,7 +13,7 @@ MINOR_VERSION="${1}"

# retrieve the available "VERSION/BUILD_DATE" prefixes (e.g. "1.28.1/2023-09-14")
# from the binary object keys, sorted in descending semver order, and pick the first one
LATEST_BINARIES=$(aws s3api list-objects-v2 --bucket amazon-eks --prefix "${MINOR_VERSION}" --query 'Contents[*].[Key]' --output text | cut -d'/' -f-2 | sort -Vru | head -n1)
LATEST_BINARIES=$(aws s3api list-objects-v2 --bucket amazon-eks --prefix "${MINOR_VERSION}" --query 'Contents[*].[Key]' --output text --region us-west-2 --no-sign-request | cut -d'/' -f-2 | sort -Vru | head -n1)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's add a big TODO outlining the long term solution (being able to use --region $binary_bucket_region)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is also forcing customers in aws-cn to query the bucket in aws, which isn't great.

Copy link

@jtnord jtnord Apr 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

per https://github.com/awslabs/amazon-eks-ami/pull/1641/files#r1486814106

whilst this is not "great" it also does not come across as bad. This is just retrieving metadata - and even if that takes an extra 5 seconds, it disappears into the time taken to actually build the image.

@@ -176,6 +177,7 @@
"script": "{{template_dir}}/provisioners/install-worker.sh",
"environment_vars": [
"AWS_ACCESS_KEY_ID={{user `aws_access_key_id`}}",
"AWS_ENDPOINT_URL_S3={{user `aws_endpoint_url_s3`}}",
Copy link
Member

@cartermckinnon cartermckinnon Mar 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This shouldn't be necessary IMO. The BINARY_BUCKET_REGION should be us-west-2 in govcloud for the time being (unless a private bucket is used). I think we need to stop using that variable as a proxy for AWS_REGION (the runtime region) when making other partition-specific decisions

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, now that the region is explicitly set there is no real reason to need to override the S3 endpoint. i've added a commit to remove all the AWS_ENDPOINT_URL_S3 pieces

when running under something like us-gov-east-1, the latest-binaries.sh
script fails b/c it tries to access the us-west-2 bucket from the wrong
endpoints.

this can be avoided by setting AWS_ENDPOINT_URL_S3 to point to
us-west-2, but you still end up trying to use the gov-cloud creds in the
request which would fail with:

$ ./hack/latest-binaries.sh 1.29

An error occurred (InvalidToken) when calling the ListObjectsV2 operation:
The provided token is malformed or otherwise invalid.

so, specify to perform an unauthenticated s3 api request b/c the
govcloud creds wouldn't work against the commercial cloud endpoints.

in other places in the install-worker.sh script, there are 'aws s3'
commands that would fail if running under something like the
us-gov-east-1 environment.

similar to the changes to the latest-binaries.sh script, update the
'aws' cli calls to ensure the requests are unsinged (to avoid trying
to use us-gov creds against a non-gov endpoint).

and plumb through using the user-specified AWS_ENDPOINT_URL_S3 env var
into the install-worker.sh script so that the alternative endpoints can
be used instead of the us-govcloud ones when running in a govcloud
environment.
and re-order the al2 variable for aws_endpoint_url_s3 to be sorted
alphabetically with the rest of the variables.
it isn't necessary now that we are providing the region for the s3
bucket
@joelddiaz joelddiaz force-pushed the unauthenticated-latest-binaries-fetch branch from 522a939 to a733644 Compare April 1, 2024 23:09
@bradwatsonaws
Copy link

bradwatsonaws commented Apr 3, 2024

I certainly appreciate the GovCloud support! But one thing I don't understand about this process is that the hack/latest-binaries.sh script relies on the AWS CLI for execution while the install-worker.sh scripts all attempt to install the AWS CLI long after the fact. If the AWS CLI isn't installed ahead of time, the process will fail from the start in GovCloud. @cartermckinnon had a very good workaround here to use curl in the event of the AWS CLI command failing.

Is the AWS CLI installation section of the install-worker.sh script really just an attempt to upgrade vs install? If so - ok that makes more sense. From my experience I see that the AL2023 minimal AMIs come with the AWS CLI preinstalled but the AL2 minimal AMIs DO NOT, meaning this approach would not work in GovCloud for AL2. Am I off here?

@joelddiaz
Copy link
Author

I certainly appreciate the GovCloud support! But one thing I don't understand about this process is that the hack/latest-binaries.sh script relies on the AWS CLI for execution while the install-worker.sh scripts all attempt to install the AWS CLI long after the fact. If the AWS CLI isn't installed ahead of time, the process will fail from the start in GovCloud. @cartermckinnon had a very good workaround here to use curl in the event of the AWS CLI command failing.

Is the AWS CLI installation section of the install-worker.sh script really just an attempt to upgrade vs install? If so - ok that makes more sense. From my experience I see that the AL2023 minimal AMIs come with the AWS CLI preinstalled but the AL2 minimal AMIs DO NOT, meaning this approach would not work in GovCloud for AL2. Am I off here?

I feel like I'm missing something here regarding the questions...

hack/latest-binaries.sh is a piece that is only used to run locally before kicking of the packer build step(s) for the sole purpose of passing some variables into the packer environment, and the install-worker.sh is the software installation that happens inside the instance/AMI we're building.

I've mostly been working with AL2, and indeed there is no AWS CLI pre-installed on the base-AMI images I've been using.

@bradwatsonaws
Copy link

Ah yes, you are correct @joelddiaz. I conflated the usage. My apologies and again appreciate the GovCloud love here!

@jtnord
Copy link

jtnord commented Apr 11, 2024

would presumably
Fix #1536, #1482
and obsolete #1537, #1641

@mebays
Copy link
Contributor

mebays commented Apr 16, 2024

Is this still actively getting worked on. I would support this change. The buckets seem to only be in us-west-2. The aws cli errors off in gov cloud because it won't find a solution for amazon-eks.s3.us-gov-east-1.amazonaws.com, because there is no redirect or bucket in gov cloud. If hitting the endpoint directly with curl to another region amazon-eks.s3.us-east-1.amazonaws.com that ends up being in us-west-2 regardless because it returns a Permanent Redirect to us-west-2.

If you care to look at the debug logs for awscli aws s3api list-objects-v2 --region us-east-1 --bucket amazon-eks --debug will attempt the us-east-1 region get the redirect message grab the Endpoint url (amazon-eks.s3-us-west-2.amazonaws.com) from the message and retry with the retrieved endpoint.

@bradwatsonaws
Copy link

Any updates on this PR? hack/latest-binaries.sh has been broken in GovCloud for over 9 months now without using a workaround. @joelddiaz @ndbaker1 @cartermckinnon

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants