-
Notifications
You must be signed in to change notification settings - Fork 95
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add AWS STS authentication support #1884
base: master
Are you sure you want to change the base?
Conversation
.def_property("ssl", &S3Override::ssl, &S3Override::set_ssl); | ||
.def_property("ssl", &S3Override::ssl, &S3Override::set_ssl) | ||
.def_property("aws_auth", | ||
// pybind cannot smartly convert AWSAuthMethod defined in proto from C++ layer to Python layer |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove the comment
@@ -27,4 +29,6 @@ message Config { | |||
string ca_cert_dir = 15; | |||
// Whether to leave the prefix unchanged, or to translate the periods in it. | |||
bool use_raw_prefix = 16; | |||
arcticc.pb2.storage_pb2.AWSAuthMethod aws_auth = 17; //default to be 0, DISABLED |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm very worried about adding anything to these protobufs after all the issues recently with the https
flag.
There is actually no reason for this to go in the protobuf at all is there? We never intend to serialize it?
We will need extensive compatibility testing (similar to #1862 ) for this change either way.
@@ -86,13 +88,17 @@ S3Storage::S3Storage(const LibraryPath &library_path, OpenMode mode, const Confi | |||
root_folder_(object_store_utils::get_root_folder(library_path)), | |||
bucket_name_(conf.bucket_name()), | |||
region_(conf.region()) { | |||
|
|||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
random whitespace
@@ -22,11 +22,45 @@ Available options for S3: | |||
| access | S3 access key | | |||
| secret | S3 secret access key | | |||
| path_prefix | Path within S3 bucket to use for data storage | | |||
| aws_auth | If true, authentication to endpoint will be computed via AWS environment vars/config files. If no options are provided `aws_auth` will be assumed to be true. | | |||
| aws_auth | AWS authentication method. If setting is `1` (or `true` for backward compatibility), authentication to endpoint will be computed via [AWS default credential provider chain](https://docs.aws.amazon.com/sdk-for-cpp/v1/developer-guide/credproviders.html). If setting is `2`, AWS Security Token Service (STS) will be the authentication method used. If no options are provided AWS authentication will be assumed to be not used. More info is provided below | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's use human readable terms for this. default
or true
for the current behaviour, sts
for this new behaviour?
@@ -22,11 +22,45 @@ Available options for S3: | |||
| access | S3 access key | | |||
| secret | S3 secret access key | | |||
| path_prefix | Path within S3 bucket to use for data storage | | |||
| aws_auth | If true, authentication to endpoint will be computed via AWS environment vars/config files. If no options are provided `aws_auth` will be assumed to be true. | | |||
| aws_auth | AWS authentication method. If setting is `1` (or `true` for backward compatibility), authentication to endpoint will be computed via [AWS default credential provider chain](https://docs.aws.amazon.com/sdk-for-cpp/v1/developer-guide/credproviders.html). If setting is `2`, AWS Security Token Service (STS) will be the authentication method used. If no options are provided AWS authentication will be assumed to be not used. More info is provided below | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will be assumed to be not used
replace with, will not be used and you should specify access and secret in the URI
_PermissionCapableFactory: Type["MotoS3StorageFixtureFactory"] = None # To be set later | ||
|
||
logging.getLogger("botocore").setLevel(logging.INFO) | ||
|
||
S3_CONFIG_PATH = os.path.expanduser(os.path.join("~", ".aws", "config")) | ||
S3_BACKUP_CONFIG_PATH = os.path.expanduser(os.path.join("~", ".aws", "bk_config")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's this for?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A temporary config file for sts credential and profile used in the test
|
||
|
||
class Key: | ||
def __init__(self, id: str, secret: str, user_name: str): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make this kwargs only? Its invocations with three strings next to each other are hard to understand
try: | ||
iam_client.create_user(UserName=user_name) | ||
out.sts_test_key = Key(None, None, user_name) | ||
print("User created successfully.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
logging not print statements...
RoleName=role_name, | ||
PolicyArn=out.aws_policy_name | ||
) | ||
print("Policyattached to role successfully.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo Policyattached
aws_access_key_id=factory.sts_test_key.id, | ||
aws_secret_access_key=factory.sts_test_key.secret | ||
) | ||
for _ in range(20): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
20 retries seems excessive?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Per my test, once it requires 13 seconds to get the resources available in S3
(_RBAC_ is not replaced yet) -Add STS auth method to codebase (But no test yet)
9758d14
to
f90fad9
Compare
f90fad9
to
5981d02
Compare
Copy of #1814 so @poodlewars can review
Reference Issues/PRs
#1883
What does this implement or fix?
Add support of AWS STS authentication
Any other comments?
STS is an AWS iam service which allows user account to gain access of resources by assuming role.
First, the SDK will send assume role request to iam. If the authentication is successful, iam will reply with a temporary access key. With that access key, SDK will now have access to the resources.
The above logic and the refresh of temporary access key are handled by the SDK.
P.S. the validity check of the temporary access key is made at the beginning of each IO only,
Maintainers of S3 C++ SDK has refused to align this authentication method with other SDKs (e.g. boto3), which create 2 problems:
Detailed setup of the STS method can be referred to the content of this PR.
The test added in this PR requires real S3. Temporary user, policy and role are created for the test in the pipeline.
The final caveat is seems the config file is loaded at the import of ArcticDB and not being refreshed at the entire lifecycle of the python interpreter. This has required a hack in the testing framework to always create the config file at the beginning of any tests. As it's real S3 test specific, it won't create any pain for day-to-day local development and general PR tests.
Checklist
Checklist for code changes...