Separating Out Access Logging #1695

noah-paige · 2024-11-14T00:24:37Z

Feature or Bugfix

Refactoring

Detail

Move access logging to a separate environment logs bucket, rather than the environment default bucket, which is used for Athena queries and profiling job artifacts.

Relates

Security

Please answer the questions below briefly where applicable, or write N/A. Based on OWASP 10.

- Does this PR introduce or modify any input fields or queries - this includes fetching data from storage outside the application (e.g. a database, an S3 bucket)?
  - Is the input sanitized?
  - What precautions are you taking before deserializing the data you consume?
  - Is injection prevented by parametrizing queries?
  - Have you ensured no eval or similar functions are used?
- Does this PR introduce any functionality or component that requires authorization?
  - How have you ensured it respects the existing AuthN/AuthZ mechanisms?
  - Are you logging failed auth attempts?
- Are you using or adding any cryptographic features?
  - Do you use standard, proven implementations?
  - Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users?
  - Have you used the least-privilege principle? How?

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

noah-paige · 2024-11-14T14:18:02Z

Tested:

CICD Pipeline
Migration Script Upgrade
Environment Stack Update
Dataset Stack Update
Migration Script Downgrade
Dataset Table Profiling Run Successful
Worksheet Query Successful
Access Logs Written for Env Bucket + All Dataset Buckets to new Logs Bucket

dlpzx · 2024-11-14T16:48:14Z

backend/dataall/core/environment/cdk/environment_stack.py

-        self.default_environment_bucket = default_environment_bucket
-
-        default_environment_bucket.add_to_resource_policy(
+        default_environment_log_bucket.policy.apply_removal_policy(RemovalPolicy.RETAIN)


Isn't this already defined in line 174?

That removal policy applies to the bucket only. The associated bucket policy would be deleted unless we include this line:

default_environment_log_bucket.policy.apply_removal_policy(RemovalPolicy.RETAIN)

Losing the policy could then expose our bucket.
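The point made in this thread can be checked against a synthesized CloudFormation template, because `DeletionPolicy` is an attribute of each resource: a retained `AWS::S3::Bucket` does not imply a retained `AWS::S3::BucketPolicy`. The following is a minimal sketch that assumes the template is already loaded as a dict. The helper name `resources_not_retained` and the logical IDs are hypothetical and do not come from the data.all code base.

```python
from typing import Dict, List

# Resource types this sketch inspects: the bucket and its attached policy.
S3_TYPES = ('AWS::S3::Bucket', 'AWS::S3::BucketPolicy')


def resources_not_retained(template: Dict) -> List[str]:
    """Return sorted logical IDs of S3 buckets/policies whose DeletionPolicy is not 'Retain'."""
    missing = []
    for logical_id, resource in template.get('Resources', {}).items():
        if resource.get('Type') not in S3_TYPES:
            continue
        if resource.get('DeletionPolicy') != 'Retain':
            missing.append(logical_id)
    return sorted(missing)


# Hypothetical template: the bucket is retained, the bucket policy is not.
example_template = {
    'Resources': {
        'EnvLogBucket': {'Type': 'AWS::S3::Bucket', 'DeletionPolicy': 'Retain'},
        'EnvLogBucketPolicy': {'Type': 'AWS::S3::BucketPolicy'},
    }
}

print(resources_not_retained(example_template))
```

A check like this could run against the output of `cdk synth` in a unit test, so a policy that is missing the retain setting is flagged before deployment.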

dlpzx · 2024-11-14T16:49:23Z

backend/dataall/core/environment/api/types.py

@@ -100,6 +100,7 @@
        gql.Field('subscriptionsConsumersTopicName', type=gql.String),
        gql.Field('subscriptionsProducersTopicName', type=gql.String),
        gql.Field('EnvironmentDefaultBucketName', type=gql.String),
+        gql.Field('EnvironmentLogsBucketName', type=gql.String),


Do we need to expose this information in GraphQL back to the users? It is a logs bucket, so we might want to keep it internal only.

Although I see it is used in the integration tests. It is fine as is.

It is used in tests.

It could theoretically be used elsewhere, but it is currently not used in our frontend client, as you mentioned.
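As an illustration of why the integration tests need this field, here is a sketch of how a test teardown might collect the bucket names to clean up from an environment response. The function name `buckets_to_clean` and the shape of the `env` dict are assumptions made for illustration. Only the two field names come from the diff above.

```python
from typing import Dict, List, Optional


def buckets_to_clean(env: Dict[str, Optional[str]]) -> List[str]:
    """Collect the environment bucket names present in a getEnv-style response.

    Fields that are missing or empty are skipped, so a response that does not
    return EnvironmentLogsBucketName yields only the default bucket.
    """
    candidates = [
        env.get('EnvironmentDefaultBucketName'),
        env.get('EnvironmentLogsBucketName'),
    ]
    return [name for name in candidates if name]


# Hypothetical response containing both fields.
print(buckets_to_clean({
    'EnvironmentDefaultBucketName': 'example-env-default',
    'EnvironmentLogsBucketName': 'example-env-logs',
}))
```

If the query used by the tests does not select `EnvironmentLogsBucketName`, the logs bucket silently drops out of the cleanup list, so the teardown depends on the field being returned.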

dlpzx · 2024-11-14T16:50:52Z

backend/dataall/core/environment/cdk/environment_stack.py

+            server_access_logs_bucket=default_environment_log_bucket,
+            server_access_logs_prefix=f'access_logs/{self._environment.EnvironmentDefaultBucketName}/',
+        )
+        default_environment_bucket.policy.apply_removal_policy(RemovalPolicy.RETAIN)


Same as above: isn't the removal policy already defined in line 196?

Same response: we need to explicitly retain the resources associated with the bucket (i.e. the bucket policy).
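The diff in this thread also gives each source bucket its own prefix in the logs bucket, `access_logs/<bucket name>/`. A minimal sketch of that key layout follows. The helper names are hypothetical, and applying the same layout to dataset buckets is an assumption, since only the environment default bucket's prefix appears in the diff.

```python
from typing import Optional

# Top-level folder used for server access logs, as shown in the diff.
ACCESS_LOGS_ROOT = 'access_logs'


def access_logs_prefix(source_bucket_name: str) -> str:
    """Build the log prefix for one source bucket, e.g. 'access_logs/my-bucket/'."""
    return f'{ACCESS_LOGS_ROOT}/{source_bucket_name}/'


def source_bucket_from_key(key: str) -> Optional[str]:
    """Recover the source bucket name from a log object key, or None if the key is not under the access logs root."""
    parts = key.split('/', 2)
    if len(parts) == 3 and parts[0] == ACCESS_LOGS_ROOT and parts[1]:
        return parts[1]
    return None


print(access_logs_prefix('example-env-default'))
print(source_bucket_from_key('access_logs/example-env-default/2024-11-14-00-00-00-EXAMPLE'))
```

Keeping one prefix per source bucket makes it possible to tell, from the logs bucket alone, which buckets have produced access logs.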

dlpzx

Left some nit comments

…1697)

### Feature or Bugfix

- Bugfix

### Detail

- Fix integration test teardown of environment bug on cleaning up EnvironmentLogsBucketName

### Relates

- #1695

### Security

Please answer the questions below briefly where applicable, or write `N/A`. Based on [OWASP 10](https://owasp.org/Top10/en/).

- Does this PR introduce or modify any input fields or queries - this includes fetching data from storage outside the application (e.g. a database, an S3 bucket)?
  - Is the input sanitized?
  - What precautions are you taking before deserializing the data you consume?
  - Is injection prevented by parametrizing queries?
  - Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires authorization?
  - How have you ensured it respects the existing AuthN/AuthZ mechanisms?
  - Are you logging failed auth attempts?
- Are you using or adding any cryptographic features?
  - Do you use a standard proven implementations?
  - Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users?
  - Have you used the least-privilege principle? How?

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

noah-paige added 4 commits November 13, 2024 19:23

Add Environment Logs Bucket

1ed62b0

Fix migration scripts to local define Env model

b2e4b3b

add backfilling

cca6475

fix migration script

ec30284

noah-paige marked this pull request as ready for review November 14, 2024 13:49

fix migration script

fd519b7

dlpzx reviewed Nov 14, 2024


dlpzx approved these changes Nov 14, 2024


noah-paige self-assigned this Nov 14, 2024

noah-paige merged commit b30a354 into main Nov 14, 2024
9 checks passed

noah-paige mentioned this pull request Nov 14, 2024

return EnvironmentLogsBucketName from integraiton test getEnv query #1697

Merged
