Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Separating Out Access Logging #1695

Merged
merged 5 commits into from
Nov 14, 2024
Merged

Separating Out Access Logging #1695

merged 5 commits into from
Nov 14, 2024

Conversation

noah-paige
Copy link
Contributor

Feature or Bugfix

  • Refactoring

Detail

  • Move access logging to a separate environment logging bucket (rather than env default bucket used for athena queries and profiling job artifacts)

Relates

Security

Please answer the questions below briefly where applicable, or write N/A. Based on
OWASP 10.

  • Does this PR introduce or modify any input fields or queries - this includes
    fetching data from storage outside the application (e.g. a database, an S3 bucket)?
    • Is the input sanitized?
    • What precautions are you taking before deserializing the data you consume?
    • Is injection prevented by parametrizing queries?
    • Have you ensured no eval or similar functions are used?
  • Does this PR introduce any functionality or component that requires authorization?
    • How have you ensured it respects the existing AuthN/AuthZ mechanisms?
    • Are you logging failed auth attempts?
  • Are you using or adding any cryptographic features?
    • Do you use a standard proven implementations?
    • Are the used keys controlled by the customer? Where are they stored?
  • Are you introducing any new policies/roles/users?
    • Have you used the least-privilege principle? How?

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@noah-paige noah-paige marked this pull request as ready for review November 14, 2024 13:49
@noah-paige
Copy link
Contributor Author

noah-paige commented Nov 14, 2024

Tested:

  • CICD Pipeline
  • Migration Script Upgrade
  • Environment Stack Update
  • Dataset Stack Update
  • Migration Script Downgrade
  • Dataset Table Profiling Run Successful
  • Worksheet Query Successful
  • Access Logs Written for Env Bucket + All Dataset Buckets to new Logs Bucket

self.default_environment_bucket = default_environment_bucket

default_environment_bucket.add_to_resource_policy(
default_environment_log_bucket.policy.apply_removal_policy(RemovalPolicy.RETAIN)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this already define in line 174?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removal policy applies to the bucket only - the assocaited policy would get deleted unless we include this line

        default_environment_log_bucket.policy.apply_removal_policy(RemovalPolicy.RETAIN)

which then potentially exposes our bucket

@@ -100,6 +100,7 @@
gql.Field('subscriptionsConsumersTopicName', type=gql.String),
gql.Field('subscriptionsProducersTopicName', type=gql.String),
gql.Field('EnvironmentDefaultBucketName', type=gql.String),
gql.Field('EnvironmentLogsBucketName', type=gql.String),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to expose the information in graphql back to the users? It is a logs bucket, we might want to keep it internal only

Copy link
Contributor

@dlpzx dlpzx Nov 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although I see it is used in the integration tests. It is fine as is

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is used in testings

could theoretically be used elsewhere but currentically not used in our front end client as you mentioned

server_access_logs_bucket=default_environment_log_bucket,
server_access_logs_prefix=f'access_logs/{self._environment.EnvironmentDefaultBucketName}/',
)
default_environment_bucket.policy.apply_removal_policy(RemovalPolicy.RETAIN)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as above, isn't the removal policy already defined in line 196

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same response - need to explicitly retain associated resources of bucket (i.e. bucket policy)

Copy link
Contributor

@dlpzx dlpzx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left some nit comments

@noah-paige noah-paige self-assigned this Nov 14, 2024
@noah-paige noah-paige merged commit b30a354 into main Nov 14, 2024
9 checks passed
dlpzx pushed a commit that referenced this pull request Nov 15, 2024
…1697)

### Feature or Bugfix
<!-- please choose -->
- Bugfix

### Detail
- Fix integration test teardown of environment bug on cleaning up
EnvironmentLogsBucketName


### Relates
- #1695 

### Security
Please answer the questions below briefly where applicable, or write
`N/A`. Based on
[OWASP 10](https://owasp.org/Top10/en/).

- Does this PR introduce or modify any input fields or queries - this
includes
fetching data from storage outside the application (e.g. a database, an
S3 bucket)?
  - Is the input sanitized?
- What precautions are you taking before deserializing the data you
consume?
  - Is injection prevented by parametrizing queries?
  - Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires
authorization?
- How have you ensured it respects the existing AuthN/AuthZ mechanisms?
  - Are you logging failed auth attempts?
- Are you using or adding any cryptographic features?
  - Do you use a standard proven implementations?
  - Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users?
  - Have you used the least-privilege principle? How?


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants