Skip to content

Latest commit

 

History

History
405 lines (349 loc) · 12.6 KB

S3.md

File metadata and controls

405 lines (349 loc) · 12.6 KB

Configuration for S3

[comment] <>: (S3:Documentation->S3:Technology)

Harvester Background Job Logs

  1. create a bucket in S3 called "doaj-harvester". During the creation, ensure that you set all the public access limitations to "True" (i.e. Block all public access)

  2. Create the following policy (IAM > Policies > Create Policy > JSON):

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": [
                "s3:PutAnalyticsConfiguration",
                "s3:GetObjectVersionTagging",
                "s3:CreateBucket",
                "s3:ReplicateObject",
                "s3:GetObjectAcl",
                "s3:DeleteBucketWebsite",
                "s3:PutLifecycleConfiguration",
                "s3:GetObjectVersionAcl",
                "s3:PutObjectTagging",
                "s3:DeleteObject",
                "s3:DeleteObjectTagging",
                "s3:GetBucketWebsite",
                "s3:PutReplicationConfiguration",
                "s3:DeleteObjectVersionTagging",
                "s3:GetBucketNotification",
                "s3:PutBucketCORS",
                "s3:GetReplicationConfiguration",
                "s3:ListMultipartUploadParts",
                "s3:PutObject",
                "s3:GetObject",
                "s3:PutBucketNotification",
                "s3:PutBucketLogging",
                "s3:GetAnalyticsConfiguration",
                "s3:GetObjectVersionForReplication",
                "s3:GetLifecycleConfiguration",
                "s3:ListBucketByTags",
                "s3:GetInventoryConfiguration",
                "s3:GetBucketTagging",
                "s3:PutAccelerateConfiguration",
                "s3:DeleteObjectVersion",
                "s3:GetBucketLogging",
                "s3:ListBucketVersions",
                "s3:ReplicateTags",
                "s3:RestoreObject",
                "s3:ListBucket",
                "s3:GetAccelerateConfiguration",
                "s3:GetBucketPolicy",
                "s3:PutEncryptionConfiguration",
                "s3:GetEncryptionConfiguration",
                "s3:GetObjectVersionTorrent",
                "s3:AbortMultipartUpload",
                "s3:PutBucketTagging",
                "s3:GetBucketRequestPayment",
                "s3:GetObjectTagging",
                "s3:GetMetricsConfiguration",
                "s3:DeleteBucket",
                "s3:PutBucketVersioning",
                "s3:ListBucketMultipartUploads",
                "s3:PutMetricsConfiguration",
                "s3:PutObjectVersionTagging",
                "s3:GetBucketVersioning",
                "s3:GetBucketAcl",
                "s3:PutInventoryConfiguration",
                "s3:GetObjectTorrent",
                "s3:PutBucketWebsite",
                "s3:PutBucketRequestPayment",
                "s3:GetBucketCORS",
                "s3:GetBucketLocation",
                "s3:ReplicateDelete",
                "s3:GetObjectVersion"
            ],
            "Resource": [
                "arn:aws:s3:::doaj-harvester",
                "arn:aws:s3:::doaj-harvester/*"
            ]
        }
    ]
}

Give it a sensible name like "doaj-harvester-policy"

  1. Create an S3 user on the IAM service with a suitable name (e.g. "doaj-harvester-client")

Provide Programmatic access, there is no need to provide AWS Management Console access.

Attach the policy created in the previous step

  1. Take a note of the access key and the secret access token (you can't access them again)

  2. Update your local configuration (do not put this in version control) with the following settings:

STORE_S3_SCOPES = {
    # ...
    "harvester" : {
        "aws_access_key_id" : "<access key>",
        "aws_secret_access_key" : "<secret access token>"
    }
    # ...
}

Public Data Dump

[comment] <>: (->PublicDataDump:Feature)

  1. Create a bucket in S3 called "doaj-data-dump". During the creation, ensure that you set all the public access limitations to "False" (i.e. permit public access)

  2. Go to the bucket configuration, under Permissions > Bucket Policy and enter the following policy:

{
  "Id": "Policy1550249912200",
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "Stmt1550249907870",
      "Action": [
        "s3:GetObject"
      ],
      "Effect": "Allow",
      "Resource": "arn:aws:s3:::doaj-data-dump/*",
      "Principal": "*"
    }
  ]
}
  1. Create the following policy (IAM > Policies > Create Policy > JSON):
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": [
                "s3:PutAnalyticsConfiguration",
                "s3:GetObjectVersionTagging",
                "s3:CreateBucket",
                "s3:ReplicateObject",
                "s3:GetObjectAcl",
                "s3:DeleteBucketWebsite",
                "s3:PutLifecycleConfiguration",
                "s3:GetObjectVersionAcl",
                "s3:PutObjectTagging",
                "s3:DeleteObject",
                "s3:GetIpConfiguration",
                "s3:DeleteObjectTagging",
                "s3:GetBucketWebsite",
                "s3:PutReplicationConfiguration",
                "s3:DeleteObjectVersionTagging",
                "s3:GetBucketNotification",
                "s3:PutBucketCORS",
                "s3:GetReplicationConfiguration",
                "s3:ListMultipartUploadParts",
                "s3:PutObject",
                "s3:GetObject",
                "s3:PutBucketNotification",
                "s3:PutBucketLogging",
                "s3:GetAnalyticsConfiguration",
                "s3:GetObjectVersionForReplication",
                "s3:GetLifecycleConfiguration",
                "s3:ListBucketByTags",
                "s3:GetInventoryConfiguration",
                "s3:GetBucketTagging",
                "s3:PutAccelerateConfiguration",
                "s3:DeleteObjectVersion",
                "s3:GetBucketLogging",
                "s3:ListBucketVersions",
                "s3:ReplicateTags",
                "s3:RestoreObject",
                "s3:ListBucket",
                "s3:GetAccelerateConfiguration",
                "s3:GetBucketPolicy",
                "s3:PutEncryptionConfiguration",
                "s3:GetEncryptionConfiguration",
                "s3:GetObjectVersionTorrent",
                "s3:AbortMultipartUpload",
                "s3:PutBucketTagging",
                "s3:GetBucketRequestPayment",
                "s3:GetObjectTagging",
                "s3:GetMetricsConfiguration",
                "s3:DeleteBucket",
                "s3:PutBucketVersioning",
                "s3:ListBucketMultipartUploads",
                "s3:PutMetricsConfiguration",
                "s3:PutObjectVersionTagging",
                "s3:GetBucketVersioning",
                "s3:GetBucketAcl",
                "s3:PutInventoryConfiguration",
                "s3:PutIpConfiguration",
                "s3:GetObjectTorrent",
                "s3:PutBucketWebsite",
                "s3:PutBucketRequestPayment",
                "s3:GetBucketCORS",
                "s3:GetBucketLocation",
                "s3:ReplicateDelete",
                "s3:GetObjectVersion"
            ],
            "Resource": [
                "arn:aws:s3:::doaj-data-dump",
                "arn:aws:s3:::doaj-data-dump/*"
            ]
        },
        {
            "Sid": "VisualEditor1",
            "Effect": "Allow",
            "Action": "s3:HeadBucket",
            "Resource": "*"
        }
    ]
}

Give it a sensible name like "doaj-public-data-dump-policy"

  1. Create an S3 user on the IAM service with a suitable name (e.g. "doaj-public-data-dump-client")

Provide Programmatic access, there is no need to provide AWS Management Console access.

Attach the policy created in the previous step

  1. Take a note of the access key and the secret access token (you can't access them again)

  2. Update your local configuration (do not put this in version control) with the following settings:

STORE_S3_SCOPES = {
    # ...
    "public_data_dump" : {
        "aws_access_key_id" : "<access key>",
        "aws_secret_access_key" : "<secret access token>"
    }
    # ...
}

Cached Files

[comment] <>: (->Cache:Feature)

Cached files includes regularly generated content such as the Journal CSV.

  1. Create a bucket in S3 called "doaj-data-cache". During the creation, ensure that you set all the public access limitations to "False" (i.e. permit public access)

  2. Go to the bucket configuration, under Permissions > Bucket Policy and enter the following policy:

{
  "Id": "Policy1550249912200",
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "Stmt1550249907870",
      "Action": [
        "s3:GetObject"
      ],
      "Effect": "Allow",
      "Resource": "arn:aws:s3:::doaj-data-cache/*",
      "Principal": "*"
    }
  ]
}
  1. Create the following policy (IAM > Policies > Create Policy > JSON):
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": [
                "s3:PutAnalyticsConfiguration",
                "s3:GetObjectVersionTagging",
                "s3:CreateBucket",
                "s3:ReplicateObject",
                "s3:GetObjectAcl",
                "s3:DeleteBucketWebsite",
                "s3:PutLifecycleConfiguration",
                "s3:GetObjectVersionAcl",
                "s3:PutObjectTagging",
                "s3:DeleteObject",
                "s3:GetIpConfiguration",
                "s3:DeleteObjectTagging",
                "s3:GetBucketWebsite",
                "s3:PutReplicationConfiguration",
                "s3:DeleteObjectVersionTagging",
                "s3:GetBucketNotification",
                "s3:PutBucketCORS",
                "s3:GetReplicationConfiguration",
                "s3:ListMultipartUploadParts",
                "s3:PutObject",
                "s3:GetObject",
                "s3:PutBucketNotification",
                "s3:PutBucketLogging",
                "s3:GetAnalyticsConfiguration",
                "s3:GetObjectVersionForReplication",
                "s3:GetLifecycleConfiguration",
                "s3:ListBucketByTags",
                "s3:GetInventoryConfiguration",
                "s3:GetBucketTagging",
                "s3:PutAccelerateConfiguration",
                "s3:DeleteObjectVersion",
                "s3:GetBucketLogging",
                "s3:ListBucketVersions",
                "s3:ReplicateTags",
                "s3:RestoreObject",
                "s3:ListBucket",
                "s3:GetAccelerateConfiguration",
                "s3:GetBucketPolicy",
                "s3:PutEncryptionConfiguration",
                "s3:GetEncryptionConfiguration",
                "s3:GetObjectVersionTorrent",
                "s3:AbortMultipartUpload",
                "s3:PutBucketTagging",
                "s3:GetBucketRequestPayment",
                "s3:GetObjectTagging",
                "s3:GetMetricsConfiguration",
                "s3:DeleteBucket",
                "s3:PutBucketVersioning",
                "s3:ListBucketMultipartUploads",
                "s3:PutMetricsConfiguration",
                "s3:PutObjectVersionTagging",
                "s3:GetBucketVersioning",
                "s3:GetBucketAcl",
                "s3:PutInventoryConfiguration",
                "s3:PutIpConfiguration",
                "s3:GetObjectTorrent",
                "s3:PutBucketWebsite",
                "s3:PutBucketRequestPayment",
                "s3:GetBucketCORS",
                "s3:GetBucketLocation",
                "s3:ReplicateDelete",
                "s3:GetObjectVersion"
            ],
            "Resource": [
                "arn:aws:s3:::doaj-data-cache",
                "arn:aws:s3:::doaj-data-cache/*"
            ]
        },
        {
            "Sid": "VisualEditor1",
            "Effect": "Allow",
            "Action": "s3:HeadBucket",
            "Resource": "*"
        }
    ]
}

Give it a sensible name like "doaj-data-cache-policy"

  1. Create an S3 user on the IAM service with a suitable name (e.g. "doaj-data-cache-client")

IAM > Users > Add User

Provide Programmatic access, there is no need to provide AWS Management Console access.

Attach the policy created in the previous step

  1. Take a note of the access key and the secret access token (you can't access them again)

  2. Update your local configuration (do not put this in version control) with the following settings:

STORE_S3_SCOPES = {
    # ...
    "cache" : {
        "aws_access_key_id" : "<access key>",
        "aws_secret_access_key" : "<secret access token>"
    }
    # ...
}