Setting up Google Cloud Storage

William Silversmith edited this page Sep 26, 2018 · 29 revisions

Google Cloud Storage (GCS) is a popular object storage backend that is supported by CloudVolume. Follow these step-by-step directions to get started using CloudVolume and Neuroglancer with GCS.

Setting up a CloudVolume Bucket

In order to use CloudVolume with GCS, you'll need to create a bucket, a namespace within Google's system that holds your files. Then you'll make a service account with read/write permissions to the bucket in order to access it programmatically with CloudVolume. You can also access the bucket with the gsutil command line tool and through the web interface.

1. Creating Your Account

If you don't already have a GCS account, you'll need to set one up and link a credit or debit card to it. Be advised that as of this writing (Sept. 2018), the default storage cost is $0.026/GiB/month, or in connectomics terms, about $320 per TiB per year. There are cheaper options, Nearline and Coldline, but they have strings attached. Transferring data out of GCS (termed "egress") to your local system or another provider costs between $0.08 and $0.12 per GB, so you may wish to consider the vendor lock-in implications. AWS S3 has a similar cost structure, though its actual prices may differ significantly.
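The yearly figure above is just a unit conversion from the monthly per-GiB rate; a quick sanity check:

```shell
# $/GiB/month -> $/TiB/year: 1024 GiB per TiB, 12 months per year
awk 'BEGIN { printf "%.2f\n", 0.026 * 1024 * 12 }'
# prints 319.49
```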

Once you've decided to move forward, follow the steps here.

2. Create a GCS Bucket

Follow the instructions here. Bucket names are globally (as in worldwide on Earth) unique, so you'll probably want to pick something like "$MYLAB" or "$MYDATASET".

3. Create Service Account Keys

Only authorized users can write to your GCS bucket. In order for CloudVolume to read from or write to your bucket, you'll need to create a Service Account. There are many different authorization schemes in use in the world, but the Service Account is a simple one. You create a programmatic user with a secret token that can be revoked. You give the token to CloudVolume, and it will act as that special user.

  1. Go to [this page](https://console.cloud.google.com/iam-admin/serviceaccounts) (you may have to select your project on the top-left to see anything).
  2. Click "Create Service Account" (near the top).
  3. Name the service account and click "Create".
  4. On the next page, you'll be able to grant it roles. Give it "Storage Object Admin", "Storage Object Creator", and "Storage Object Viewer" and click "Continue".
  5. On the next page, grant access to any users you'd like to use CloudVolume with the service account.
  6. On the same page, click "Create Key", select the JSON option, and download the private key to your local machine. This key grants access to your bucket! Protect it!

The key looks like this:

{
  "type": "service_account",
  "project_id": "$PROJECT_ID",
  "private_key_id": "$DIGITS",
  "private_key": "-----BEGIN PRIVATE KEY-----\n$LOTSOFRANDOMCHARACTERS----END PRIVATE KEY-----\n",
  "client_email": "$SERVICEACCOUNTNAME@$PROJECTNAME.iam.gserviceaccount.com",
  "client_id": "$DIGITS",
  "auth_uri": "https://accounts.google.com/o/oauth2/auth",
  "token_uri": "https://oauth2.googleapis.com/token",
  "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
  "client_x509_cert_url": "https://www.googleapis.com/robot/v1/metadata/x509/$SERVICEACCOUNTNAME%$PROJECTNAME.iam.gserviceaccount.com"
}

4. Grant Service Account Permissions
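If you granted the roles in the console during step 3 this is already done, but bucket-level access can also be granted from the command line with gsutil (installed in the "Configuring the Bucket for Neuroglancer" section below). A sketch, assuming the service account and bucket names from the key above are placeholders you'll substitute:

```shell
# Replace $SERVICEACCOUNTNAME, $PROJECTNAME, and $MYBUCKET with your own values.
gsutil iam ch \
  serviceAccount:$SERVICEACCOUNTNAME@$PROJECTNAME.iam.gserviceaccount.com:objectAdmin \
  gs://$MYBUCKET
```

This command requires an authenticated gsutil with permission to change the bucket's IAM policy.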

5. Configure CloudVolume with Secrets
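CloudVolume looks for Google credentials at `~/.cloudvolume/secrets/google-secret.json`. Create the secrets directory and place the key you downloaded in step 3 there:

```shell
# CloudVolume reads GCS credentials from ~/.cloudvolume/secrets/google-secret.json
mkdir -p ~/.cloudvolume/secrets
```

Then move your downloaded key into place (the source filename here is whatever you saved it as), e.g. `mv google-secret.json ~/.cloudvolume/secrets/google-secret.json`. After that, `CloudVolume("gs://$MYBUCKET/...")` paths should authenticate without further configuration.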

Configuring the Bucket for Neuroglancer

1. Download gsutil
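gsutil ships as part of the Google Cloud SDK. One common installation route (among several; see Google's install docs for your platform):

```shell
# Install the Google Cloud SDK, which includes gsutil:
curl https://sdk.cloud.google.com | bash
# Then authenticate and pick your project:
gcloud init
```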

2. Set Bucket to Public Read
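Neuroglancer fetches image chunks directly from the browser, so the objects it reads must be publicly readable. A sketch using gsutil, assuming `$MYBUCKET` is your bucket from step 2:

```shell
# Make existing objects in the bucket publicly readable:
gsutil iam ch allUsers:objectViewer gs://$MYBUCKET

# Optionally make future uploads default to public-read as well:
gsutil defacl set public-read gs://$MYBUCKET
```

Note that public read means anyone with the URL can download your data.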

3. Set CORS Headers
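Because Neuroglancer runs in the browser and fetches from a different origin than the bucket, GCS must return CORS headers on its responses. A `cors.json` along these lines should work; the exact origin and header lists here are assumptions you can tighten for your deployment:

```json
[
  {
    "origin": ["*"],
    "method": ["GET", "HEAD"],
    "responseHeader": ["Content-Type", "Range"],
    "maxAgeSeconds": 3600
  }
]
```

Apply it to the bucket with `gsutil cors set cors.json gs://$MYBUCKET`, and verify with `gsutil cors get gs://$MYBUCKET`.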