diff --git a/auth-mailchimp-sync/README.md b/auth-mailchimp-sync/README.md index 88406b86d..941dd5892 100644 --- a/auth-mailchimp-sync/README.md +++ b/auth-mailchimp-sync/README.md @@ -35,7 +35,7 @@ Usage of this extension also requires you to have a Mailchimp account. You are r **Configuration Parameters:** -* Deployment location: Where should the extension be deployed? For help selecting a location, refer to the [location selection guide](https://firebase.google.com/docs/functions/locations). +* Cloud Functions location: Where do you want to deploy the functions created for this extension? * Mailchimp API key: What is your Mailchimp API key? To obtain a Mailchimp API key, go to your [Mailchimp account](https://admin.mailchimp.com/account/api/). diff --git a/auth-mailchimp-sync/extension.yaml b/auth-mailchimp-sync/extension.yaml index e019d9f25..31455b57b 100644 --- a/auth-mailchimp-sync/extension.yaml +++ b/auth-mailchimp-sync/extension.yaml @@ -67,11 +67,9 @@ resources: params: - param: LOCATION type: select - label: Deployment location + label: Cloud Functions location description: >- - Where should the extension be deployed? For help selecting a location, - refer to the [location selection - guide](https://firebase.google.com/docs/functions/locations). + Where do you want to deploy the functions created for this extension? options: - label: Iowa (us-central1) value: us-central1 diff --git a/delete-user-data/CHANGELOG.md b/delete-user-data/CHANGELOG.md index 9f24a38b4..cb84dbae6 100644 --- a/delete-user-data/CHANGELOG.md +++ b/delete-user-data/CHANGELOG.md @@ -1,3 +1,7 @@ +## Version 0.1.3 + +feature - Support deletion of directories (issue #148). + ## Version 0.1.2 feature - Add a new param for recursively deleting subcollections in Cloud Firestore (issue #14). 
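The new directory-deletion feature in version 0.1.3 hinges on the extension's Storage path templates, in which `{UID}` stands for the deleted user's ID and `{DEFAULT}` for the project's default bucket, and deletion is keyed on a path prefix rather than a single file. A minimal sketch of how such a template might be resolved into a bucket name plus a deletion prefix (the helper and bucket names are illustrative, not the extension's actual code):

```javascript
// Resolve a configured Storage path template into a bucket name and an
// object prefix. `{UID}` marks the deleted user's ID; a leading `{DEFAULT}`
// segment marks the project's default bucket. Illustrative sketch only.
function resolveStoragePath(template, uid, defaultBucket) {
  const path = template.replace(/\{UID\}/g, uid);
  const parts = path.split("/");
  const bucket = parts[0] === "{DEFAULT}" ? defaultBucket : parts[0];
  // Everything after the bucket segment becomes the deletion prefix, so a
  // template like `{DEFAULT}/media/{UID}` covers a whole directory.
  return { bucket, prefix: parts.slice(1).join("/") };
}

// One file path and one directory path for a hypothetical user "abc123":
console.log(resolveStoragePath("{DEFAULT}/{UID}-pic.png", "abc123", "my-project.appspot.com"));
// { bucket: 'my-project.appspot.com', prefix: 'abc123-pic.png' }
console.log(resolveStoragePath("my-app-logs/media/{UID}", "abc123", "my-project.appspot.com"));
// { bucket: 'my-app-logs', prefix: 'media/abc123' }
```

Deleting every object under the resulting prefix (rather than one named file) is what makes directory paths work.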
diff --git a/delete-user-data/extension.yaml b/delete-user-data/extension.yaml index 3cae0e07c..edb8b2951 100644 --- a/delete-user-data/extension.yaml +++ b/delete-user-data/extension.yaml @@ -15,11 +15,11 @@ name: delete-user-data displayName: Delete User Data specVersion: v1beta -version: 0.1.2 +version: 0.1.3 description: - Deletes data keyed on a userId from Cloud Firestore, Realtime - Database, and/or Cloud Storage when a user deletes their account. + Deletes data keyed on a userId from Cloud Firestore, Realtime Database, and/or + Cloud Storage when a user deletes their account. license: Apache-2.0 billingRequired: false @@ -52,9 +52,10 @@ resources: - name: clearData type: firebaseextensions.v1beta.function description: - Listens for user accounts to be deleted from your project's authenticated users, - then removes any associated user data (based on Firebase Authentication's User ID) from - Realtime Database, Cloud Firestore, and/or Cloud Storage. + Listens for user accounts to be deleted from your project's authenticated + users, then removes any associated user data (based on Firebase + Authentication's User ID) from Realtime Database, Cloud Firestore, and/or + Cloud Storage. properties: sourceDirectory: . location: ${LOCATION} @@ -65,11 +66,12 @@ resources: params: - param: LOCATION type: select - label: Deployment location + label: Cloud Functions location description: >- - Where should the extension be deployed? You usually want a location close to your database. - For help selecting a location, refer to the - [location selection guide](https://firebase.google.com/docs/functions/locations). + Where do you want to deploy the functions created for this extension? + You usually want a location close to your database or Storage bucket. + For help selecting a location, refer to the [location selection + guide](https://firebase.google.com/docs/functions/locations). 
options: - label: Iowa (us-central1) value: us-central1 @@ -95,21 +97,23 @@ params: example: users/{UID},admins/{UID} required: false description: >- - Which paths in your Cloud Firestore instance contain user data? Leave empty if - you don't use Cloud Firestore. + Which paths in your Cloud Firestore instance contain user data? Leave + empty if you don't use Cloud Firestore. - Enter the full paths, separated by commas. You can represent the User ID of the deleted user with `{UID}`. + Enter the full paths, separated by commas. You can represent the User ID + of the deleted user with `{UID}`. - For example, if you have the collections `users` and `admins`, and each collection - has documents with User ID as document IDs, then you can enter `users/{UID},admins/{UID}`. + For example, if you have the collections `users` and `admins`, and each + collection has documents with User ID as document IDs, then you can enter + `users/{UID},admins/{UID}`. - param: FIRESTORE_DELETE_MODE type: select label: Cloud Firestore delete mode description: >- - (Only applicable if you use the `Cloud Firestore paths` parameter.) How do you want - to delete Cloud Firestore documents? To also delete documents in subcollections, - set this parameter to `recursive`. + (Only applicable if you use the `Cloud Firestore paths` parameter.) How do + you want to delete Cloud Firestore documents? To also delete documents in + subcollections, set this parameter to `recursive`. options: - label: Recursive value: recursive @@ -124,10 +128,11 @@ params: example: users/{UID},admins/{UID} required: false description: >- - Which paths in your Realtime Database instance contain user data? Leave empty if you - don't use Realtime Database. + Which paths in your Realtime Database instance contain user data? Leave + empty if you don't use Realtime Database. - Enter the full paths, separated by commas. You can represent the User ID of the deleted user with `{UID}`. + Enter the full paths, separated by commas. 
You can represent the User ID + of the deleted user with `{UID}`. For example: `users/{UID},admins/{UID}`. @@ -140,12 +145,14 @@ params: Where in Google Cloud Storage do you store user data? Leave empty if you don't use Cloud Storage. - Enter the full paths, separated by commas. You can represent the User ID of the deleted user with `{UID}`. - You can use `{DEFAULT}` to represent your default bucket. - - For example, if you are using your default bucket, - and the bucket has files with the naming scheme `{UID}-pic.png`, - then you can enter `{DEFAULT}/{UID}-pic.png`. - If you also have files in another bucket called `my-awesome-app-logs`, - and that bucket has files with the naming scheme `{UID}-logs.txt`, - then you can enter `{DEFAULT}/{UID}-pic.png,my-awesome-app-logs/{UID}-logs.txt`. + Enter the full paths to files or directories in your Storage buckets, + separated by commas. Use `{UID}` to represent the User ID of the deleted + user, and use `{DEFAULT}` to represent your default Storage bucket. + + Here's a series of examples. To delete all the files in your default + bucket with the file naming scheme `{UID}-pic.png`, enter + `{DEFAULT}/{UID}-pic.png`. To also delete all the files in another bucket + called my-app-logs with the file naming scheme `{UID}-logs.txt`, enter + `{DEFAULT}/{UID}-pic.png,my-app-logs/{UID}-logs.txt`. To *also* delete a User + ID-labeled directory and all its files (like `media/{UID}`), enter + `{DEFAULT}/{UID}-pic.png,my-app-logs/{UID}-logs.txt,{DEFAULT}/media/{UID}`. diff --git a/delete-user-data/functions/lib/index.js b/delete-user-data/functions/lib/index.js index 5c20b7d6b..c24f1827e 100644 --- a/delete-user-data/functions/lib/index.js +++ b/delete-user-data/functions/lib/index.js @@ -91,19 +91,20 @@ const clearStorageData = (storagePaths, uid) => __awaiter(void 0, void 0, void 0 const bucket = bucketName === "{DEFAULT}" ? 
admin.storage().bucket() : admin.storage().bucket(bucketName); - const file = bucket.file(parts.slice(1).join("/")); - const bucketFilePath = `${bucket.name}/${file.name}`; + const prefix = parts.slice(1).join("/"); try { - logs.storagePathDeleting(bucketFilePath); - yield file.delete(); - logs.storagePathDeleted(bucketFilePath); + logs.storagePathDeleting(prefix); + yield bucket.deleteFiles({ + prefix, + }); + logs.storagePathDeleted(prefix); } catch (err) { if (err.code === 404) { - logs.storagePath404(bucketFilePath); + logs.storagePath404(prefix); } else { - logs.storagePathError(bucketFilePath, err); + logs.storagePathError(prefix, err); } } })); diff --git a/delete-user-data/functions/src/index.ts b/delete-user-data/functions/src/index.ts index 1d30786b6..e6c96c3e3 100644 --- a/delete-user-data/functions/src/index.ts +++ b/delete-user-data/functions/src/index.ts @@ -92,17 +92,18 @@ const clearStorageData = async (storagePaths: string, uid: string) => { bucketName === "{DEFAULT}" ? admin.storage().bucket() : admin.storage().bucket(bucketName); - const file = bucket.file(parts.slice(1).join("/")); - const bucketFilePath = `${bucket.name}/${file.name}`; + const prefix = parts.slice(1).join("/"); try { - logs.storagePathDeleting(bucketFilePath); - await file.delete(); - logs.storagePathDeleted(bucketFilePath); + logs.storagePathDeleting(prefix); + await bucket.deleteFiles({ + prefix, + }); + logs.storagePathDeleted(prefix); } catch (err) { if (err.code === 404) { - logs.storagePath404(bucketFilePath); + logs.storagePath404(prefix); } else { - logs.storagePathError(bucketFilePath, err); + logs.storagePathError(prefix, err); } } }); diff --git a/firestore-bigquery-export/POSTINSTALL.md b/firestore-bigquery-export/POSTINSTALL.md index b76d78c0c..fc3cebb4b 100644 --- a/firestore-bigquery-export/POSTINSTALL.md +++ b/firestore-bigquery-export/POSTINSTALL.md @@ -13,13 +13,15 @@ You can test out this extension right away: 1. 
Query your **raw changelog table**, which should contain a single log of creating the `bigquery-mirror-test` document. ``` - SELECT * FROM `${param:PROJECT_ID}.${param:DATASET_ID}.${param:TABLE_ID}_raw_changelog` + SELECT * + FROM `${param:PROJECT_ID}.${param:DATASET_ID}.${param:TABLE_ID}_raw_changelog` ``` 1. Query your **latest view**, which should return the latest change event for the only document present -- `bigquery-mirror-test`. ``` - SELECT * FROM `${param:PROJECT_ID}.${param:DATASET_ID}.${param:TABLE_ID}_raw_latest` + SELECT * + FROM `${param:PROJECT_ID}.${param:DATASET_ID}.${param:TABLE_ID}_raw_latest` ``` 1. Delete the `bigquery-mirror-test` document from [Cloud Firestore](https://console.firebase.google.com/project/${param:PROJECT_ID}/database/firestore/data). @@ -28,9 +30,10 @@ The `bigquery-mirror-test` document will disappear from the **latest view** and 1. You can check the changelogs of a single document with this query: ``` - SELECT * FROM `${param:PROJECT_ID}.${param:DATASET_ID}.${param:TABLE_ID}_raw_changelog` - WHERE document_name = "bigquery-mirror-test" - ORDER BY TIMESTAMP ASC + SELECT * + FROM `${param:PROJECT_ID}.${param:DATASET_ID}.${param:TABLE_ID}_raw_changelog` + WHERE document_name = "bigquery-mirror-test" + ORDER BY TIMESTAMP ASC ``` ### Using the extension @@ -48,13 +51,17 @@ Note that this extension only listens for _document_ changes in the collection, This extension only sends the content of documents that have been changed -- it does not export your full dataset of existing documents into BigQuery. So, to backfill your BigQuery dataset with all the documents in your collection, you can run the import script provided by this extension. -The import script can read all existing documents in a Cloud Firestore collection and insert them into the raw changelog table created by this extension. The script adds a special changelog for each document with the operation of `IMPORT` and the timestamp of epoch. 
This is to ensure that any operation on an imported document supersedes the `IMPORT` +The import script can read all existing documents in a Cloud Firestore collection and insert them into the raw changelog table created by this extension. The script adds a special changelog for each document with the operation of `IMPORT` and the timestamp of epoch. This is to ensure that any operation on an imported document supersedes the `IMPORT`. -**Important:** Run the script over the entire collection _after_ installing this extension, otherwise all writes to your database during the import might be lost. +**Important:** Run the import script over the entire collection _after_ installing this extension, otherwise all writes to your database during the import might be lost. -You may pause and resume the script from the last batch at any point. +Learn more about using the import script to [backfill your existing collection](https://github.com/firebase/extensions/blob/master/firestore-bigquery-export/guides/IMPORT_EXISTING_DOCUMENTS.md). -Learn more about using this script to [backfill your existing collection](https://github.com/firebase/extensions/blob/master/firestore-bigquery-export/guides/IMPORT_EXISTING_DOCUMENTS.md). +### _(Optional)_ Generate schema views + +After your data is in BigQuery, you can use the schema-views script (provided by this extension) to create views that make it easier to query relevant data. You only need to provide a JSON schema file that describes your data structure, and the schema-views script will create the views. + +Learn more about using the schema-views script to [generate schema views](https://github.com/firebase/extensions/blob/master/firestore-bigquery-export/guides/GENERATE_SCHEMA_VIEWS.md). 
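The epoch timestamp is what makes the supersede rule work: the latest view keeps, per document, only the row with the greatest timestamp, so any real write outranks an `IMPORT` row. A toy model of that selection rule in plain JavaScript (for illustration only, not the extension's actual SQL):

```javascript
// Pick the latest change event per document: the row with the greatest
// timestamp wins, so an IMPORT row (epoch, timestamp 0) always loses to
// any real write on the same document. Toy model of the latest view.
function latestPerDocument(rows) {
  const latest = new Map();
  for (const row of rows) {
    const current = latest.get(row.document_name);
    if (!current || row.timestamp > current.timestamp) {
      latest.set(row.document_name, row);
    }
  }
  return [...latest.values()];
}

const rows = [
  { document_name: "user-1", operation: "IMPORT", timestamp: 0 },
  { document_name: "user-1", operation: "UPDATE", timestamp: 1577836800000 },
  { document_name: "user-2", operation: "IMPORT", timestamp: 0 },
];
console.log(latestPerDocument(rows).map((r) => `${r.document_name}:${r.operation}`));
// [ 'user-1:UPDATE', 'user-2:IMPORT' ]
```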
### Monitoring diff --git a/firestore-bigquery-export/PREINSTALL.md b/firestore-bigquery-export/PREINSTALL.md index 6eabf718d..8b94b2f73 100644 --- a/firestore-bigquery-export/PREINSTALL.md +++ b/firestore-bigquery-export/PREINSTALL.md @@ -16,11 +16,15 @@ Before installing this extension, you'll need to: + [Set up Cloud Firestore in your Firebase project.](https://firebase.google.com/docs/firestore/quickstart) + [Link your Firebase project to BigQuery.](https://support.google.com/firebase/answer/6318765) -This extension only sends the content of documents that have been changed -- it does not export your full dataset of existing documents into BigQuery. So, to backfill your BigQuery dataset with all the documents in your collection, you can run the import script provided by this extension. +#### Backfill your BigQuery dataset -**Important:** Run the script over the entire collection _after_ installing this extension, otherwise all writes to your database during the import might be lost. +This extension only sends the content of documents that have been changed -- it does not export your full dataset of existing documents into BigQuery. So, to backfill your BigQuery dataset with all the documents in your collection, you can run the [import script](https://github.com/firebase/extensions/blob/master/firestore-bigquery-export/guides/IMPORT_EXISTING_DOCUMENTS.md) provided by this extension. -Learn more about using this script to [backfill your existing collection](https://github.com/firebase/extensions/blob/master/firestore-bigquery-export/guides/IMPORT_EXISTING_DOCUMENTS.md). +**Important:** Run the import script over the entire collection _after_ installing this extension, otherwise all writes to your database during the import might be lost. 
+ +#### Generate schema views + +After your data is in BigQuery, you can run the [schema-views script](https://github.com/firebase/extensions/blob/master/firestore-bigquery-export/guides/GENERATE_SCHEMA_VIEWS.md) (provided by this extension) to create views that make it easier to query relevant data. You only need to provide a JSON schema file that describes your data structure, and the schema-views script will create the views. #### Billing diff --git a/firestore-bigquery-export/README.md b/firestore-bigquery-export/README.md index a149c02f1..48c6b0c39 100644 --- a/firestore-bigquery-export/README.md +++ b/firestore-bigquery-export/README.md @@ -22,11 +22,15 @@ Before installing this extension, you'll need to: + [Set up Cloud Firestore in your Firebase project.](https://firebase.google.com/docs/firestore/quickstart) + [Link your Firebase project to BigQuery.](https://support.google.com/firebase/answer/6318765) -This extension only sends the content of documents that have been changed -- it does not export your full dataset of existing documents into BigQuery. So, to backfill your BigQuery dataset with all the documents in your collection, you can run the import script provided by this extension. +#### Backfill your BigQuery dataset -**Important:** Run the script over the entire collection _after_ installing this extension, otherwise all writes to your database during the import might be lost. +This extension only sends the content of documents that have been changed -- it does not export your full dataset of existing documents into BigQuery. So, to backfill your BigQuery dataset with all the documents in your collection, you can run the [import script](https://github.com/firebase/extensions/blob/master/firestore-bigquery-export/guides/IMPORT_EXISTING_DOCUMENTS.md) provided by this extension. 
-Learn more about using this script to [backfill your existing collection](https://github.com/firebase/extensions/blob/master/firestore-bigquery-export/guides/IMPORT_EXISTING_DOCUMENTS.md). +**Important:** Run the import script over the entire collection _after_ installing this extension, otherwise all writes to your database during the import might be lost. + +#### Generate schema views + +After your data is in BigQuery, you can run the [schema-views script](https://github.com/firebase/extensions/blob/master/firestore-bigquery-export/guides/GENERATE_SCHEMA_VIEWS.md) (provided by this extension) to create views that make it easier to query relevant data. You only need to provide a JSON schema file that describes your data structure, and the schema-views script will create the views. #### Billing @@ -43,7 +47,7 @@ When you use Firebase Extensions, you're only charged for the underlying resourc **Configuration Parameters:** -* Deployment location: Where should the extension be deployed? You usually want a location close to your database. For help selecting a location, refer to the [location selection guide](https://firebase.google.com/docs/functions/locations). +* Cloud Functions location: Where do you want to deploy the functions created for this extension? You usually want a location close to your database. For help selecting a location, refer to the [location selection guide](https://firebase.google.com/docs/functions/locations). Note that this extension locates your BigQuery dataset in `us-central1`. * Collection path: What is the path of the collection that you would like to export? You may use `{wildcard}` notation to match a subcollection of all documents in a collection (for example: `chatrooms/{chatid}/posts`). 
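The `{wildcard}` notation in the collection-path parameter can be read as a per-segment pattern. A rough sketch of that matching idea (a hypothetical helper for illustration, not the extension's implementation):

```javascript
// Check whether a concrete collection path matches a template such as
// `chatrooms/{chatid}/posts`: segment counts must agree, and a `{...}`
// template segment matches any single path segment. Hypothetical helper.
function matchesCollectionPath(template, path) {
  const t = template.split("/");
  const p = path.split("/");
  if (t.length !== p.length) return false;
  return t.every((seg, i) => /^\{.+\}$/.test(seg) || seg === p[i]);
}

console.log(matchesCollectionPath("chatrooms/{chatid}/posts", "chatrooms/room42/posts")); // true
console.log(matchesCollectionPath("chatrooms/{chatid}/posts", "chatrooms/room42/members")); // false
```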
diff --git a/firestore-bigquery-export/extension.yaml b/firestore-bigquery-export/extension.yaml index 493574549..b919270fa 100644 --- a/firestore-bigquery-export/extension.yaml +++ b/firestore-bigquery-export/extension.yaml @@ -58,11 +58,13 @@ resources: params: - param: LOCATION type: select - label: Deployment location + label: Cloud Functions location description: >- - Where should the extension be deployed? You usually want a location close to your database. - For help selecting a location, refer to the - [location selection guide](https://firebase.google.com/docs/functions/locations). + Where do you want to deploy the functions created for this extension? + You usually want a location close to your database. For help selecting a + location, refer to the [location selection + guide](https://firebase.google.com/docs/functions/locations). + Note that this extension locates your BigQuery dataset in `us-central1`. options: - label: Iowa (us-central1) value: us-central1 diff --git a/firestore-bigquery-export/guides/EXAMPLE_QUERIES.md b/firestore-bigquery-export/guides/EXAMPLE_QUERIES.md new file mode 100644 index 000000000..210d251d0 --- /dev/null +++ b/firestore-bigquery-export/guides/EXAMPLE_QUERIES.md @@ -0,0 +1,110 @@ +These example queries are for use with the official Firebase Extension +[_Export Collections to BigQuery_](https://github.com/firebase/extensions/tree/master/firestore-bigquery-export) +and its associated [`fs-bq-schema-views` script](https://github.com/firebase/extensions/blob/master/firestore-bigquery-export/guides/GENERATE_SCHEMA_VIEWS.md) (referred to as the "schema-views script"). 
+ +The queries use the following parameter values from your installation of the extension: + +- `${param:PROJECT_ID}`: the project ID for the Firebase project in + which you installed the extension +- `${param:DATASET_ID}`: the ID that you specified for your dataset during + extension installation +- `${param:TABLE_ID}`: the common prefix of BigQuery views to generate + +**Note:** You can, at any time, run the schema-views script against additional schema files +to create different schema views over your raw changelog. When you settle on a fixed schema, +you can create a [scheduled query](https://cloud.google.com/bigquery/docs/scheduling-queries) +to transfer the columns reported by the schema view to a persistent backup table. + +Assume that you have a schema view matching the following configuration from a +schema file: + +``` +{ + "fields": [ + { + "name": "name", + "type": "string" + }, + { + "name":"favorite_numbers", + "type": "array" + }, + { + "name": "last_login", + "type": "timestamp" + }, + { + "name": "last_location", + "type": "geopoint" + }, + { + "fields": [ + { + "name": "name", + "type": "string" + } + ], + "name": "friends", + "type": "map" + } + ] +} +``` + +### Example query for a timestamp + +You can generate a listing of users that have logged in to the app as follows: + +``` +SELECT name, last_login +FROM ${param:PROJECT_ID}.${param:DATASET_ID}.${param:TABLE_ID}_schema_${SCHEMA_FILE_NAME}_latest +ORDER BY last_login DESC +``` + +In this query, note the following: + +- `${SCHEMA_FILE_NAME}` is the name of the schema file that you + provided as an argument to run the schema-views script. + +- The `last_login` column contains data that is stored in the `data` + column of the raw changelog. The type conversion and view generation is + performed for you by the + [_Export Collections to BigQuery_](https://github.com/firebase/extensions/tree/master/firestore-bigquery-export) + extension. 
+
+### Example queries for an array
+
+The example schema configuration (see above) stores each user's favorite
+numbers in a Cloud Firestore array called `favorite_numbers`. Here are some
+example queries for that data:
+
+- If you wanted to determine how many favorite numbers each user
+  currently has, then you can run the following query:
+
+  ```
+  SELECT document_name, MAX(favorite_numbers_index) + 1 AS favorite_numbers_count
+  FROM ${param:PROJECT_ID}.users.users_schema_user_full_schema_latest
+  GROUP BY document_name
+  ```
+
+  (The `+ 1` converts the highest zero-based array index into a count.)
+
+- If you wanted to determine what the current favorite numbers of the
+  app's users are (assuming the favorite number is stored in the first
+  position of the `favorite_numbers` array), you can run the following query:
+
+  ```
+  SELECT document_name, favorite_numbers_member
+  FROM ${param:PROJECT_ID}.users.users_schema_user_full_schema_latest
+  WHERE favorite_numbers_index = 0
+  ```
+
+### Example query if you have multiple arrays
+
+If you had multiple arrays in the schema configuration, you might have to select
+all `DISTINCT` documents to eliminate the redundant rows introduced by the
+Cartesian product of `CROSS JOIN`.
+
+```
+SELECT DISTINCT document_name, favorite_numbers_member
+FROM ${param:PROJECT_ID}.users.users_schema_user_full_schema_latest
+WHERE favorite_numbers_index = 0
+```
diff --git a/firestore-bigquery-export/guides/GENERATE_SCHEMA_VIEWS.md b/firestore-bigquery-export/guides/GENERATE_SCHEMA_VIEWS.md
new file mode 100644
index 000000000..757d2cdbf
--- /dev/null
+++ b/firestore-bigquery-export/guides/GENERATE_SCHEMA_VIEWS.md
@@ -0,0 +1,379 @@
+The `fs-bq-schema-views` script is for use with the official Firebase Extension
+[_Export Collections to BigQuery_](https://github.com/firebase/extensions/tree/master/firestore-bigquery-export).
+
+## Overview
+
+The `fs-bq-schema-views` script (referred to as the "schema-views script")
+generates richly-typed BigQuery views of your raw changelog.
+ +The _Export Collections to BigQuery_ extension only mirrors raw data, but it +doesn't apply schemas or types. This decoupling makes schema validation less +risky because no data can be lost due to schema mismatch or unknown fields. + +The schema-views script creates a BigQuery view, based on a JSON schema +configuration file, using +[BigQuery's built-in JSON functions](https://cloud.google.com/bigquery/docs/reference/standard-sql/json_functions). +The _Export Collections to BigQuery_ extension also provides some BigQuery +[user-defined functions](https://github.com/firebase/extensions/blob/master/firestore-bigquery-export/scripts/gen-schema-view/src/udf.ts) +that are helpful in converting Firestore document properties to richly-typed +BigQuery cells. + +## Use the script + +The following steps are an example of how to use the schema-views script. In the +sections at the end of this file, you can review detailed information about +configuring a schema file and reviewing the resulting schema views. + +### Step 1: Create a schema file + +The schema-views script runs against "schema files" which specify your schema +configurations in a JSON format. + +In any directory, create a schema file called `test_schema.json` +that contains the following: + +``` +{ + "fields": [ + { + "name": "name", + "type": "string" + }, + { + "name": "age", + "type": "number" + } + ] +} +``` + +Learn [How to configure schema files](#how-to-configure-schema-files) +later in this guide. + +### Step 2: Set up credentials + +The schema-views script uses Application Default Credentials to communicate with +BigQuery. + +One way to set up these credentials is to run the following command using the +[gcloud](https://cloud.google.com/sdk/gcloud/) CLI: + +``` +$ gcloud auth application-default login +``` + +Alternatively, you can +[create and use a service account](https://cloud.google.com/docs/authentication/production#obtaining_and_providing_service_account_credentials_manually). 
+This service account must be assigned a role that grants the permission of +`bigquery.jobs.create`, like the ["BigQuery Job User" role](https://cloud.google.com/iam/docs/understanding-roles#bigquery-roles). + +### Step 3: Run the script + +The schema-views script uses the following parameter values from your +installation of the extension: + +- `${param:PROJECT_ID}`: the project ID for the Firebase project in + which you installed the extension +- `${param:DATASET_ID}`: the ID that you specified for your dataset during + extension installation +- `${param:TABLE_ID}`: the common prefix of BigQuery views to generate + +Run the schema-views script using +[`npx` (the Node Package Runner)](https://github.com/npm/npx#npx1----execute-npm-package-binaries) +via `npm` (the Node Package Manager). + +1. Make sure that you've installed the required tools to run the + schema-views script: + + - To access the `npm` command tools, you need to install + [Node.js](https://www.nodejs.org/). + - If you use npm v5.1 or earlier, you need to explicitly install `npx`. + Run `npm install --global npx`. + +1. Run the schema-views script via `npx` by running the following command: + + ``` + $ npx @firebaseextensions/fs-bq-schema-views \ + --non-interactive \ + --project=${param:PROJECT_ID} \ + --dataset=${param:DATASET_ID} \ + --table-name-prefix=${param:TABLE_ID} \ + --schema-files=./test_schema.json + ``` + + **Note:** You can run the schema-views script from any directory, but + you need to specify the path to your schema file using the `--schema-files` + flag. To run the schema-views script against multiple schema files, specify + each file in a comma-separated list + (for example: `--schema-files=./test_schema.json,./test_schema2.json`). + +### Step 4: View results + +1. 
In the [BigQuery web UI](https://console.cloud.google.com/bigquery), + navigate to the generated schema changelog view: + `https://console.cloud.google.com/bigquery?project=${param:PROJECT_ID}&p=${param:PROJECT_ID}&d=${param:DATASET_ID}&t=${param:TABLE_ID}_schema_test_schema_changelog&page=table`. + + This view allows you to query document change events by fields specified in + the schema. + +1. In the [Firebase console](https://console.firebase.google.com/), + go to the Cloud Firestore section, + then create a document called `test-schema-document` with two fields: + + - A field of type `string` called "name" + - A field of type `number` called "age" + +1. Back in BigQuery, run the following query in the schema changelog + view (that is, `https://console.cloud.google.com/bigquery?project=${param:PROJECT_ID}&p=${param:PROJECT_ID}&d=${param:DATASET_ID}&t=${param:TABLE_ID}_schema_test_schema_changelog&page=table`): + + ``` + SELECT document_name, name, age + FROM ${param:PROJECT_ID}.${param:DATASET_ID}.${param:TABLE_ID}_schema_test_schema_changelog + WHERE document_name = "test-schema-document" + ``` + +1. Go back to the Cloud Firestore section of the console, then change + the type of the "age" field to be a string. + +1. Back in BigQuery, re-run the following query: + + ``` + SELECT document_name, name, age + FROM ${param:PROJECT_ID}.${param:DATASET_ID}.${param:TABLE_ID}_schema_test_schema_changelog + WHERE document_name = "test-schema-document" + ``` + + You'll see a new change with a `null` age column. If you query documents + that don't match the schema, then the view contains null values for the + corresponding schema fields. + +1. Back in the Cloud Firestore section in the console, delete + `test-schema-document`. + +1. 
_(Optional)_ As with the raw views, you can also query the latest
+   change events for the documents currently in the collection by using the
+   latest schema view
+   (that is, `https://console.cloud.google.com/bigquery?project=${param:PROJECT_ID}&p=${param:PROJECT_ID}&d=${param:DATASET_ID}&t=${param:TABLE_ID}_schema_test_schema_latest&page=table`).
+
+   Back in BigQuery, if you run the following query, you'll receive no
+   results because the document no longer exists in the Cloud Firestore
+   collection.
+
+   ```
+   SELECT document_name, name, age
+   FROM ${param:PROJECT_ID}.${param:DATASET_ID}.${param:TABLE_ID}_schema_test_schema_latest
+   WHERE document_name = "test-schema-document"
+   ```
+
+### Next Steps
+
+- [Create your own schema files](#how-to-configure-schema-files)
+- [Troubleshoot common issues](#common-schema-file-configuration-mistakes)
+- [Learn about the columns in a schema view](#columns-in-a-schema-view)
+- [Take a look at more SQL examples](https://github.com/firebase/extensions/blob/master/firestore-bigquery-export/guides/EXAMPLE_QUERIES.md)
+
+## How to configure schema files
+
+To generate schema views of your raw changelog, you must create at least one
+schema JSON file.
+
+Here's an example of a configuration that a schema file might contain:
+
+```
+{
+  "fields": [
+    {
+      "name": "name",
+      "type": "string"
+    },
+    {
+      "name": "favorite_numbers",
+      "type": "array"
+    },
+    {
+      "name": "last_login",
+      "type": "timestamp"
+    },
+    {
+      "name": "last_location",
+      "type": "geopoint"
+    },
+    {
+      "fields": [
+        {
+          "name": "name",
+          "type": "string"
+        }
+      ],
+      "name": "friends",
+      "type": "map"
+    }
+  ]
+}
+```
+
+The root of the configuration must have a `fields` array that contains objects
+which describe the elements in the schema. If one of the objects is of type
+`map`, it must specify its own `fields` array describing the members of that
+map.
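Because a malformed schema file prevents the schema-views script from generating views, a quick pre-flight check of the JSON can save a round trip. Here is a rough sketch under the rules described above (a required `fields` array, known Firestore type names, and nested `fields` for each `map`); this helper is illustrative and not part of the script:

```javascript
// Minimal pre-flight check for a schema file: every field needs a name and
// a known Firestore type, and every `map` needs its own nested `fields`
// array. Illustrative only -- not part of the fs-bq-schema-views script.
const KNOWN_TYPES = new Set([
  "string", "array", "map", "boolean", "number",
  "timestamp", "geopoint", "reference", "null",
]);

function validateFields(fields, errors, path = "") {
  if (!Array.isArray(fields) || fields.length === 0) {
    errors.push(`${path || "root"}: expected a non-empty "fields" array`);
    return errors;
  }
  for (const field of fields) {
    const where = path ? `${path}.${field.name || "?"}` : field.name || "?";
    if (!field.name) errors.push(`${where}: missing "name"`);
    if (!KNOWN_TYPES.has(field.type)) errors.push(`${where}: unknown type "${field.type}"`);
    if (field.type === "map") validateFields(field.fields, errors, where);
  }
  return errors;
}

// A `map` field missing its nested `fields` array is flagged:
console.log(validateFields([{ name: "friends", type: "map" }], []));
// [ 'friends: expected a non-empty "fields" array' ]
```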
+ +Each `fields` array must contain _at least one_ of the following types: + +- `string` +- `array` +- `map` +- `boolean` +- `number` +- `timestamp` +- `geopoint` +- `reference` +- `null` + +These types correspond with Cloud Firestore's +[supported data types](https://firebase.google.com/docs/firestore/manage-data/data-types). +Make sure that the types that you specify match the types of the fields in your +Cloud Firestore collection. + +You may create any number of schema files to use with the schema-views script. +The schema-views script generates the following views for _each_ schema file: + +- `${param:PROJECT_ID}.${param:DATASET_ID}.${param:TABLE_ID}_schema_${SCHEMA_FILE_NAME}_changelog` +- `${param:PROJECT_ID}.${param:DATASET_ID}.${param:TABLE_ID}_schema_${SCHEMA_FILE_NAME}_latest` + +Here, `${SCHEMA_FILE_NAME}` is the name of each schema file that you provided as +an argument to run the schema-views script. + +### Common schema file configuration mistakes + +Be aware of the following common mistakes when configuring a schema file: + + + + + + + + + + + + + + + + + + + + + + + + + + +
| Mistake in schema file config | Outcome of mistake |
| --- | --- |
| Omitting a relevant field | The generated view will not contain a column for that field. |
| Specifying the wrong type for a relevant field | Type conversion (see [Columns in a schema view](#columns-in-a-schema-view)) will fail, and the resulting column will contain a BigQuery `null` value in lieu of the desired value. |
| Specifying a schema field that doesn't exist in the underlying raw changelog | Querying the column for that field will return a BigQuery `null` value instead of the desired value. |
| Writing invalid JSON | The schema-views script cannot generate a view. |
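Two of these mistakes — invalid JSON and unsupported types — can be caught with a quick validation pass before running the script. A minimal sketch (this helper and its messages are illustrative, not part of the extension):

```python
import json

# Types accepted in a schema file, per the list above.
ALLOWED_TYPES = {"string", "array", "map", "boolean", "number",
                 "timestamp", "geopoint", "reference", "null"}

def validate_schema(text):
    """Return a list of problems found in a schema file's JSON text."""
    problems = []
    try:
        schema = json.loads(text)
    except json.JSONDecodeError as err:
        return [f"invalid JSON: {err.msg}"]

    def check(fields, path):
        for field in fields:
            name = field.get("name", "<unnamed>")
            ftype = field.get("type")
            if ftype not in ALLOWED_TYPES:
                problems.append(f"{path}{name}: unsupported type {ftype!r}")
            if ftype == "map":
                # Maps must describe their own members, so recurse.
                check(field.get("fields", []), f"{path}{name}.")

    check(schema.get("fields", []), "")
    return problems

print(validate_schema('{"fields": [{"name": "age", "type": "integer"}]}'))
# ["age: unsupported type 'integer'"]
```

Note that a type mismatch against the actual Firestore data (the second mistake in the table) can't be caught this way — only by inspecting the collection itself.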
Since all document data is stored in the schemaless changelog, mistakes in
schema configuration don't affect the underlying data and can be resolved by
re-running the schema-views script against an updated schema file.

## About Schema Views

### Views created by the script

- `${param:PROJECT_ID}.${param:DATASET_ID}.${param:TABLE_ID}_schema_${SCHEMA_FILE_NAME}_changelog`

  This view contains all rows present in the raw changelog. It is analogous
  to the raw changelog, only it has typed columns corresponding to fields of
  the schema.

- `${param:PROJECT_ID}.${param:DATASET_ID}.${param:TABLE_ID}_schema_${SCHEMA_FILE_NAME}_latest`

  This view stores typed rows for the documents currently in the collection.
  It is analogous to the "latest" view on the raw changelog, only it includes
  the typed columns corresponding to fields in the corresponding schema file.

  Since `GEOGRAPHY` fields are not groupable entities in BigQuery (and the
  query which builds the latest view of a collection of documents requires
  grouping on the schema columns), the latest schema view omits any
  `GEOGRAPHY` columns and, instead, splits them out into two `NUMERIC`
  columns called `${FIRESTORE_GEOPOINT}_latitude` and
  `${FIRESTORE_GEOPOINT}_longitude`.

### Columns in a schema view

Each schema view carries with it the following fields from the raw changelog:

- `document_name STRING REQUIRED`
- `timestamp TIMESTAMP REQUIRED`
- `operation STRING REQUIRED`

The remaining columns correspond to fields of the schema and are assigned types
based on the corresponding Cloud Firestore types those fields have. With the
exception of `map` and `array`, the type conversion scheme is as follows:
| Cloud Firestore type | BigQuery type |
| --- | --- |
| string | STRING |
| boolean | BOOLEAN |
| number | NUMERIC |
| timestamp | TIMESTAMP |
| geopoint | GEOGRAPHY |
| reference | STRING (containing the path to the referenced document) |
| null | NULL |
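The conversion scheme above can be expressed as a simple lookup, which is handy when documenting or sanity-checking a schema. A sketch for illustration (not code from the extension; `map` and `array` are handled structurally, so they have no single-column entry here):

```python
# The Cloud Firestore -> BigQuery type conversion scheme from the table
# above, expressed as a lookup.
FIRESTORE_TO_BIGQUERY = {
    "string": "STRING",
    "boolean": "BOOLEAN",
    "number": "NUMERIC",
    "timestamp": "TIMESTAMP",
    "geopoint": "GEOGRAPHY",
    "reference": "STRING",  # contains the path to the referenced document
    "null": "NULL",
}

def column_for(field):
    """Describe the BigQuery column a scalar schema field produces."""
    return f'{field["name"]} {FIRESTORE_TO_BIGQUERY[field["type"]]}'

print(column_for({"name": "last_login", "type": "timestamp"}))
# last_login TIMESTAMP
```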
#### Cloud Firestore maps

Cloud Firestore maps are interpreted recursively. If you include a map in your
schema configuration, the resulting view will contain columns for whatever
fields that map contains. If the map doesn't contain any fields, the map is
ignored by the schema-views script.

#### Cloud Firestore arrays

Review [these examples](https://github.com/firebase/extensions/blob/master/firestore-bigquery-export/guides/EXAMPLE_QUERIES.md#example-queries-for-an-array) for querying an array.

Cloud Firestore arrays are
[unnested](https://cloud.google.com/bigquery/docs/reference/standard-sql/query-syntax#unnest),
so all array fields of the document are
[cross joined](https://cloud.google.com/bigquery/docs/reference/standard-sql/query-syntax#cross-join)
in the output table. The view retains the member and offset columns, which are
called `${FIRESTORE_ARRAY_NAME}_member` and `${FIRESTORE_ARRAY_NAME}_index`,
respectively. To make querying easier, the view includes these two columns
instead of the original `ARRAY` value field.

If the array is empty, it will be ignored by the schema-views script.

diff --git a/firestore-counter/CHANGELOG.md b/firestore-counter/CHANGELOG.md
index db0bbc1b6..aeecccbed 100644
--- a/firestore-counter/CHANGELOG.md
+++ b/firestore-counter/CHANGELOG.md
@@ -1,3 +1,7 @@
+## Version 0.1.2
+
+feature - Limit shards to 100 documents, to optimize performance.
+
 ## Version 0.1.1
 
 changed - Moves the logic for monitoring the extension's workload from the existing HTTP function to a new Pub/Sub controllerCore function. Now, if called, the HTTP function triggers the new `controllerCore` function instead. This change was made to accommodate a [change in the way Google Cloud Functions handles HTTP functions](https://cloud.google.com/functions/docs/securing/managing-access#allowing_unauthenticated_function_invocation).
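The array unnesting described in the schema-views guide above can be pictured with a small sketch in plain Python, independent of BigQuery: one array field fans a document out into one row per element, with `_member` and `_index` columns (column names follow the guide; the `unnest` helper itself is illustrative).

```python
# Sketch of how unnesting an array field fans a document out into one row
# per element, mirroring the `${FIRESTORE_ARRAY_NAME}_member` and
# `${FIRESTORE_ARRAY_NAME}_index` columns described in the guide above.
def unnest(document, array_field):
    rows = []
    for index, member in enumerate(document.get(array_field, [])):
        # Each output row keeps the document's other fields and replaces
        # the array with one member/index pair.
        row = {k: v for k, v in document.items() if k != array_field}
        row[f"{array_field}_member"] = member
        row[f"{array_field}_index"] = index
        rows.append(row)
    return rows  # an empty array yields no rows, i.e. it is ignored

doc = {"document_name": "alice", "favorite_numbers": [7, 12]}
for row in unnest(doc, "favorite_numbers"):
    print(row)  # two rows: members 7 and 12 at indexes 0 and 1
```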
diff --git a/firestore-counter/README.md b/firestore-counter/README.md
index 80bb4e39e..d6011ff0a 100644
--- a/firestore-counter/README.md
+++ b/firestore-counter/README.md
@@ -24,7 +24,7 @@ Before installing this extension, make sure that you've [set up a Cloud Firestor
 After installing this extension, you'll need to:
 
 - Update your [database security rules](https://firebase.google.com/docs/rules).
-- Set up a [scheduled function](https://firebase.google.com/docs/functions/schedule-functions) to regularly call the controller function, which is created by this extension and monitors the extension's workload.
+- Set up a [Cloud Scheduler job](https://cloud.google.com/scheduler/docs/quickstart) to regularly call the controllerCore function, which is created by this extension. It works by either aggregating shards itself or scheduling and monitoring workers to aggregate shards.
 - Install the provided [Counter SDK](https://github.com/firebase/extensions/blob/master/firestore-counter/clients/web/src/index.ts) in your app. You can then use this library in your code to specify your document path and increment values.
 
 Detailed information for these post-installation tasks are provided after you install this extension.
@@ -44,7 +44,7 @@ When you use Firebase Extensions, you're only charged for the underlying resourc
 **Configuration Parameters:**
 
-* Deployment location: Where should the extension be deployed? You usually want a location close to your database. For help selecting a location, refer to the [location selection guide](https://firebase.google.com/docs/functions/locations).
+* Cloud Functions location: Where do you want to deploy the functions created for this extension? You usually want a location close to your database. For help selecting a location, refer to the [location selection guide](https://firebase.google.com/docs/functions/locations).
 
 * Document path for internal state: What is the path to the document where the extension can keep its internal state?
@@ -52,11 +52,13 @@ When you use Firebase Extensions, you're only charged for the underlying resourc
 **Cloud Functions:**
 
-* **controller:** Scheduled to run every minute. This function either aggregates shards itself, or it schedules and monitors workers to aggregate shards.
+* **controllerCore:** Scheduled to run every minute. This function either aggregates shards itself, or it schedules and monitors workers to aggregate shards.
+
+* **controller:** Maintained for backwards compatibility. This function relays a message to the extension's Pub/Sub topic to trigger the controllerCore function.
 
 * **onWrite:** Listens for changes on counter shards that may need aggregating. This function is limited to max 1 instance.
 
-* **worker:** Monitors a range of shards and aggregates them, as needed. There may be 0 or more worker functions running at any point in time. The controller function is responsible for scheduling and monitoring these workers.
+* **worker:** Monitors a range of shards and aggregates them, as needed. There may be 0 or more worker functions running at any point in time. The controllerCore function is responsible for scheduling and monitoring these workers.
 
@@ -67,3 +69,5 @@ When you use Firebase Extensions, you're only charged for the underlying resourc
 This extension will operate with the following project IAM roles:
 
 * datastore.user (Reason: Allows the extension to aggregate Cloud Firestore counter shards.)
+
+* pubsub.publisher (Reason: Allows the HTTPS controller function to publish a message to the extension's Pub/Sub topic, which triggers the controllerCore function.)
diff --git a/firestore-counter/extension.yaml b/firestore-counter/extension.yaml
index 3a077208e..90f230736 100644
--- a/firestore-counter/extension.yaml
+++ b/firestore-counter/extension.yaml
@@ -15,7 +15,7 @@ name: firestore-counter
 displayName: Distributed Counter
 specVersion: v1beta
 
-version: 0.1.1
+version: 0.1.2
 description:
   Records event counters at scale to accommodate high-velocity writes to
   Cloud Firestore.
@@ -96,11 +96,12 @@ resources:
 params:
   - param: LOCATION
     type: select
-    label: Deployment location
+    label: Cloud Functions location
     description: >-
-      Where should the extension be deployed? You usually want a location close to your database.
-      For help selecting a location, refer to the
-      [location selection guide](https://firebase.google.com/docs/functions/locations).
+      Where do you want to deploy the functions created for this extension?
+      You usually want a location close to your database. For help selecting a
+      location, refer to the [location selection
+      guide](https://firebase.google.com/docs/functions/locations).
     options:
       - label: Iowa (us-central1)
         value: us-central1
diff --git a/firestore-counter/functions/lib/worker.js b/firestore-counter/functions/lib/worker.js
index 79f1aacfa..3321b0189 100644
--- a/firestore-counter/functions/lib/worker.js
+++ b/firestore-counter/functions/lib/worker.js
@@ -30,7 +30,7 @@ const planner_1 = require("./planner");
 const aggregator_1 = require("./aggregator");
 const firestore_1 = require("@google-cloud/firestore");
 const uuid = require("uuid");
-const SHARDS_LIMIT = 499;
+const SHARDS_LIMIT = 100;
 const WORKER_TIMEOUT_MS = 45000;
 /**
  * Worker is controlled by WorkerMetadata document stored at process.env.MODS_INTERNAL_COLLECTION
diff --git a/firestore-send-email/README.md b/firestore-send-email/README.md
index c23695ef1..cc90d7059 100644
--- a/firestore-send-email/README.md
+++ b/firestore-send-email/README.md
@@ -44,7 +44,7 @@ Usage of this extension also requires you to have SMTP credentials for mail deli
 **Configuration Parameters:**
 
-* Deployment location: Where should the extension be deployed? You usually want a location close to your database. For help selecting a location, refer to the [location selection guide](https://firebase.google.com/docs/functions/locations).
+* Cloud Functions location: Where do you want to deploy the functions created for this extension? You usually want a location close to your database. For help selecting a location, refer to the [location selection guide](https://firebase.google.com/docs/functions/locations).
 
 * SMTP connection URI: A URI representing an SMTP server that this extension can use to deliver email.
diff --git a/firestore-send-email/extension.yaml b/firestore-send-email/extension.yaml
index 0eae82aa7..13c263eb8 100644
--- a/firestore-send-email/extension.yaml
+++ b/firestore-send-email/extension.yaml
@@ -56,11 +56,12 @@ resources:
 params:
   - param: LOCATION
     type: select
-    label: Deployment location
+    label: Cloud Functions location
     description: >-
-      Where should the extension be deployed? You usually want a location close to your database.
-      For help selecting a location, refer to the
-      [location selection guide](https://firebase.google.com/docs/functions/locations).
+      Where do you want to deploy the functions created for this extension?
+      You usually want a location close to your database. For help selecting a
+      location, refer to the [location selection
+      guide](https://firebase.google.com/docs/functions/locations).
     options:
       - label: Iowa (us-central1)
         value: us-central1
diff --git a/firestore-shorten-urls-bitly/README.md b/firestore-shorten-urls-bitly/README.md
index fed386a0f..82e6b559e 100644
--- a/firestore-shorten-urls-bitly/README.md
+++ b/firestore-shorten-urls-bitly/README.md
@@ -37,7 +37,7 @@ Usage of this extension also requires you to have a Bit.ly account. You are resp
 **Configuration Parameters:**
 
-* Deployment location: Where should the extension be deployed? You usually want a location close to your database. For help selecting a location, refer to the [location selection guide](https://firebase.google.com/docs/functions/locations).
+* Cloud Functions location: Where do you want to deploy the functions created for this extension? You usually want a location close to your database. For help selecting a location, refer to the [location selection guide](https://firebase.google.com/docs/functions/locations).
 
 * Bitly access token: What is your Bitly access token? Generate this access token using [Bitly](https://bitly.com/a/oauth_apps).
diff --git a/firestore-shorten-urls-bitly/extension.yaml b/firestore-shorten-urls-bitly/extension.yaml
index 9b645ac68..78928bfde 100644
--- a/firestore-shorten-urls-bitly/extension.yaml
+++ b/firestore-shorten-urls-bitly/extension.yaml
@@ -57,11 +57,12 @@ resources:
 params:
   - param: LOCATION
     type: select
-    label: Deployment location
+    label: Cloud Functions location
    description: >-
-      Where should the extension be deployed? You usually want a location close to your database.
-      For help selecting a location, refer to the
-      [location selection guide](https://firebase.google.com/docs/functions/locations).
+      Where do you want to deploy the functions created for this extension?
+      You usually want a location close to your database. For help selecting a
+      location, refer to the [location selection
+      guide](https://firebase.google.com/docs/functions/locations).
     options:
       - label: Iowa (us-central1)
         value: us-central1
diff --git a/firestore-translate-text/README.md b/firestore-translate-text/README.md
index 854ac5cae..8bc673ad5 100644
--- a/firestore-translate-text/README.md
+++ b/firestore-translate-text/README.md
@@ -34,7 +34,7 @@ When you use Firebase Extensions, you're only charged for the underlying resourc
 **Configuration Parameters:**
 
-* Deployment location: Where should the extension be deployed? You usually want a location close to your database. For help selecting a location, refer to the [location selection guide](https://firebase.google.com/docs/functions/locations).
+* Cloud Functions location: Where do you want to deploy the functions created for this extension? You usually want a location close to your database. For help selecting a location, refer to the [location selection guide](https://firebase.google.com/docs/functions/locations).
 
 * Target languages for translations, as a comma-separated list: Into which target languages do you want to translate new strings? The languages are identified using ISO-639-1 codes in a comma-separated list, for example: en,es,de,fr. For these codes, visit the [supported languages list](https://cloud.google.com/translate/docs/languages).
diff --git a/firestore-translate-text/extension.yaml b/firestore-translate-text/extension.yaml
index 1d2b68c22..560c9deea 100644
--- a/firestore-translate-text/extension.yaml
+++ b/firestore-translate-text/extension.yaml
@@ -59,11 +59,12 @@ resources:
 params:
   - param: LOCATION
     type: select
-    label: Deployment location
+    label: Cloud Functions location
     description: >-
-      Where should the extension be deployed? You usually want a location close to your database.
-      For help selecting a location, refer to the
-      [location selection guide](https://firebase.google.com/docs/functions/locations).
+      Where do you want to deploy the functions created for this extension?
+      You usually want a location close to your database. For help selecting a
+      location, refer to the [location selection
+      guide](https://firebase.google.com/docs/functions/locations).
     options:
       - label: Iowa (us-central1)
         value: us-central1
diff --git a/rtdb-limit-child-nodes/README.md b/rtdb-limit-child-nodes/README.md
index ac7d3119f..45b32d9f5 100644
--- a/rtdb-limit-child-nodes/README.md
+++ b/rtdb-limit-child-nodes/README.md
@@ -26,7 +26,7 @@ When you use Firebase Extensions, you're only charged for the underlying resourc
 **Configuration Parameters:**
 
-* Deployment location: Where should the extension be deployed? You usually want a location close to your database. Realtime Database instances are located in us-central1. For help selecting a location, refer to the [location selection guide](https://firebase.google.com/docs/functions/locations).
+* Cloud Functions location: Where do you want to deploy the functions created for this extension? You usually want a location close to your database. Realtime Database instances are located in `us-central1`. For help selecting a location, refer to the [location selection guide](https://firebase.google.com/docs/functions/locations).
 
 * Realtime Database path: What is the Realtime Database path for which you want to limit the number of child nodes?
diff --git a/rtdb-limit-child-nodes/extension.yaml b/rtdb-limit-child-nodes/extension.yaml
index 92f5e8ced..db488887a 100644
--- a/rtdb-limit-child-nodes/extension.yaml
+++ b/rtdb-limit-child-nodes/extension.yaml
@@ -54,12 +54,13 @@ resources:
 params:
   - param: LOCATION
     type: select
-    label: Deployment location
+    label: Cloud Functions location
     description: >-
-      Where should the extension be deployed? You usually want a location close to your database.
-      Realtime Database instances are located in us-central1.
-      For help selecting a location, refer to the
-      [location selection guide](https://firebase.google.com/docs/functions/locations).
+      Where do you want to deploy the functions created for this extension?
+      You usually want a location close to your database. Realtime Database
+      instances are located in `us-central1`. For help selecting a
+      location, refer to the [location selection
+      guide](https://firebase.google.com/docs/functions/locations).
     options:
       - label: Iowa (us-central1)
         value: us-central1
diff --git a/storage-resize-images/README.md b/storage-resize-images/README.md
index 74e308284..37f622a86 100644
--- a/storage-resize-images/README.md
+++ b/storage-resize-images/README.md
@@ -40,7 +40,7 @@ When you use Firebase Extensions, you're only charged for the underlying resourc
 **Configuration Parameters:**
 
-* Deployment location: Where should the extension be deployed? You usually want a location close to your Storage bucket. For help selecting a location, refer to the [location selection guide](https://firebase.google.com/docs/functions/locations).
+* Cloud Functions location: Where do you want to deploy the functions created for this extension? You usually want a location close to your Storage bucket. For help selecting a location, refer to the [location selection guide](https://firebase.google.com/docs/functions/locations).
 
 * Cloud Storage bucket for images: To which Cloud Storage bucket will you upload images that you want to resize? Resized images will be stored in this bucket. Depending on your extension configuration, original images are either kept or deleted.
diff --git a/storage-resize-images/extension.yaml b/storage-resize-images/extension.yaml
index ec3ceed96..341ba2f29 100644
--- a/storage-resize-images/extension.yaml
+++ b/storage-resize-images/extension.yaml
@@ -65,11 +65,12 @@ resources:
 params:
   - param: LOCATION
     type: select
-    label: Deployment location
+    label: Cloud Functions location
     description: >-
-      Where should the extension be deployed? You usually want a location close to your Storage bucket.
-      For help selecting a location, refer to the
-      [location selection guide](https://firebase.google.com/docs/functions/locations).
+      Where do you want to deploy the functions created for this extension?
+      You usually want a location close to your Storage bucket. For help selecting a
+      location, refer to the [location selection
+      guide](https://firebase.google.com/docs/functions/locations).
     options:
       - label: Iowa (us-central1)
         value: us-central1