diff --git a/auth-mailchimp-sync/README.md b/auth-mailchimp-sync/README.md index 88406b86d..941dd5892 100644 --- a/auth-mailchimp-sync/README.md +++ b/auth-mailchimp-sync/README.md @@ -35,7 +35,7 @@ Usage of this extension also requires you to have a Mailchimp account. You are r **Configuration Parameters:** -* Deployment location: Where should the extension be deployed? For help selecting a location, refer to the [location selection guide](https://firebase.google.com/docs/functions/locations). +* Cloud Functions location: Where do you want to deploy the functions created for this extension? * Mailchimp API key: What is your Mailchimp API key? To obtain a Mailchimp API key, go to your [Mailchimp account](https://admin.mailchimp.com/account/api/). diff --git a/auth-mailchimp-sync/extension.yaml b/auth-mailchimp-sync/extension.yaml index e019d9f25..31455b57b 100644 --- a/auth-mailchimp-sync/extension.yaml +++ b/auth-mailchimp-sync/extension.yaml @@ -67,11 +67,9 @@ resources: params: - param: LOCATION type: select - label: Deployment location + label: Cloud Functions location description: >- - Where should the extension be deployed? For help selecting a location, - refer to the [location selection - guide](https://firebase.google.com/docs/functions/locations). + Where do you want to deploy the functions created for this extension? options: - label: Iowa (us-central1) value: us-central1 diff --git a/delete-user-data/CHANGELOG.md b/delete-user-data/CHANGELOG.md index 9f24a38b4..cb84dbae6 100644 --- a/delete-user-data/CHANGELOG.md +++ b/delete-user-data/CHANGELOG.md @@ -1,3 +1,7 @@ +## Version 0.1.3 + +feature - Support deletion of directories (issue #148). + ## Version 0.1.2 feature - Add a new param for recursively deleting subcollections in Cloud Firestore (issue #14). diff --git a/delete-user-data/extension.yaml b/delete-user-data/extension.yaml index 3cae0e07c..edb8b2951 100644 --- a/delete-user-data/extension.yaml +++ b/delete-user-data/extension.yaml @@ -15,11 +15,11 @@ name: delete-user-data displayName: Delete User Data specVersion: v1beta -version: 0.1.2 +version: 0.1.3 description: - Deletes data keyed on a userId from Cloud Firestore, Realtime - Database, and/or Cloud Storage when a user deletes their account. + Deletes data keyed on a userId from Cloud Firestore, Realtime Database, and/or + Cloud Storage when a user deletes their account. license: Apache-2.0 billingRequired: false @@ -52,9 +52,10 @@ resources: - name: clearData type: firebaseextensions.v1beta.function description: - Listens for user accounts to be deleted from your project's authenticated users, - then removes any associated user data (based on Firebase Authentication's User ID) from - Realtime Database, Cloud Firestore, and/or Cloud Storage. + Listens for user accounts to be deleted from your project's authenticated + users, then removes any associated user data (based on Firebase + Authentication's User ID) from Realtime Database, Cloud Firestore, and/or + Cloud Storage. properties: sourceDirectory: . location: ${LOCATION} @@ -65,11 +66,12 @@ resources: params: - param: LOCATION type: select - label: Deployment location + label: Cloud Functions location description: >- - Where should the extension be deployed? You usually want a location close to your database. - For help selecting a location, refer to the - [location selection guide](https://firebase.google.com/docs/functions/locations). + Where do you want to deploy the functions created for this extension? 
+ You usually want a location close to your database or Storage bucket. + For help selecting a location, refer to the [location selection + guide](https://firebase.google.com/docs/functions/locations). options: - label: Iowa (us-central1) value: us-central1 @@ -95,21 +97,23 @@ params: example: users/{UID},admins/{UID} required: false description: >- - Which paths in your Cloud Firestore instance contain user data? Leave empty if - you don't use Cloud Firestore. + Which paths in your Cloud Firestore instance contain user data? Leave + empty if you don't use Cloud Firestore. - Enter the full paths, separated by commas. You can represent the User ID of the deleted user with `{UID}`. + Enter the full paths, separated by commas. You can represent the User ID + of the deleted user with `{UID}`. - For example, if you have the collections `users` and `admins`, and each collection - has documents with User ID as document IDs, then you can enter `users/{UID},admins/{UID}`. + For example, if you have the collections `users` and `admins`, and each + collection has documents with User ID as document IDs, then you can enter + `users/{UID},admins/{UID}`. - param: FIRESTORE_DELETE_MODE type: select label: Cloud Firestore delete mode description: >- - (Only applicable if you use the `Cloud Firestore paths` parameter.) How do you want - to delete Cloud Firestore documents? To also delete documents in subcollections, - set this parameter to `recursive`. + (Only applicable if you use the `Cloud Firestore paths` parameter.) How do + you want to delete Cloud Firestore documents? To also delete documents in + subcollections, set this parameter to `recursive`. options: - label: Recursive value: recursive @@ -124,10 +128,11 @@ params: example: users/{UID},admins/{UID} required: false description: >- - Which paths in your Realtime Database instance contain user data? Leave empty if you - don't use Realtime Database. + Which paths in your Realtime Database instance contain user data? Leave + empty if you don't use Realtime Database. - Enter the full paths, separated by commas. You can represent the User ID of the deleted user with `{UID}`. + Enter the full paths, separated by commas. You can represent the User ID + of the deleted user with `{UID}`. For example: `users/{UID},admins/{UID}`. @@ -140,12 +145,14 @@ params: Where in Google Cloud Storage do you store user data? Leave empty if you don't use Cloud Storage. - Enter the full paths, separated by commas. You can represent the User ID of the deleted user with `{UID}`. - You can use `{DEFAULT}` to represent your default bucket. - - For example, if you are using your default bucket, - and the bucket has files with the naming scheme `{UID}-pic.png`, - then you can enter `{DEFAULT}/{UID}-pic.png`. - If you also have files in another bucket called `my-awesome-app-logs`, - and that bucket has files with the naming scheme `{UID}-logs.txt`, - then you can enter `{DEFAULT}/{UID}-pic.png,my-awesome-app-logs/{UID}-logs.txt`. + Enter the full paths to files or directories in your Storage buckets, + separated by commas. Use `{UID}` to represent the User ID of the deleted + user, and use `{DEFAULT}` to represent your default Storage bucket. + + Here's a series of examples. To delete all the files in your default + bucket with the file naming scheme `{UID}-pic.png`, enter + `{DEFAULT}/{UID}-pic.png`. To also delete all the files in another bucket + called my-app-logs with the file naming scheme `{UID}-logs.txt`, enter + `{DEFAULT}/{UID}-pic.png,my-app-logs/{UID}-logs.txt`. 
To *also* delete a User + ID-labeled directory and all its files (like `media/{UID}`), enter + `{DEFAULT}/{UID}-pic.png,my-app-logs/{UID}-logs.txt,{DEFAULT}/media/{UID}`. diff --git a/delete-user-data/functions/lib/index.js b/delete-user-data/functions/lib/index.js index 5c20b7d6b..c24f1827e 100644 --- a/delete-user-data/functions/lib/index.js +++ b/delete-user-data/functions/lib/index.js @@ -91,19 +91,20 @@ const clearStorageData = (storagePaths, uid) => __awaiter(void 0, void 0, void 0 const bucket = bucketName === "{DEFAULT}" ? admin.storage().bucket() : admin.storage().bucket(bucketName); - const file = bucket.file(parts.slice(1).join("/")); - const bucketFilePath = `${bucket.name}/${file.name}`; + const prefix = parts.slice(1).join("/"); try { - logs.storagePathDeleting(bucketFilePath); - yield file.delete(); - logs.storagePathDeleted(bucketFilePath); + logs.storagePathDeleting(prefix); + yield bucket.deleteFiles({ + prefix, + }); + logs.storagePathDeleted(prefix); } catch (err) { if (err.code === 404) { - logs.storagePath404(bucketFilePath); + logs.storagePath404(prefix); } else { - logs.storagePathError(bucketFilePath, err); + logs.storagePathError(prefix, err); } } })); diff --git a/delete-user-data/functions/src/index.ts b/delete-user-data/functions/src/index.ts index 1d30786b6..e6c96c3e3 100644 --- a/delete-user-data/functions/src/index.ts +++ b/delete-user-data/functions/src/index.ts @@ -92,17 +92,18 @@ const clearStorageData = async (storagePaths: string, uid: string) => { bucketName === "{DEFAULT}" ? admin.storage().bucket() : admin.storage().bucket(bucketName); - const file = bucket.file(parts.slice(1).join("/")); - const bucketFilePath = `${bucket.name}/${file.name}`; + const prefix = parts.slice(1).join("/"); try { - logs.storagePathDeleting(bucketFilePath); - await file.delete(); - logs.storagePathDeleted(bucketFilePath); + logs.storagePathDeleting(prefix); + await bucket.deleteFiles({ + prefix, + }); + logs.storagePathDeleted(prefix); } catch (err) { if (err.code === 404) { - logs.storagePath404(bucketFilePath); + logs.storagePath404(prefix); } else { - logs.storagePathError(bucketFilePath, err); + logs.storagePathError(prefix, err); } } }); diff --git a/firestore-bigquery-export/POSTINSTALL.md b/firestore-bigquery-export/POSTINSTALL.md index b76d78c0c..fc3cebb4b 100644 --- a/firestore-bigquery-export/POSTINSTALL.md +++ b/firestore-bigquery-export/POSTINSTALL.md @@ -13,13 +13,15 @@ You can test out this extension right away: 1. Query your **raw changelog table**, which should contain a single log of creating the `bigquery-mirror-test` document. ``` - SELECT * FROM `${param:PROJECT_ID}.${param:DATASET_ID}.${param:TABLE_ID}_raw_changelog` + SELECT * + FROM `${param:PROJECT_ID}.${param:DATASET_ID}.${param:TABLE_ID}_raw_changelog` ``` 1. Query your **latest view**, which should return the latest change event for the only document present -- `bigquery-mirror-test`. ``` - SELECT * FROM `${param:PROJECT_ID}.${param:DATASET_ID}.${param:TABLE_ID}_raw_latest` + SELECT * + FROM `${param:PROJECT_ID}.${param:DATASET_ID}.${param:TABLE_ID}_raw_latest` ``` 1. Delete the `bigquery-mirror-test` document from [Cloud Firestore](https://console.firebase.google.com/project/${param:PROJECT_ID}/database/firestore/data). @@ -28,9 +30,10 @@ The `bigquery-mirror-test` document will disappear from the **latest view** and 1. 
You can check the changelogs of a single document with this query: ``` - SELECT * FROM `${param:PROJECT_ID}.${param:DATASET_ID}.${param:TABLE_ID}_raw_changelog` - WHERE document_name = "bigquery-mirror-test" - ORDER BY TIMESTAMP ASC + SELECT * + FROM `${param:PROJECT_ID}.${param:DATASET_ID}.${param:TABLE_ID}_raw_changelog` + WHERE document_name = "bigquery-mirror-test" + ORDER BY TIMESTAMP ASC ``` ### Using the extension @@ -48,13 +51,17 @@ Note that this extension only listens for _document_ changes in the collection, This extension only sends the content of documents that have been changed -- it does not export your full dataset of existing documents into BigQuery. So, to backfill your BigQuery dataset with all the documents in your collection, you can run the import script provided by this extension. -The import script can read all existing documents in a Cloud Firestore collection and insert them into the raw changelog table created by this extension. The script adds a special changelog for each document with the operation of `IMPORT` and the timestamp of epoch. This is to ensure that any operation on an imported document supersedes the `IMPORT` +The import script can read all existing documents in a Cloud Firestore collection and insert them into the raw changelog table created by this extension. The script adds a special changelog for each document with the operation of `IMPORT` and the timestamp of epoch. This is to ensure that any operation on an imported document supersedes the `IMPORT`. -**Important:** Run the script over the entire collection _after_ installing this extension, otherwise all writes to your database during the import might be lost. +**Important:** Run the import script over the entire collection _after_ installing this extension, otherwise all writes to your database during the import might be lost. -You may pause and resume the script from the last batch at any point. +Learn more about using the import script to [backfill your existing collection](https://github.com/firebase/extensions/blob/master/firestore-bigquery-export/guides/IMPORT_EXISTING_DOCUMENTS.md). -Learn more about using this script to [backfill your existing collection](https://github.com/firebase/extensions/blob/master/firestore-bigquery-export/guides/IMPORT_EXISTING_DOCUMENTS.md). +### _(Optional)_ Generate schema views + +After your data is in BigQuery, you can use the schema-views script (provided by this extension) to create views that make it easier to query relevant data. You only need to provide a JSON schema file that describes your data structure, and the schema-views script will create the views. + +Learn more about using the schema-views script to [generate schema views](https://github.com/firebase/extensions/blob/master/firestore-bigquery-export/guides/GENERATE_SCHEMA_VIEWS.md). ### Monitoring diff --git a/firestore-bigquery-export/PREINSTALL.md b/firestore-bigquery-export/PREINSTALL.md index 6eabf718d..8b94b2f73 100644 --- a/firestore-bigquery-export/PREINSTALL.md +++ b/firestore-bigquery-export/PREINSTALL.md @@ -16,11 +16,15 @@ Before installing this extension, you'll need to: + [Set up Cloud Firestore in your Firebase project.](https://firebase.google.com/docs/firestore/quickstart) + [Link your Firebase project to BigQuery.](https://support.google.com/firebase/answer/6318765) -This extension only sends the content of documents that have been changed -- it does not export your full dataset of existing documents into BigQuery. 
So, to backfill your BigQuery dataset with all the documents in your collection, you can run the import script provided by this extension. +#### Backfill your BigQuery dataset -**Important:** Run the script over the entire collection _after_ installing this extension, otherwise all writes to your database during the import might be lost. +This extension only sends the content of documents that have been changed -- it does not export your full dataset of existing documents into BigQuery. So, to backfill your BigQuery dataset with all the documents in your collection, you can run the [import script](https://github.com/firebase/extensions/blob/master/firestore-bigquery-export/guides/IMPORT_EXISTING_DOCUMENTS.md) provided by this extension. -Learn more about using this script to [backfill your existing collection](https://github.com/firebase/extensions/blob/master/firestore-bigquery-export/guides/IMPORT_EXISTING_DOCUMENTS.md). +**Important:** Run the import script over the entire collection _after_ installing this extension, otherwise all writes to your database during the import might be lost. + +#### Generate schema views + +After your data is in BigQuery, you can run the [schema-views script](https://github.com/firebase/extensions/blob/master/firestore-bigquery-export/guides/GENERATE_SCHEMA_VIEWS.md) (provided by this extension) to create views that make it easier to query relevant data. You only need to provide a JSON schema file that describes your data structure, and the schema-views script will create the views. #### Billing diff --git a/firestore-bigquery-export/README.md b/firestore-bigquery-export/README.md index a149c02f1..48c6b0c39 100644 --- a/firestore-bigquery-export/README.md +++ b/firestore-bigquery-export/README.md @@ -22,11 +22,15 @@ Before installing this extension, you'll need to: + [Set up Cloud Firestore in your Firebase project.](https://firebase.google.com/docs/firestore/quickstart) + [Link your Firebase project to BigQuery.](https://support.google.com/firebase/answer/6318765) -This extension only sends the content of documents that have been changed -- it does not export your full dataset of existing documents into BigQuery. So, to backfill your BigQuery dataset with all the documents in your collection, you can run the import script provided by this extension. +#### Backfill your BigQuery dataset -**Important:** Run the script over the entire collection _after_ installing this extension, otherwise all writes to your database during the import might be lost. +This extension only sends the content of documents that have been changed -- it does not export your full dataset of existing documents into BigQuery. So, to backfill your BigQuery dataset with all the documents in your collection, you can run the [import script](https://github.com/firebase/extensions/blob/master/firestore-bigquery-export/guides/IMPORT_EXISTING_DOCUMENTS.md) provided by this extension. -Learn more about using this script to [backfill your existing collection](https://github.com/firebase/extensions/blob/master/firestore-bigquery-export/guides/IMPORT_EXISTING_DOCUMENTS.md). +**Important:** Run the import script over the entire collection _after_ installing this extension, otherwise all writes to your database during the import might be lost. 
+ +#### Generate schema views + +After your data is in BigQuery, you can run the [schema-views script](https://github.com/firebase/extensions/blob/master/firestore-bigquery-export/guides/GENERATE_SCHEMA_VIEWS.md) (provided by this extension) to create views that make it easier to query relevant data. You only need to provide a JSON schema file that describes your data structure, and the schema-views script will create the views. #### Billing @@ -43,7 +47,7 @@ When you use Firebase Extensions, you're only charged for the underlying resourc **Configuration Parameters:** -* Deployment location: Where should the extension be deployed? You usually want a location close to your database. For help selecting a location, refer to the [location selection guide](https://firebase.google.com/docs/functions/locations). +* Cloud Functions location: Where do you want to deploy the functions created for this extension? You usually want a location close to your database. For help selecting a location, refer to the [location selection guide](https://firebase.google.com/docs/functions/locations). Note that this extension locates your BigQuery dataset in `us-central1`. * Collection path: What is the path of the collection that you would like to export? You may use `{wildcard}` notation to match a subcollection of all documents in a collection (for example: `chatrooms/{chatid}/posts`). diff --git a/firestore-bigquery-export/extension.yaml b/firestore-bigquery-export/extension.yaml index 493574549..b919270fa 100644 --- a/firestore-bigquery-export/extension.yaml +++ b/firestore-bigquery-export/extension.yaml @@ -58,11 +58,13 @@ resources: params: - param: LOCATION type: select - label: Deployment location + label: Cloud Functions location description: >- - Where should the extension be deployed? You usually want a location close to your database. - For help selecting a location, refer to the - [location selection guide](https://firebase.google.com/docs/functions/locations). + Where do you want to deploy the functions created for this extension? + You usually want a location close to your database. For help selecting a + location, refer to the [location selection + guide](https://firebase.google.com/docs/functions/locations). + Note that this extension locates your BigQuery dataset in `us-central1`. options: - label: Iowa (us-central1) value: us-central1 diff --git a/firestore-bigquery-export/guides/EXAMPLE_QUERIES.md b/firestore-bigquery-export/guides/EXAMPLE_QUERIES.md new file mode 100644 index 000000000..210d251d0 --- /dev/null +++ b/firestore-bigquery-export/guides/EXAMPLE_QUERIES.md @@ -0,0 +1,110 @@ +These example queries are for use with the official Firebase Extension +[_Export Collections to BigQuery_](https://github.com/firebase/extensions/tree/master/firestore-bigquery-export) +and its associated [`fs-bq-schema-views` script](https://github.com/firebase/extensions/blob/master/firestore-bigquery-export/guides/GENERATE_SCHEMA_VIEWS.md) (referred to as the "schema-views script"). + +The queries use the following parameter values from your installation of the extension: + +- `${param:PROJECT_ID}`: the project ID for the Firebase project in + which you installed the extension +- `${param:DATASET_ID}`: the ID that you specified for your dataset during + extension installation +- `${param:TABLE_ID}`: the common prefix of BigQuery views to generate + +**Note:** You can, at any time, run the schema-views script against additional schema files +to create different schema views over your raw changelog. 
When you settle on a fixed schema, +you can create a [scheduled query](https://cloud.google.com/bigquery/docs/scheduling-queries) +to transfer the columns reported by the schema view to a persistent backup table. + +Assume that you have a schema view matching the following configuration from a +schema file: + +``` +{ + "fields": [ + { + "name": "name", + "type": "string" + }, + { + "name":"favorite_numbers", + "type": "array" + }, + { + "name": "last_login", + "type": "timestamp" + }, + { + "name": "last_location", + "type": "geopoint" + }, + { + "fields": [ + { + "name": "name", + "type": "string" + } + ], + "name": "friends", + "type": "map" + } + ] +} +``` + +### Example query for a timestamp + +You can generate a listing of users that have logged in to the app as follows: + +``` +SELECT name, last_login +FROM ${param:PROJECT_ID}.${param:DATASET_ID}.${param:TABLE_ID}_schema_${SCHEMA_FILE_NAME}_latest +ORDER BY last_login DESC +``` + +In this query, note the following: + +- `${SCHEMA_FILE_NAME}` is the name of the schema file that you + provided as an argument to run the schema-views script. + +- The `last_login` column contains data that is stored in the `data` + column of the raw changelog. The type conversion and view generation is + performed for you by the + [_Export Collections to BigQuery_](https://github.com/firebase/extensions/tree/master/firestore-bigquery-export) + extension. + +### Example queries for an array + +The example schema configuration (see above) stores each user's favorite number +in a Cloud Firestore array called `favorite_numbers`. Here are some example +queries for that data: + +- If you wanted to determine how many favorite numbers each user + currently has, then you can run the following query: + + ``` + SELECT document_name, MAX(favorite_numbers_index) + FROM ${param:PROJECT_ID}.users.users_schema_user_full_schema_latest + GROUP BY document_name + ``` + +- If you wanted to determine the what the current favorite numbers are + of the app's users (assuming that number is stored in the first position of + the `favorite_numbers` array), you can run the following query: + + ``` + SELECT document_name, favorite_numbers_member + FROM ${param:PROJECT_ID}.users.users_schema_user_full_schema_latest + WHERE favorite_numbers_index = 0 + ``` + +### Example query if you have multiple arrays + +If you had multiple arrays in the schema configuration, you might have to select +all `DISTINCT` documents to eliminate the redundant rows introduced by the +cartesian product of `CROSS JOIN`. + +``` +SELECT DISTINCT document_name, favorite_numbers_member +FROM ${param:PROJECT_ID}.users.users_schema_user_full_schema_latest +WHERE favorite_numbers_index = 0 +``` diff --git a/firestore-bigquery-export/guides/GENERATE_SCHEMA_VIEWS.md b/firestore-bigquery-export/guides/GENERATE_SCHEMA_VIEWS.md new file mode 100644 index 000000000..757d2cdbf --- /dev/null +++ b/firestore-bigquery-export/guides/GENERATE_SCHEMA_VIEWS.md @@ -0,0 +1,379 @@ +The `fs-bq-schema-views` script is for use with the official Firebase Extension +[_Export Collections to BigQuery_](https://github.com/firebase/extensions/tree/master/firestore-bigquery-export). + +## Overview + +The `fs-bq-schema-views` script (referred to as the "schema-views script") +generates richly-typed BigQuery views of your raw changelog. + +The _Export Collections to BigQuery_ extension only mirrors raw data, but it +doesn't apply schemas or types. 
This decoupling makes schema validation less +risky because no data can be lost due to schema mismatch or unknown fields. + +The schema-views script creates a BigQuery view, based on a JSON schema +configuration file, using +[BigQuery's built-in JSON functions](https://cloud.google.com/bigquery/docs/reference/standard-sql/json_functions). +The _Export Collections to BigQuery_ extension also provides some BigQuery +[user-defined functions](https://github.com/firebase/extensions/blob/master/firestore-bigquery-export/scripts/gen-schema-view/src/udf.ts) +that are helpful in converting Firestore document properties to richly-typed +BigQuery cells. + +## Use the script + +The following steps are an example of how to use the schema-views script. In the +sections at the end of this file, you can review detailed information about +configuring a schema file and reviewing the resulting schema views. + +### Step 1: Create a schema file + +The schema-views script runs against "schema files" which specify your schema +configurations in a JSON format. + +In any directory, create a schema file called `test_schema.json` +that contains the following: + +``` +{ + "fields": [ + { + "name": "name", + "type": "string" + }, + { + "name": "age", + "type": "number" + } + ] +} +``` + +Learn [How to configure schema files](#how-to-configure-schema-files) +later in this guide. + +### Step 2: Set up credentials + +The schema-views script uses Application Default Credentials to communicate with +BigQuery. + +One way to set up these credentials is to run the following command using the +[gcloud](https://cloud.google.com/sdk/gcloud/) CLI: + +``` +$ gcloud auth application-default login +``` + +Alternatively, you can +[create and use a service account](https://cloud.google.com/docs/authentication/production#obtaining_and_providing_service_account_credentials_manually). +This service account must be assigned a role that grants the permission of +`bigquery.jobs.create`, like the ["BigQuery Job User" role](https://cloud.google.com/iam/docs/understanding-roles#bigquery-roles). + +### Step 3: Run the script + +The schema-views script uses the following parameter values from your +installation of the extension: + +- `${param:PROJECT_ID}`: the project ID for the Firebase project in + which you installed the extension +- `${param:DATASET_ID}`: the ID that you specified for your dataset during + extension installation +- `${param:TABLE_ID}`: the common prefix of BigQuery views to generate + +Run the schema-views script using +[`npx` (the Node Package Runner)](https://github.com/npm/npx#npx1----execute-npm-package-binaries) +via `npm` (the Node Package Manager). + +1. Make sure that you've installed the required tools to run the + schema-views script: + + - To access the `npm` command tools, you need to install + [Node.js](https://www.nodejs.org/). + - If you use npm v5.1 or earlier, you need to explicitly install `npx`. + Run `npm install --global npx`. + +1. Run the schema-views script via `npx` by running the following command: + + ``` + $ npx @firebaseextensions/fs-bq-schema-views \ + --non-interactive \ + --project=${param:PROJECT_ID} \ + --dataset=${param:DATASET_ID} \ + --table-name-prefix=${param:TABLE_ID} \ + --schema-files=./test_schema.json + ``` + + **Note:** You can run the schema-views script from any directory, but + you need to specify the path to your schema file using the `--schema-files` + flag. 
To run the schema-views script against multiple schema files, specify + each file in a comma-separated list + (for example: `--schema-files=./test_schema.json,./test_schema2.json`). + +### Step 4: View results + +1. In the [BigQuery web UI](https://console.cloud.google.com/bigquery), + navigate to the generated schema changelog view: + `https://console.cloud.google.com/bigquery?project=${param:PROJECT_ID}&p=${param:PROJECT_ID}&d=${param:DATASET_ID}&t=${param:TABLE_ID}_schema_test_schema_changelog&page=table`. + + This view allows you to query document change events by fields specified in + the schema. + +1. In the [Firebase console](https://console.firebase.google.com/), + go to the Cloud Firestore section, + then create a document called `test-schema-document` with two fields: + + - A field of type `string` called "name" + - A field of type `number` called "age" + +1. Back in BigQuery, run the following query in the schema changelog + view (that is, `https://console.cloud.google.com/bigquery?project=${param:PROJECT_ID}&p=${param:PROJECT_ID}&d=${param:DATASET_ID}&t=${param:TABLE_ID}_schema_test_schema_changelog&page=table`): + + ``` + SELECT document_name, name, age + FROM ${param:PROJECT_ID}.${param:DATASET_ID}.${param:TABLE_ID}_schema_test_schema_changelog + WHERE document_name = "test-schema-document" + ``` + +1. Go back to the Cloud Firestore section of the console, then change + the type of the "age" field to be a string. + +1. Back in BigQuery, re-run the following query: + + ``` + SELECT document_name, name, age + FROM ${param:PROJECT_ID}.${param:DATASET_ID}.${param:TABLE_ID}_schema_test_schema_changelog + WHERE document_name = "test-schema-document" + ``` + + You'll see a new change with a `null` age column. If you query documents + that don't match the schema, then the view contains null values for the + corresponding schema fields. + +1. Back in the Cloud Firestore section in the console, delete + `test-schema-document`. + +1. _(Optional)_ As with the raw views, you can also query events on the + view of the documents currently in the collection by using the latest + schema view + (that is, `https://console.cloud.google.com/bigquery?project=${param:PROJECT_ID}&p=${param:PROJECT_ID}&d=${param:DATASET_ID}&t=${param:COLLECTION_PATH}_schema_test_schema_latest&page=table`. + + Back in BigQuery, if you run the following query, you'll receive no + results because the document no longer exists in the Cloud Firestore + collection. + + ``` + SELECT document_name, name, age + FROM ${param:PROJECT_ID}.${param:DATASET_ID}.${param:TABLE_ID}_schema_test_schema_latest + WHERE document_name = "test-schema-document" + ``` + +### Next Steps + +- [Create your own schema files](#how-to-configure-schema-files) +- [Troubleshoot common issues](#common-schema-file-configuration-mistakes) +- [Learn about the columns in a schema view](#columns-in-a-schema-view) +- [Take a look at more SQL examples](https://github.com/firebase/extensions/blob/master/firestore-bigquery-export/guides/EXAMPLE_QUERIES.md) + +## How to configure schema files + +To generate schema views of your raw changelog, you must create at least one +schema JSON file. 
+ +Here's an example of a configuration that a schema file might contain: + +``` +{ + "fields": [ + { + "name": "name", + "type": "string" + }, + { + "name":"favorite_numbers", + "type": "array" + }, + { + "name": "last_login", + "type": "timestamp" + }, + { + "name": "last_location", + "type": "geopoint" + }, + { + "fields": [ + { + "name": "name", + "type": "string" + } + ], + "name": "friends", + "type": "map" + } + ] +} +``` + +The root of the configuration must have a `fields` array that contains objects +which describe the elements in the schema. If one of the objects is of type +`map`, it must specify its own `fields` array describing the members of that +map. + +Each `fields` array must contain _at least one_ of the following types: + +- `string` +- `array` +- `map` +- `boolean` +- `number` +- `timestamp` +- `geopoint` +- `reference` +- `null` + +These types correspond with Cloud Firestore's +[supported data types](https://firebase.google.com/docs/firestore/manage-data/data-types). +Make sure that the types that you specify match the types of the fields in your +Cloud Firestore collection. + +You may create any number of schema files to use with the schema-views script. +The schema-views script generates the following views for _each_ schema file: + +- `${param:PROJECT_ID}.${param:DATASET_ID}.${param:TABLE_ID}_schema_${SCHEMA_FILE_NAME}_changelog` +- `${param:PROJECT_ID}.${param:DATASET_ID}.${param:TABLE_ID}_schema_${SCHEMA_FILE_NAME}_latest` + +Here, `${SCHEMA_FILE_NAME}` is the name of each schema file that you provided as +an argument to run the schema-views script. + +### Common schema file configuration mistakes + +Be aware of the following common mistakes when configuring a schema file: + +
+| Mistake in schema file config | Outcome of mistake |
+|---|---|
+| Omitting a relevant field | The generated view will not contain a column for that field. |
+| Specifying the wrong type for a relevant field | Type conversion (see previous section) will fail, and the resulting column will contain a BigQuery `null` value in lieu of the desired value. |
+| Specifying a schema field that doesn't exist in the underlying raw changelog | Querying the column for that field will return a BigQuery `null` value instead of the desired value. |
+| Writing invalid JSON | The schema-views script cannot generate a view. |
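A quick, purely illustrative way to catch the invalid-JSON mistake before running the schema-views script is to parse the schema file locally. The one-liner below assumes the `test_schema.json` file from Step 1 and relies only on Node.js, which the script already requires; it prints a confirmation when the file parses cleanly and throws an error otherwise:

```
$ node -e "JSON.parse(require('fs').readFileSync('./test_schema.json', 'utf8'))" && echo "test_schema.json is valid JSON"
```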
+The generated views convert Cloud Firestore data types to BigQuery types as follows:
+
+| Cloud Firestore type | BigQuery type |
+|---|---|
+| string | STRING |
+| boolean | BOOLEAN |
+| number | NUMERIC |
+| timestamp | TIMESTAMP |
+| geopoint | GEOGRAPHY |
+| reference | STRING (containing the path to the referenced document) |
+| null | NULL |
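As a rough sketch of how these converted types can be queried directly in BigQuery standard SQL, the query below assumes a schema file that declares `last_login` as a `timestamp` and `last_location` as a `geopoint` (as in the larger example configuration earlier in this guide) and that the schema file is named `user_full_schema.json`; the coordinates and the seven-day window are arbitrary illustration values, so adjust the view name and filters to match your own setup:

```
SELECT document_name,
       last_login,
       ST_DISTANCE(last_location, ST_GEOGPOINT(-122.39, 37.78)) AS distance_in_meters
FROM ${param:PROJECT_ID}.${param:DATASET_ID}.${param:TABLE_ID}_schema_user_full_schema_latest
WHERE last_login > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 7 DAY)
ORDER BY last_login DESC
```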