Commit

Merge pull request #252 from alphagov/yacketysmackety-patch-4
Update looker studio connection details
yacketysmackety authored Oct 10, 2024
2 parents a2c2f4c + 67db369 commit b38e673
Showing 1 changed file with 14 additions and 4 deletions.
18 changes: 14 additions & 4 deletions source/analysis/govuk-ga4/use-ga4/looker-studio/index.html.md.erb
@@ -1,7 +1,7 @@
---
title: Looker Studio best practice
weight: 2
last_reviewed_on: 2024-06-08
last_reviewed_on: 2024-10-10
review_in: 6 months
---

@@ -69,6 +69,16 @@ If you are not GDS staff, we recommend you use a custom query. You can do this b

An example of a SQL query that could be used in step 5 above is:

#### Partitioned data source (the go-to GA4 connection)

```SQL
SELECT *
FROM `ga4-analytics-352613.flattened_dataset.partitioned_flattened_events`
WHERE event_date >= PARSE_DATE('%Y%m%d', @DS_START_DATE) and event_date <= PARSE_DATE('%Y%m%d', @DS_END_DATE)
```

#### Sharded data source (our legacy GA4 connection)

```SQL
SELECT *
FROM `ga4-analytics-352613.flattened_dataset.flattened_daily_ga_data_*`
@@ -78,13 +88,13 @@ WHERE _TABLE_SUFFIX BETWEEN @DS_START_DATE AND @DS_END_DATE

### Best practice

Where possible, use the flattened data source. The flattened tables are much more efficient to query, and should be easier to use as well.
Where possible, use the partitioned flattened data source. The partitioned flattened tables are more efficient to query than the sharded tables, and because the fields have been flattened, they are much easier to use than the nested raw data.

If you are connecting to BigQuery data in Looker Studio, make sure to define your date range.
If you would like to use a dynamic date range with BigQuery data and are using a custom query, please enable the date range parameters in the connection so that Looker Studio only queries the dates required.
To do this you will also need to include the date range parameters in your query - for example, `WHERE _TABLE_SUFFIX BETWEEN @DS_START_DATE AND @DS_END_DATE`.
To do this you will also need to include the date range parameters in your query. For example, with our partitioned flattened GA4 tables (which are partitioned on the `event_date` field), use `WHERE event_date >= PARSE_DATE('%Y%m%d', @DS_START_DATE) and event_date <= PARSE_DATE('%Y%m%d', @DS_END_DATE)`. If you are using a data source made up of sharded tables, the syntax is slightly different, and you would filter your query with something like: `WHERE _TABLE_SUFFIX BETWEEN @DS_START_DATE AND @DS_END_DATE`.

Use the ‘pause updates’ feature whilst creating new charts or carrying out a lot of edits on a report to minimise the number of queries run.

Avoid creating new connections to BigQuery datasets in Looker Studio as Looker Studio can query the entire dataset whilst connecting it to the Looker Studio dashboard, which can be very expensive.
Use shared connections where possible.
Use shared connections where possible.
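
The date range filter for the partitioned tables can be tried out in the BigQuery console before it is pasted into a Looker Studio custom query. The sketch below is one possible way to do that, not part of the guidance itself: it assumes Looker Studio supplies `@DS_START_DATE` and `@DS_END_DATE` as `YYYYMMDD` strings, and it swaps them for two script variables (`ds_start_date` and `ds_end_date`, hypothetical names holding made-up example dates). The per-day event count is also just an illustration.

```SQL
-- Stand-ins for the values Looker Studio would substitute for
-- @DS_START_DATE and @DS_END_DATE (hypothetical example dates).
DECLARE ds_start_date STRING DEFAULT '20241001';
DECLARE ds_end_date STRING DEFAULT '20241007';

-- Count events per day, filtering on the partitioning field (event_date)
-- so that only the requested dates are queried.
SELECT
  event_date,
  COUNT(*) AS events
FROM `ga4-analytics-352613.flattened_dataset.partitioned_flattened_events`
WHERE event_date >= PARSE_DATE('%Y%m%d', ds_start_date)
  AND event_date <= PARSE_DATE('%Y%m%d', ds_end_date)
GROUP BY event_date
ORDER BY event_date;
```

Once the filter behaves as expected, replace the two variables with `@DS_START_DATE` and `@DS_END_DATE` and drop the `DECLARE` lines, since Looker Studio provides those parameters itself when the date range parameters are enabled on the connection.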
