Added journey analytics schema to datalake doc
manikandansubramanian committed Aug 22, 2024
1 parent f741cba commit bf9f3fd
Showing 2 changed files with 83 additions and 1 deletion.
84 changes: 83 additions & 1 deletion docs/datalake/epilot-datalake.mdx
@@ -4,7 +4,7 @@ sidebar_position: 1

# Epilot Datalake

Welcome to the documentation for our Data Lake feature, which serves as the centralized repository for real-time event streams of entity operations and snapshots of workflow executions data. This feature empowers users to access and analyze essential data generated by the 360 portal, including changes to entities such as orders, opportunities, contacts, accounts, products, and more.
Welcome to the documentation for our Data Lake feature, which serves as the centralized repository for real-time event streams of entity operations, snapshots of workflow executions and journey analytics data. This feature empowers users to access and analyze essential data generated by the 360 portal, including changes to entities such as orders, opportunities, contacts, accounts, products, interactions from user journeys and more.

Our Data Lake is seamlessly integrated with Clickhouse for data warehousing, enabling the users to leverage Business Intelligence (BI) tools and create insightful reports. This documentation will guide you through the key components of the Data Lake feature, including data schemas, usage, and credential management.

@@ -72,6 +72,68 @@ Contains information about related entities associated with a workflow execution

To learn more about workflow execution details, refer to the [workflow execution API](/api/workflow-execution#tag/Workflows/operation/getExecution).

### 3. Journey Analytics

This dataset contains the user sessions created for each journey and the interactions that occur within those sessions.

The schema for journey sessions is as follows:

```json
{
"id": "string", // Unique identifier for the session (UUID)
"org_id": "string", // Unique identifier for the organization associated with the session (UUID)
"session_id": "string", //
"journey_id": "string", // Unique identifier for the user journey to which the session belongs (UUID)
"start_time": "datetime", // Timestamp indicating when the session started
"details": "string", // Additional details about the session, in JSON format
"end_time": "datetime" // Timestamp indicating when the session ended
}
```

Fields of interest in this schema include:

**details**: This field may contain basic details about the user's browser session:
- `deviceType`: The type of device used by the user (e.g., mobile, desktop).
- `osType`: The operating system of the device (e.g., iOS, Android, Windows).
- `browserTypeAndVersion`: The browser name and version used by the user (e.g., Chrome 92, Firefox 89).
- `screenResolution`: The screen resolution of the user's device (e.g., 1920x1080).
- `viewportSize`: The size of the viewport in the browser (i.e., the visible area of the web page).
- `colorDepth`: The color depth of the user's display (e.g., 24).
- `languagePreference`: The preferred language set in the user's browser (e.g., en-US, fr-FR).
- `ip`: The IP address of the user's device when the event was recorded.
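
As a rough illustration of working with this field, the sketch below parses `details` strings on the client side and tallies sessions by `deviceType`. It assumes `details` holds a JSON object with the keys listed above; the sample rows and the `device_breakdown` helper are hypothetical and not part of the Data Lake itself.

```python
import json
from collections import Counter


def device_breakdown(rows):
    """Count sessions per deviceType from rows of the sessions table.

    Each row is a dict with a `details` JSON string (assumed shape).
    Rows with missing or malformed details are counted as "unknown".
    """
    counts = Counter()
    for row in rows:
        try:
            details = json.loads(row.get("details") or "{}")
        except json.JSONDecodeError:
            details = {}
        if not isinstance(details, dict):
            details = {}
        counts[details.get("deviceType", "unknown")] += 1
    return dict(counts)


# Hypothetical sample rows, shaped like the session schema above.
sample_rows = [
    {"id": "s1", "details": '{"deviceType": "mobile", "osType": "iOS"}'},
    {"id": "s2", "details": '{"deviceType": "desktop", "osType": "Windows"}'},
    {"id": "s3", "details": '{"deviceType": "mobile", "osType": "Android"}'},
    {"id": "s4", "details": "not json"},
]

print(device_breakdown(sample_rows))
```

Treating unparseable values as "unknown" keeps a report running even if some sessions carry incomplete browser details.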


The schema for journey events is as follows:

```json
{
"id": "string", // Unique identifier for the event (UUID)
"org_id": "string", // Unique identifier for the organization associated with the event (UUID)
"session_id": "string", // Unique identifier for the session in which the event occurred (UUID)
"journey_id": "string", // Unique identifier for the user journey to which the event belongs (UUID)
"type": "string", // Type of event (e.g., "page_navigation", "journey_submit", "journey_exit")
"details": "string", // Additional details about the event, typically in JSON format
"created_at": "datetime" // Timestamp indicating when the event was recorded
}
```

Fields of interest in this schema include:

**type**: This field contains the type of user interaction or event that occurs in the session:
- `journey_load_time`: The time it takes for the journey to load, typically measured from the journey API response time.
- `page_navigation`: The event triggered when a user navigates from one page to another within the journey.
- `journey_reset`: The event triggered when a user resets the journey, potentially clearing previous inputs or starting the journey over.
- `journey_submit`: The event triggered when a user completes and submits the journey, often marking the end of the user interaction.
- `journey_exit`: The event triggered when a user exits the journey, possibly by closing the journey tab in the browser.

**details**: This field contains additional details specific to the event type, such as the source and destination pages for a page navigation event, the step on which a journey reset occurred, the time of journey submission or exit, and the time taken for the journey to load.
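
To show how the event types can be combined into a metric, the sketch below computes the share of sessions that reached a `journey_submit` event. It works on rows already fetched from the events table; the `submission_rate` helper and the sample events are hypothetical and only use the `session_id` and `type` fields from the schema above.

```python
from collections import defaultdict


def submission_rate(events):
    """Return the share of sessions that contain a journey_submit event."""
    types_by_session = defaultdict(set)
    for event in events:
        types_by_session[event["session_id"]].add(event["type"])
    if not types_by_session:
        return 0.0
    submitted = sum(
        1 for types in types_by_session.values() if "journey_submit" in types
    )
    return submitted / len(types_by_session)


# Hypothetical sample events: session "a" submits, session "b" exits.
sample_events = [
    {"session_id": "a", "type": "journey_load_time"},
    {"session_id": "a", "type": "page_navigation"},
    {"session_id": "a", "type": "journey_submit"},
    {"session_id": "b", "type": "page_navigation"},
    {"session_id": "b", "type": "journey_exit"},
]

print(submission_rate(sample_events))
```

Grouping by `session_id` first means a session with several submit events is still counted once.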


- **{org_id}_journey_sessions_final**
This table stores the above attributes related to journey sessions.

- **{org_id}_journey_events_final**
This table stores the above attributes related to journey events.

## Setting up Datalake

@@ -118,6 +180,25 @@ group by

This SQL query retrieves data about opportunities created over time, extracts relevant information from the JSON payload, and aggregates it by year and month, providing insights into opportunities created during different periods.

**Example 2: Reporting Journey Sessions Created Over Time**
Suppose you need to create a report showing journey sessions created over time for a specific journey. You can use SQL to accomplish this task:

``` sql
select
  journey_id,
  start_time,
  end_time,
  details
from
  {org_id}_journey_sessions_final
where
  journey_id = '{your_journey_id}'
  and start_time > '2024-08-01 00:00:00'
```

This SQL query retrieves the sessions of a specific journey that started after August 1, 2024, along with each session's start time, end time, and details.
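
To turn the returned rows into an over-time report, they can be grouped by day on the client side. The following is a minimal sketch under stated assumptions: the `sessions_per_day` helper and the sample rows are hypothetical, and `start_time` is assumed to arrive either as a `datetime` object or as a `YYYY-MM-DD HH:MM:SS` string, depending on the SQL client used.

```python
from collections import Counter
from datetime import datetime


def sessions_per_day(rows):
    """Count sessions per calendar day from their start_time values."""
    counts = Counter()
    for row in rows:
        start = row["start_time"]
        # Assumption: clients may return datetime objects or formatted
        # strings, so both are accepted here.
        if isinstance(start, str):
            start = datetime.strptime(start, "%Y-%m-%d %H:%M:%S")
        counts[start.date().isoformat()] += 1
    # Sort by date so the result reads as a time series.
    return dict(sorted(counts.items()))


# Hypothetical rows, shaped like the result of the query above.
sample_rows = [
    {"start_time": "2024-08-02 09:15:00"},
    {"start_time": "2024-08-02 17:40:12"},
    {"start_time": datetime(2024, 8, 3, 8, 0, 0)},
]

print(sessions_per_day(sample_rows))
```

The same grouping could instead be done in SQL; doing it client-side is just one option when the raw rows are already loaded.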

![Datalake page](/img/datalake/opportunity-time-series.png)

You can use any SQL client to connect to the Clickhouse Data Warehouse (DWH) using the credentials provided. For more detailed information, please refer to [this link](https://clickhouse.com/docs/en/integrations/datagrip).
@@ -157,6 +238,7 @@ Example ODBC Connection String - ```Driver={ClickHouse ODBC Driver (ANSI)};Serv
- **Report Creation**: With the data modeling in place, you can start creating visually appealing reports within Power BI. Utilize various chart types, tables, and visuals to convey insights effectively.

![Power BI Report](/img/datalake/powerbi-report.png)
![Power BI Report](/img/datalake/powerbi-journey-analytics-report.png)


## Interactive PowerBI report using Epilot Datalake