Prerequisites

A Data Product must already exist so that the new components can be attached to it. You will probably also want a Storage Area available to read from and write to.

Component basic Information

This section covers the basic information that every Component must have:

  • Name: Required. The name of the component.
  • Fully Qualified Name: Fully qualified name of the component.
  • Description: Required. Help others understand what this component is for. What data will it store?
  • Domain: Required. Domain of the Data Product this component belongs to. Be sure to choose it correctly as otherwise you won't find your Data Product below.
  • Data Product: Required. Data Product this component belongs to. Be sure to choose the right one as it cannot be changed.
  • Identifier: Autogenerated from the information above. A unique identifier for the component. It will not be editable after creation and is a string composed of [a-zA-Z] separated by any of [-_].
  • Development Group: Automatically selected from the Data Product metadata. Data Product development group.
  • Dependencies: A component can depend on other components in the same Data Product. This information is used to deploy the components in an order that guarantees each component's dependencies already exist.
  • Tags: Tags for the component.
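
The deployment ordering described under Dependencies can be pictured as a topological sort. The sketch below is only an illustration of that idea, not the platform's actual implementation; the component names are hypothetical and the dependency map is an assumed input format.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical components of one Data Product.
# Each key maps to the set of components it depends on.
dependencies = {
    "vaccinations-storage-area": set(),
    "vaccinations-dbt-workload": {"vaccinations-storage-area"},
    "vaccinations-output-port": {"vaccinations-dbt-workload"},
}


def deployment_order(deps):
    """Return component names so that every dependency comes before its dependents."""
    return list(TopologicalSorter(deps).static_order())


order = deployment_order(dependencies)
print(order)
```

A circular dependency would make `static_order()` raise `graphlib.CycleError`, which is a reasonable signal that the declared dependencies cannot be deployed in any order.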

Example:

| Field name | Example value |
|---|---|
| Name | Vaccinations DBT Workload |
| Description | Creates a DBT project to clean vaccinations table |
| Domain | domain:healthcare |
| Data Product | system:healthcare.vaccinationsdp.0 |
| Identifier | Will look something like this: healthcare.vaccinationsdp.0.vaccinations-dbt-workload |
| Development Group | Might look something like this: group:datameshplatform. Depends on the Data Product development group. |
| Dependencies | |
| Tags | |
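
To make the relationship between the example Name, Data Product and Identifier easier to see, here is a minimal sketch of how such an identifier could be derived. The rules used (dropping the `system:` prefix, lowercasing the name, replacing spaces with hyphens) are assumptions inferred from the example values only; the platform's real generation logic may differ, and `derive_identifier` is a hypothetical helper.

```python
import re


def derive_identifier(data_product: str, name: str) -> str:
    """Build an identifier shaped like the example above (assumed rules)."""
    # Assumption: the Data Product reference looks like "system:<id>".
    product_id = data_product.split(":", 1)[-1]
    # Assumption: the name is lowercased and runs of other characters become "-".
    slug = re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")
    return f"{product_id}.{slug}"


identifier = derive_identifier(
    "system:healthcare.vaccinationsdp.0",
    "Vaccinations DBT Workload",
)
print(identifier)
```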

DBT Project Details

  • Project Name: Name of the dbt project. If you do not provide one, the default name "dmb_dbt_transform" is used.
  • Storage Area: Underlying Storage Area that holds the database and schema the dbt project will target when transforming data. It must store its database and schema information under spec.mesh.specific.database and spec.mesh.specific.schema respectively.
  • Database: Name of the database in the Warehouse, retrieved from the chosen Storage Area.
  • Schema: Name of the schema inside the Database specified above, retrieved from the chosen Storage Area.

Example:

| Field name | Example value |
|---|---|
| Project Name | vaccinations_dbt_transform |
| Storage Area | Snowflake Vaccinations Storage Area |
| Database | HEALTHCARE |
| Schema | VACCINATIONS |
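
The Database and Schema values are read from the Storage Area at the paths spec.mesh.specific.database and spec.mesh.specific.schema. The following sketch shows one way to picture that lookup. The descriptor is a hypothetical, trimmed-down dictionary containing only the fields mentioned here, and `get_path` is an illustrative helper, not part of the platform.

```python
from functools import reduce

# Hypothetical Storage Area descriptor, reduced to the fields the text mentions.
storage_area = {
    "spec": {
        "mesh": {
            "specific": {
                "database": "HEALTHCARE",
                "schema": "VACCINATIONS",
            }
        }
    }
}


def get_path(descriptor, dotted_path):
    """Walk a nested dict along a dotted path; raises KeyError if a segment is missing."""
    return reduce(lambda node, key: node[key], dotted_path.split("."), descriptor)


database = get_path(storage_area, "spec.mesh.specific.database")
schema = get_path(storage_area, "spec.mesh.specific.schema")
print(database, schema)
```

A `KeyError` here corresponds to a Storage Area that does not expose the required fields and therefore could not be used as the target of the dbt project.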

After this, the system shows a summary of the template. You can go back and edit it, or go ahead and create the Component. With the example values given here, the summary should show the values from the two example tables above.

After you click "Create", registration of the Component starts. If no errors occur, it goes through three phases (Fetching, Publishing and Registering) and then gives you links to the newly created Repository and to the component in the Catalog.

The new repository contains the standard skeleton of a dbt project as generated by dbt init, with one small exception: every file except dbt_project.yml is placed in a sub-folder:

skeleton
├─ dbt_project.yml
├─ catalog-info.yml
└─ dbt/
   ├─ analyses/
   ├─ macros/
   ├─ models/
   ├─ seeds/
   ├─ snapshots/
   └─ tests/

Now you can clone the newly created repository and work on the dbt project inside the dbt/ folder.

Be careful not to delete or modify catalog-info.yml, and keep the project structure as given.
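
As a quick way to confirm that a working copy still has the structure shown above, a small check such as the following could be run locally. It is only a sketch: `missing_entries` is a hypothetical helper, the expected paths are taken from the tree above, and it assumes the sub-folders are actually present in the clone (Git does not track empty directories, so this depends on how the template populates them).

```python
from pathlib import Path

# Paths taken from the skeleton tree above.
EXPECTED_FILES = ["dbt_project.yml", "catalog-info.yml"]
EXPECTED_DIRS = [
    "dbt/analyses",
    "dbt/macros",
    "dbt/models",
    "dbt/seeds",
    "dbt/snapshots",
    "dbt/tests",
]


def missing_entries(repo_root):
    """Return the expected files and folders that are absent under repo_root."""
    root = Path(repo_root)
    missing = [f for f in EXPECTED_FILES if not (root / f).is_file()]
    missing += [d for d in EXPECTED_DIRS if not (root / d).is_dir()]
    return missing
```

Calling `missing_entries(".")` from the repository root returns an empty list when nothing has been moved or deleted. The check only looks at presence, so it cannot detect edits to the contents of catalog-info.yml.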