A Data Product should already exist in order to attach the new components to it. You probably also want some kind of Storage Area available to read from/write to.
This section includes the basic information that any Component must have:
- Name: Required. The name of the component.
- Fully Qualified Name: Fully qualified name of the component.
- Description: Required. Help others understand what this component is for. What data will it store?
- Domain: Required. Domain of the Data Product this component belongs to. Be sure to choose it correctly as otherwise you won't find your Data Product below.
- Data Product: Required. Data Product this component belongs to. Be sure to choose the right one as it cannot be changed.
- Identifier: Autogenerated from the information above. A unique identifier for the component. It will not be editable after creation and is a string composed of [a-zA-Z] separated by any of [-_].
- Development Group: Automatically selected from the Data Product metadata. Data Product development group.
- Dependencies: A component could depend on other components in the same Data Product. This information will be used to deploy the components in such an order that their dependencies already exist.
- Tags: Tags for the component.
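The deployment ordering described under Dependencies amounts to a topological sort of the components. The following is a minimal sketch of that idea, not the platform's actual implementation; the component names and the `deployment_order` helper are hypothetical and only illustrate how dependencies end up deployed first.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def deployment_order(dependencies: dict[str, set[str]]) -> list[str]:
    """Return component names so that each one appears after its dependencies.

    `dependencies` maps a component to the set of components it depends on.
    """
    # static_order() yields every node only after all of its predecessors.
    return list(TopologicalSorter(dependencies).static_order())


# Hypothetical components of a single Data Product (names are assumptions).
components = {
    "vaccinations-storage-area": set(),
    "vaccinations-dbt-workload": {"vaccinations-storage-area"},
    "vaccinations-output-port": {"vaccinations-dbt-workload"},
}

order = deployment_order(components)
print(order)
```

A dependency cycle makes `static_order()` raise `graphlib.CycleError`, which is why the dependencies of a component should never point back to it.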
Example:
Field name | Example value |
---|---|
Name | Vaccinations DBT Workload |
Description | Creates a DBT project to clean vaccinations table |
Domain | domain:healthcare |
Data Product | system:healthcare.vaccinationsdp.0 |
Identifier | Will look something like this: healthcare.vaccinationsdp.0.vaccinations-dbt-workload |
Development Group | Might look something like this: group:datameshplatform (depends on the Data Product development group) |
Dependencies | |
Tags | |
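To make the relationship between the example values more concrete, here is a small sketch that reproduces the example Identifier from the Data Product and Name fields. The derivation rule (drop the `system:` prefix, lowercase the name, join words with hyphens) is an assumption inferred from the example above, not the documented algorithm, and `component_identifier` is a hypothetical helper.

```python
import re


def component_identifier(data_product: str, name: str) -> str:
    """Build an identifier shaped like the example one (assumed rule)."""
    # Drop the kind prefix, e.g. "system:healthcare.vaccinationsdp.0".
    data_product_id = data_product.split(":", 1)[-1]
    # Lowercase the name and collapse every non-alphanumeric run into "-".
    slug = re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")
    return f"{data_product_id}.{slug}"


identifier = component_identifier(
    "system:healthcare.vaccinationsdp.0", "Vaccinations DBT Workload"
)
print(identifier)  # healthcare.vaccinationsdp.0.vaccinations-dbt-workload
```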
- Project Name: Name of the dbt project. Defaults to "dmb_dbt_transform".
- Storage Area: Underlying Storage Area containing the database and schema information that the dbt project will target to transform data. It must store its database and schema information under `spec.mesh.specific.database` and `spec.mesh.specific.schema` respectively.
- Database: Name of the database in the Warehouse, retrieved from the chosen Storage Area.
- Schema: Name of the schema inside the Database specified above, retrieved from the chosen Storage Area.
Example:
Field name | Example value |
---|---|
Project Name | vaccinations_dbt_transform |
Storage Area | Snowflake Vaccinations Storage Area |
Database | HEALTHCARE |
Schema | VACCINATIONS |
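The Database and Schema values are read from the chosen Storage Area at `spec.mesh.specific.database` and `spec.mesh.specific.schema`. The sketch below shows that lookup on a descriptor already parsed into a dictionary. Only the two paths come from this document; the rest of the descriptor and the `target_from_storage_area` helper are hypothetical.

```python
def target_from_storage_area(descriptor: dict) -> tuple[str, str]:
    """Return (database, schema) from a parsed Storage Area descriptor."""
    try:
        specific = descriptor["spec"]["mesh"]["specific"]
        return specific["database"], specific["schema"]
    except KeyError as missing:
        # Neither value has a fallback, so fail with a readable message.
        raise ValueError(
            f"Storage Area is missing {missing} under spec.mesh.specific"
        ) from None


# Hypothetical descriptor; only the two paths are taken from the text above.
storage_area = {
    "spec": {
        "mesh": {
            "name": "Snowflake Vaccinations Storage Area",
            "specific": {"database": "HEALTHCARE", "schema": "VACCINATIONS"},
        }
    }
}

database, schema = target_from_storage_area(storage_area)
print(database, schema)  # HEALTHCARE VACCINATIONS
```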
After this, the system will show you a summary of the template, and you can either go back and edit it or go ahead and create the Component. With the example values given here, it should look something like this:
After clicking on "Create", the registration of the Component will start. If no errors occur, it will go through three phases (Fetching, Publishing and Registering) and will then give you links to the newly created Repository and to the component in the Catalog.
The new repository will contain a standard skeleton of a dbt project as generated by `dbt init`, with the small exception that all the files except `dbt_project.yml` will be contained in a sub-folder:
```
skeleton
├─ dbt_project.yml
├─ catalog-info.yml
├─ dbt/
│  ├─ analyses/
│  ├─ macros/
│  ├─ models/
│  ├─ seeds/
│  ├─ snapshots/
│  └─ tests/
```
Now you can clone the newly created repository and work on the dbt project inside the dbt/ folder.
Be careful not to delete or modify `catalog-info.yml`, and keep the project structure as given.
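One way to guard against accidentally breaking the layout is a small check run from the repository root before pushing. This is a minimal sketch that assumes the structure shown above; `missing_entries` is a hypothetical helper and not part of the template. Because Git does not track empty directories, a fresh clone may legitimately lack some of the sub-folders unless the template ships placeholder files in them.

```python
from pathlib import Path

# Expected layout, taken from the skeleton shown in this section.
REQUIRED_FILES = ("dbt_project.yml", "catalog-info.yml")
REQUIRED_DBT_DIRS = ("analyses", "macros", "models", "seeds", "snapshots", "tests")


def missing_entries(repo_root: Path) -> list[str]:
    """List the expected files and dbt/ sub-folders absent from repo_root."""
    missing = [name for name in REQUIRED_FILES if not (repo_root / name).is_file()]
    missing += [
        f"dbt/{name}/"
        for name in REQUIRED_DBT_DIRS
        if not (repo_root / "dbt" / name).is_dir()
    ]
    return missing


if __name__ == "__main__":
    problems = missing_entries(Path.cwd())
    if problems:
        print("Missing from the repository:", ", ".join(problems))
    else:
        print("Repository layout matches the skeleton.")
```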