Specify format for fullyQualifiedName #16

Open
wants to merge 2 commits into
base: main

@tritemio tritemio commented Jul 8, 2022

Summary

This PR proposes that fullyQualifiedName follow the common convention of a dot-separated path, aligning the field's value with the Open Metadata specifications.

Context

The field fullyQualifiedName occurs in many entities in the Data Product specifications, where it is currently a free-form string "description". This somewhat overlaps with the fields name (the human-readable display name) and description (which should be a long-form description). The current purpose of the field is unclear IMHO.

Proposal

In software engineering, a "fully qualified name" is a string that includes a path-like identifier such as package.module.class_name, as opposed to the simple name, which would be class_name.

In the Open Metadata Specifications, the field fullyQualifiedName follows the above convention across many entities. Here are a few examples:

  • table: serviceName.databaseName.tableName
  • database: ServiceName.DatabaseName
  • pipeline: ServiceName.PipelineName

The Open Metadata specifications don't have the concept of Data Products, Output Ports and other entities related to the Data Mesh paradigm, but nonetheless the purpose of the field is similar.

I propose to align the field fullyQualifiedName in the Data Product specifications with the common convention used by Open Metadata.

Here is an example of the fields under the proposed convention:

  • dataproduct fullyQualifiedName: domain_name.dataproduct_name
  • outputport fullyQualifiedName: domain_name.dataproduct_name.outputport_name
  • workload fullyQualifiedName: domain_name.dataproduct_name.workload_name
  • storage_area fullyQualifiedName: domain_name.dataproduct_name.storage_area_name
  • observability fullyQualifiedName: domain_name.dataproduct_name.observability_name
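As an illustration only, the convention above could be sketched in Python as follows. The helper `build_fqn` and the sample names (`finance`, `invoices`, `daily_export`) are hypothetical and not part of the specification; rejecting dots inside a component is also an assumption, made so that the path stays unambiguous.

```python
def build_fqn(*parts: str) -> str:
    """Join name components into a dot-separated fully qualified name.

    Hypothetical helper: it assumes a component may not be empty and may
    not itself contain a dot, so the resulting path can be split back.
    """
    if not parts:
        raise ValueError("at least one name component is required")
    for part in parts:
        if not part or "." in part:
            raise ValueError(f"invalid name component: {part!r}")
    return ".".join(parts)


# Hypothetical sample values, used only for illustration.
domain = "finance"
data_product = "invoices"

# dataproduct: domain_name.dataproduct_name
dataproduct_fqn = build_fqn(domain, data_product)

# outputport: domain_name.dataproduct_name.outputport_name
outputport_fqn = build_fqn(domain, data_product, "daily_export")
```

With these sample values, `dataproduct_fqn` is `finance.invoices` and `outputport_fqn` is `finance.invoices.daily_export`.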

@agile-lab
Contributor

@SpyQuel what do you think about this proposal?

@SpyQuel
Contributor

SpyQuel commented Oct 14, 2022

I think this proposal is actually excellent.

I agree that "the current purpose of the field is unclear": we are inheriting it from the OpenMetadata specification, but we are not actually using this field at all. Making it more structured and adding real value to it would make it more interesting.

We can improve this proposal by having the CUE file verify the proposed structure. IMHO we should keep this field optional, but we can enforce that when it is present it should be compliant with the proposed structure.
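The rule "optional, but compliant when present" can be sketched as below. This is a Python illustration of the logic only, not the CUE constraint itself. The function `is_valid_fqn`, the allowed character set for a segment, and the segment counts per entity kind are assumptions derived from the examples in the proposal.

```python
import re
from typing import Optional

# Assumed character set for one path segment (not defined in the proposal).
SEGMENT = r"[A-Za-z0-9_-]+"

# Number of dot-separated segments per entity kind, taken from the
# examples in the proposal (domain.dataproduct[.component]).
EXPECTED_SEGMENTS = {
    "dataproduct": 2,
    "outputport": 3,
    "workload": 3,
    "storage_area": 3,
    "observability": 3,
}


def is_valid_fqn(kind: str, fqn: Optional[str]) -> bool:
    """Return True if fqn is absent or matches the dotted structure for kind."""
    if fqn is None:
        # The field stays optional: a missing value is accepted.
        return True
    n = EXPECTED_SEGMENTS[kind]
    # One segment followed by exactly n - 1 ".segment" groups.
    pattern = rf"{SEGMENT}(\.{SEGMENT}){{{n - 1}}}"
    return re.fullmatch(pattern, fqn) is not None
```

Under these assumptions, `is_valid_fqn("dataproduct", None)` and `is_valid_fqn("outputport", "finance.invoices.daily_export")` are accepted, while a free-form value such as `"some description"` is rejected.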

@tritemio
Author

> IMHO we should keep this field optional, but we can enforce that when it is present it should be compliant with the proposed structure.

I agree on this 👍

@agile-lab
Contributor

Can we move forward on this?
