-Support for Sink Codecs #3081

Closed

omkarmmore95
Contributor

Signed-off-by: omkarmmore95 [email protected]

Description

Adds support for a tabular schema structure.

Issues Resolved

Resolves #2403

Check List

  • New functionality includes testing.
  • New functionality has been documented.
    • New functionality has javadoc added
  • Commits are signed with a real name per the DCO

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: omkarmmore95 <[email protected]>
return "{\"type\":\"array\", \"items\":\"string\"}";
}

public static void iterateRecursively(final StringBuilder mainSchemaBuilder, final String recordSchema,
Member

This code appears to be taking a SQL-like schema and then converting it into an Avro schema. Right?

Why not just accept the Avro schema in the pipeline configuration?

This appears to be introducing an undefined language and there may be many different edge cases that we are not accounting for.

If this is using a well defined language, can we perform a model mapping to make sure it is valid? Say for example, it is a Postgresql DDL, can we parse the DDL using a Postgresql library into a Java model? Then we can perform a more accurate mapping.
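To make the model-mapping idea concrete, here is a minimal sketch, not the PR's implementation. It maps a small, explicitly enumerated set of column types to Avro primitives and rejects anything else, so unsupported input fails loudly instead of producing a wrong schema. The class name `TabularToAvroSketch`, the method `toAvroRecord`, and the list of SQL-like type names are all hypothetical. Column names are assumed to be valid Avro names and are not escaped.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.StringJoiner;

public class TabularToAvroSketch {
    // Hypothetical mapping from SQL-like column types to Avro primitive types.
    private static final Map<String, String> PRIMITIVES = Map.of(
            "string", "string",
            "varchar", "string",
            "int", "int",
            "bigint", "long",
            "float", "float",
            "double", "double",
            "boolean", "boolean");

    // Builds an Avro record schema (as JSON text) from ordered column name/type pairs.
    static String toAvroRecord(String recordName, Map<String, String> columns) {
        StringJoiner fields = new StringJoiner(",", "[", "]");
        for (Map.Entry<String, String> column : columns.entrySet()) {
            String avroType = PRIMITIVES.get(column.getValue().toLowerCase(Locale.ROOT));
            if (avroType == null) {
                // Fail on anything outside the defined model instead of guessing.
                throw new IllegalArgumentException("Unsupported column type: " + column.getValue());
            }
            fields.add("{\"name\":\"" + column.getKey() + "\",\"type\":\"" + avroType + "\"}");
        }
        return "{\"type\":\"record\",\"name\":\"" + recordName + "\",\"fields\":" + fields + "}";
    }

    public static void main(String[] args) {
        Map<String, String> columns = new LinkedHashMap<>();
        columns.put("id", "bigint");
        columns.put("name", "varchar");
        System.out.println(toAvroRecord("Event", columns));
    }
}
```

Parsing the input into a typed model first, as suggested above for a PostgreSQL DDL, would replace the plain `Map` here with whatever model the parser produces.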

Contributor Author

Yes @dlvenable, this follows Raj's suggestion to support a Glue-like schema and convert it into an Avro schema. We already accept an Avro schema in the pipeline YAML itself.
As of now, there is no library that maps a Glue-schema-like structure to an Avro/Parquet schema, since Avro and Parquet also have nested and logical types.
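As an illustration of the logical-type point, the sketch below maps a few column types to Avro logical types as defined in the Avro specification (`date` on `int`, `timestamp-millis` on `long`, `decimal` on `bytes`). It is a hypothetical example rather than the code in this PR: the class `LogicalTypeSketch`, the method `toAvroType`, and the accepted input spellings (`date`, `timestamp`, `decimal(p,s)`) are assumptions about what a Glue-like column type might look like.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LogicalTypeSketch {
    // Assumed input spelling for decimals: decimal(precision,scale).
    private static final Pattern DECIMAL = Pattern.compile("decimal\\((\\d+),\\s*(\\d+)\\)");

    // Returns the Avro type (as JSON text) for a column type that needs a logical type.
    static String toAvroType(String columnType) {
        String type = columnType.trim().toLowerCase(Locale.ROOT);
        Matcher decimal = DECIMAL.matcher(type);
        if (decimal.matches()) {
            // Avro's decimal logical type annotates bytes (or fixed) and carries precision and scale.
            return "{\"type\":\"bytes\",\"logicalType\":\"decimal\",\"precision\":"
                    + decimal.group(1) + ",\"scale\":" + decimal.group(2) + "}";
        }
        switch (type) {
            case "date":
                return "{\"type\":\"int\",\"logicalType\":\"date\"}";
            case "timestamp":
                return "{\"type\":\"long\",\"logicalType\":\"timestamp-millis\"}";
            default:
                throw new IllegalArgumentException("Unsupported column type: " + columnType);
        }
    }

    public static void main(String[] args) {
        System.out.println(toAvroType("decimal(10,2)"));
        System.out.println(toAvroType("timestamp"));
    }
}
```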

Member

What is the syntax of this table structure?

Contributor Author

omkarmmore95 Aug 2, 2023

@dlvenable, shared over email, as I can't paste it here.

@dlvenable
Member

I'm closing this as obsolete.

dlvenable closed this Oct 23, 2023
Development

Successfully merging this pull request may close these issues.

Support a generic codec structure for sinks