Initial support for Avro formats #386
Conversation
Just a couple of comments
arroyo-rpc/src/schema_resolver.rs (outdated)

}

/// A schema resolver that return errors when schemas are requests; this is intended
/// to be used when schemas as embedded into the message and we do not expect to
There's a wording issue here.
#[async_trait]
pub trait SchemaResolver: Send {
    async fn resolve_schema(&self, id: u32) -> Result<Option<String>, String>;
Why not use anyhow errors?
Because the errors are meant to be used as the message in a UserError, which takes a string. You could rely on anyhow::Error::to_string, but that includes various formatting logic that I don't want; the SchemaResolver should be responsible for producing a good String error message that can be shown to the user.
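To illustrate the convention being discussed, here is a minimal sketch (not the code from this PR) of a resolver that returns a ready-to-display String error instead of an anyhow error. The trait definition is quoted from the diff above; FailingSchemaResolver is a hypothetical name for the resolver described in the doc comment, intended for formats where the schema is embedded in the message.

```rust
use async_trait::async_trait;

#[async_trait]
pub trait SchemaResolver: Send {
    async fn resolve_schema(&self, id: u32) -> Result<Option<String>, String>;
}

/// A resolver for sources whose messages embed their schema; any attempt to
/// resolve a schema by id is reported as a user-facing error string.
pub struct FailingSchemaResolver;

#[async_trait]
impl SchemaResolver for FailingSchemaResolver {
    async fn resolve_schema(&self, id: u32) -> Result<Option<String>, String> {
        // The String error becomes the UserError message, so it should read
        // well on its own rather than relying on anyhow's Display formatting.
        Err(format!(
            "received a message referencing schema id {}, but this source expects \
             schemas to be embedded in the message",
            id
        ))
    }
}
```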
This PR adds initial support for the Avro format, currently only for deserialization in sources.

Initially supported:
- [T, null] unions (nullable fields)

Follow-up work for more complete Avro support will include:
The current implementation is pretty inefficient. In order to support Avro records with features that we don't yet handle statically, the Avro data is parsed, then converted to JSON, then deserialized into the struct via our normal JSON deserialization pathway. This allows us to reuse the existing RawJson approach for unsupported features.
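As a rough illustration of that pathway, here is a hedged sketch rather than the PR's actual code: avro_to_json and deserialize_via_json are hypothetical helpers, and it assumes the apache-avro, serde, and serde_json crates (with a recent apache-avro where the Union variant carries its branch index).

```rust
use apache_avro::{from_avro_datum, types::Value as AvroValue, Schema};
use serde::de::DeserializeOwned;

/// Convert a decoded Avro value into a serde_json::Value so it can flow
/// through the normal JSON deserialization pathway.
fn avro_to_json(value: AvroValue) -> serde_json::Value {
    match value {
        AvroValue::Null => serde_json::Value::Null,
        AvroValue::Boolean(b) => serde_json::Value::Bool(b),
        AvroValue::Int(i) => serde_json::Value::from(i),
        AvroValue::Long(l) => serde_json::Value::from(l),
        AvroValue::Float(f) => serde_json::Value::from(f),
        AvroValue::Double(d) => serde_json::Value::from(d),
        AvroValue::String(s) => serde_json::Value::String(s),
        // Unwrap unions such as [T, null]; the branch index is not needed here.
        AvroValue::Union(_, inner) => avro_to_json(*inner),
        AvroValue::Array(items) => {
            serde_json::Value::Array(items.into_iter().map(avro_to_json).collect())
        }
        AvroValue::Record(fields) => serde_json::Value::Object(
            fields.into_iter().map(|(k, v)| (k, avro_to_json(v))).collect(),
        ),
        // Remaining variants (bytes, maps, logical types, ...) are elided in this sketch.
        _ => serde_json::Value::Null,
    }
}

/// Parse a raw Avro-encoded datum, convert it to JSON, and deserialize it into T.
fn deserialize_via_json<T: DeserializeOwned>(
    writer_schema: &Schema,
    mut datum: &[u8],
) -> Result<T, String> {
    let avro = from_avro_datum(writer_schema, &mut datum, None)
        .map_err(|e| format!("failed to deserialize Avro message: {}", e))?;
    serde_json::from_value(avro_to_json(avro))
        .map_err(|e| format!("failed to deserialize JSON value: {}", e))
}
```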
This PR also reworks how we interact with the Confluent Schema Registry. The schema registry is now configured as part of the Kafka connection, so that it does not need to be redefined each time you create a table. We also now do the schema resolution as part of connection table creation, rather than having the frontend request the schema and then fill it in.
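For context, a registry-backed resolver wired in at connection-table creation time might look roughly like the following. This is a hedged sketch, not the PR's implementation: ConfluentSchemaResolver and its fields are illustrative names, it implements the SchemaResolver trait quoted above, and it assumes the reqwest crate with its json feature; the /schemas/ids/{id} endpoint is part of the standard Confluent Schema Registry REST API.

```rust
use async_trait::async_trait;
use serde::Deserialize;

// Shape of the registry's /schemas/ids/{id} response; unknown fields are ignored.
#[derive(Deserialize)]
struct SchemaResponse {
    schema: String,
}

/// Resolves schemas by id against a Confluent Schema Registry endpoint
/// configured on the Kafka connection.
pub struct ConfluentSchemaResolver {
    endpoint: String,
    client: reqwest::Client,
}

#[async_trait]
impl SchemaResolver for ConfluentSchemaResolver {
    async fn resolve_schema(&self, id: u32) -> Result<Option<String>, String> {
        let url = format!("{}/schemas/ids/{}", self.endpoint, id);

        let resp = self
            .client
            .get(url.as_str())
            .send()
            .await
            .map_err(|e| format!("could not reach schema registry at {}: {}", url, e))?;

        // An unknown id is reported as Ok(None) rather than an error.
        if resp.status() == reqwest::StatusCode::NOT_FOUND {
            return Ok(None);
        }

        if !resp.status().is_success() {
            return Err(format!(
                "schema registry returned an error for schema id {}: {}",
                id,
                resp.status()
            ));
        }

        let body: SchemaResponse = resp
            .json()
            .await
            .map_err(|e| format!("invalid response from schema registry: {}", e))?;

        Ok(Some(body.schema))
    }
}
```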