Add partition key, content-type to EventHub #928

Open
acco opened this issue Jan 29, 2025 — with Linear · 3 comments

Comments

Contributor

acco commented Jan 29, 2025

ContentType

Right now, we send messages to Event Hub in a JSON body like this:

[
  { "Body": "<some base64-encoded body>" },
  { "Body": "<some base64-encoded body>" }
]

However, it looks like you can set a content type for messages. When it is set to JSON, we don't have to base64-encode the body, and messages look nicer in the Azure console/Data Explorer:

[Screenshot: CleanShot 2025-01-28 at 20.33.42@2x.png]

Can we set this on individual messages in the batch request?

If not, can we set it in the non-batch request? If so, we should just do it and set batch_size=1 to start.
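
For reference, the current batch shape can be sketched as below. This is only an illustration of the `Body` wrapping described above; the module and function names are hypothetical and not taken from the codebase.

```elixir
defmodule EventHubBatchSketch do
  # Hypothetical helper (not the project's actual code): wraps each raw payload
  # in the {"Body": <base64>} entry shape used by the current batch request.
  def to_batch_entries(payloads) when is_list(payloads) do
    Enum.map(payloads, fn payload -> %{"Body" => Base.encode64(payload)} end)
  end
end
```

Every consumer then has to base64-decode `Body` before it can read the JSON, which is the step a JSON content type would let us drop.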

Partition key

Setting the partition key is important for ordering. This is how the SQS pipeline derives a group ID:

consumer
|> Consumers.group_column_values(record_or_event_data)
|> Enum.join(",")
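
A self-contained sketch of the same idea, assuming the grouping values are simply the values of a configured list of columns. The `group_column_values/2` here is a hypothetical stand-in for `Consumers.group_column_values/2`, whose real implementation is not shown in this issue.

```elixir
defmodule PartitionKeySketch do
  # Hypothetical stand-in for Consumers.group_column_values/2: picks the values
  # of the grouping columns out of a record, in the order the columns are given.
  def group_column_values(group_columns, record) do
    Enum.map(group_columns, fn column -> Map.fetch!(record, column) end)
  end

  # Joins the grouping values with "," to form one key per group.
  def partition_key(group_columns, record) do
    group_columns
    |> group_column_values(record)
    |> Enum.join(",")
  end
end
```

Two records with the same grouping values produce the same key, which is what the ordering guarantee relies on.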

Looking at the Event Hub Go SDK, I'm pretty sure we can add a partition key to individual messages:

https://github.com/Azure/azure-event-hubs-go/blob/master/event.go#L37-L48

However, it's not clear where the partition key should go in the request payload.

We should intercept requests to the Azure API to figure out what the payload needs to look like to do this. To intercept, we can either:

  • Change the base URL of the Go SDK to POST a batch of messages to a local webserver
  • Use a proxy (Charles) to capture the request to Azure
@acco acco assigned acco and unassigned acco Jan 29, 2025
Contributor Author

acco commented Jan 29, 2025

@yordis While writing the docs for Event Hubs, these two felt relevant to me. If we can get the partition key relatively cheaply (<1h of investigation), which I think we can do by introspecting one of the SDKs, we should do it.

We may need to introspect an SDK in order to capture both of these properties.

Contributor Author

acco commented Jan 29, 2025

Argh, just noticed the Go SDK uses AMQP, not REST - so this may not actually be possible so long as we're using the REST API.

We can send individual events with a partition ID, but that will mean needing to manage partition count on our side etc.
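
A rough sketch of what managing this on our side could look like, assuming partition IDs are the strings "0" through "N-1" and that we know the hub's partition count. The module name is hypothetical, and `:erlang.phash2/2` is our own hash choice, not the hashing Event Hubs applies to partition keys.

```elixir
defmodule PartitionIdSketch do
  # Maps a key to a partition ID string in "0".."partition_count - 1".
  # Assumption: partition_count matches the hub's actual partition count.
  # :erlang.phash2/2 returns an integer in 0..partition_count-1.
  def partition_id(key, partition_count)
      when is_binary(key) and is_integer(partition_count) and partition_count > 0 do
    key
    |> :erlang.phash2(partition_count)
    |> Integer.to_string()
  end
end
```

The same key always lands on the same partition for a given count, but the mapping shifts if the partition count changes, which is the bookkeeping cost mentioned above.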

Alternatively, we can move Event Hubs over to either our Kafka or AMQP adapter, which will give us more bells and whistles.

Moving to backlog, but we'll look at this again by request.

@acco acco added the backlog label Jan 29, 2025
Contributor

yordis commented Jan 30, 2025

@acco could we do the same as with RabbitMQ, then?
