docs: message validation for kafka #2266

Closed
103 changes: 103 additions & 0 deletions pages/docs/tutorials/message-validation-for-kafka.md
@@ -0,0 +1,103 @@
---
title: Message validation for Kafka
description: The goal of this tutorial is to help you validate messages for Kafka.
weight: 150
---

## Introduction

This tutorial will teach you how to validate messages within a Kafka-based system using AsyncAPI specifications.

Validating messages for Kafka involves defining message schemas using AsyncAPI and then using those schemas to validate incoming and outgoing messages in your Kafka-based applications.

By the end of this tutorial, you should understand the importance of message validation, know how to specify message schemas using AsyncAPI, and be able to apply validation so that the messages exchanged between different components in a Kafka-based system are consistent and adhere to predefined rules.


```mermaid
graph TB
A[Producer Service] -->|Publish Message| B[AsyncAPI Validation]
B -->|Valid Message| C[Kafka Broker]
C -->|Consume Message| D[AsyncAPI Validation]
D -->|Valid Message| E[Consumer Service]

subgraph "Message Validation with AsyncAPI"
B[Validate Message<br>against AsyncAPI Schema]
D[Validate Message<br>against AsyncAPI Schema]
end

subgraph "Kafka Messaging System"
C[Kafka Broker<br><br>Topic: userSignUp]
end

style A fill:#f9d,stroke:#333,stroke-width:2px
style E fill:#f9d,stroke:#333,stroke-width:2px
style B fill:#9cf,stroke:#333,stroke-width:2px
style D fill:#9cf,stroke:#333,stroke-width:2px
style C fill:#ccf,stroke:#333,stroke-width:2px

A -->|Publish Invalid Message| B
B -->|Invalid Message| F[Error Handling]
D -->|Invalid Message| G[Error Handling]
```

Member

Let's add a summarizing diagram of the Introduction info.

## Background context

Message validation involves checking that the messages exchanged between systems adhere to a predefined structure or schema. It helps ensure the data being sent or received is accurate, complete, and in the expected format.

Kafka transports messages between producers and consumers, while AsyncAPI provides a contract (schema) to which those messages must adhere. Validation tools can use the AsyncAPI specifications to check whether the messages meet the criteria before they are sent or processed.
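As a rough illustration of checking a message against a schema, the sketch below validates a `userSignUp` payload in plain Python. The schema contents, the field names (`userId`, `email`), and the `validate_message` helper are assumptions made for this example; a real setup would more likely take the payload schema from the AsyncAPI document and rely on a full JSON Schema validator.

```python
import json

# Hypothetical payload schema for the userSignUp topic (illustrative only).
# It mirrors the shape of a JSON Schema object but supports just
# "required" and per-property "type" checks.
USER_SIGNUP_SCHEMA = {
    "type": "object",
    "required": ["userId", "email"],
    "properties": {
        "userId": {"type": "integer"},
        "email": {"type": "string"},
    },
}

# Mapping from schema type names to Python types.
_PY_TYPES = {"string": str, "integer": int, "boolean": bool, "object": dict}


def validate_message(raw: bytes, schema: dict) -> list[str]:
    """Return a list of validation errors; an empty list means valid."""
    try:
        payload = json.loads(raw)
    except ValueError as exc:  # covers malformed JSON and bad encodings
        return [f"not valid JSON: {exc}"]
    if not isinstance(payload, dict):
        return ["payload must be a JSON object"]

    errors = []
    for name in schema.get("required", []):
        if name not in payload:
            errors.append(f"missing required field: {name}")
    for name, rules in schema.get("properties", {}).items():
        if name not in payload:
            continue
        expected = _PY_TYPES[rules["type"]]
        value = payload[name]
        # bool is a subclass of int in Python, so reject it for integers.
        wrong_bool = expected is int and isinstance(value, bool)
        if not isinstance(value, expected) or wrong_bool:
            errors.append(f"field {name} must be of type {rules['type']}")
    return errors
```

Returning a list of errors rather than raising on the first one lets a caller log every problem with a message at once.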

[Kafka](https://kafka.apache.org/documentation/#intro_nutshell) is a distributed event streaming platform for building real-time data pipelines and apps. It's horizontally scalable, fault-tolerant, and fast.

[AsyncAPI](https://www.asyncapi.com/docs) is an open-source initiative that provides a specification for defining and documenting asynchronous APIs. It acts as a contract that ensures all system components agree on the structure of messages for effective communication.


To follow this tutorial, you need:

* Knowledge of Kafka and AsyncAPI.
* An understanding of Event-driven Architectures (EDAs).

There are several ways to implement message validation for Kafka.

## Installation guide




## Broker-level message validation
Member

Heyo @CynthiaPeter, Lukasz clarified to me today in a call that there is no broker-level message validation. The broker just takes what you have and passes it forward; it does not handle validation. :)

Member

Lukasz is saying we can narrow the scope of the tutorial... instead of message validation broadly, we focus on ONE kind of message validation tutorial. For example, Schema Registry validation.

Contributor Author

Ok. Thank you.

Does this mean that Arya will take it on based off the fact that he is closer to the message?

Contributor

Hello!

Message validation for Kafka does not apply at the broker level but at the client-library level. The client libraries are the ones that - via the Ser/DeSer configuration - may ensure the messages you publish or receive conform to a schema.

However, this Ser/DeSer configuration is optional, and as it is tied to client libraries, it may vary from one language to another or one implementation to another.

That said, I think that it is essential to differentiate runtime validation from build-time validation. As of today, I don't see how AsyncAPI would integrate into runtime validation (handled today by the Ser/DeSer configuration). So I wouldn't have put the "AsyncAPI Validation" block on the critical path between producer and broker and between broker and consumer. Moving what Ser/DeSer is expected to do into another block would lead to so much complexity...

I think that AsyncAPI usage for message validations is possible in a number of cases:

  • At runtime but using a "dark consumer" approach - launching a consumer alongside or before the real production consumers to do the actual validation,
  • At build-time but in a way where you dissociate producer validation from consumer validation, as introduced by this schema:
[Screenshot: 2023-11-09 at 14 40 48]

Of course, I don't want to influence things, but it may be worth having a look at what we're doing at Microcks on validation: https://microcks.io/documentation/guides/avro-messaging/#4-run-asyncapi-tests

Let me know if it's helpful. I would be happy to pursue the discussion!

Contributor Author

Hello Laurent,

Thank you for the insights. I'll look into this tomorrow and come back with specific questions.

@Arya-Gupta if you have time, look at this.


<!---
Integrating message validation with Kafka producers
Validating outgoing messages before sending them to Kafka
--->
At the broker level, message validation is about ensuring that the messages sent to Kafka topics comply with a predefined schema. In Kafka, you can use a Schema Registry to manage and validate schemas, so that only valid messages are written to the Kafka topics.
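
One way to picture validating outgoing messages is a thin wrapper that checks each payload in the producing application before handing it to a Kafka client. In the sketch below, `ValidatingProducer`, `InvalidMessageError`, `check_user_signup`, and the injected `send` callable are all hypothetical names for this example; `send` stands in for whatever produce call your Kafka client library provides, and the checks are hand-written rather than derived from an AsyncAPI document or a Schema Registry.

```python
import json
from typing import Callable


class InvalidMessageError(ValueError):
    """Raised when a payload fails validation and must not be sent."""


def check_user_signup(payload: dict) -> list[str]:
    """Hypothetical rules for a userSignUp message; returns error strings."""
    errors = []
    user_id = payload.get("userId")
    if not isinstance(user_id, int) or isinstance(user_id, bool):
        errors.append("userId must be an integer")
    email = payload.get("email")
    if not isinstance(email, str) or "@" not in email:
        errors.append("email must be a string containing '@'")
    return errors


class ValidatingProducer:
    """Validates payloads, then forwards the serialized bytes to `send`."""

    def __init__(self, send: Callable[[str, bytes], None],
                 validate: Callable[[dict], list[str]]):
        self._send = send          # stand-in for a real Kafka produce call
        self._validate = validate  # returns a list of error strings

    def publish(self, topic: str, payload: dict) -> None:
        errors = self._validate(payload)
        if errors:
            # Reject before anything reaches the Kafka client.
            raise InvalidMessageError("; ".join(errors))
        self._send(topic, json.dumps(payload).encode("utf-8"))
```

Injecting `send` keeps the sketch independent of any particular client library and makes the wrapper easy to exercise in tests with a plain list-appending function.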

Define an AsyncAPI schema
Configure Kafka


## Client-level message validation

<!---
Integrating message validation with Kafka consumers
Validating incoming messages before processing
--->

Generate Code
Implement Validation
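
On the consuming side, a comparable sketch validates each incoming record before it reaches business logic and hands rejected records to an error handler, following the flow in the diagram in the Introduction. The `process_records` function and its `validate`, `handle`, and `on_error` parameters are hypothetical; `records` stands in for the raw message values a Kafka consumer would yield.

```python
import json
from typing import Callable, Iterable


def process_records(
    records: Iterable[bytes],
    validate: Callable[[dict], list[str]],
    handle: Callable[[dict], None],
    on_error: Callable[[bytes, list[str]], None],
) -> int:
    """Validate each raw record; return the number handled successfully."""
    handled = 0
    for raw in records:
        try:
            payload = json.loads(raw)
        except ValueError:
            on_error(raw, ["not valid JSON"])
            continue
        if isinstance(payload, dict):
            errors = validate(payload)
        else:
            errors = ["payload must be a JSON object"]
        if errors:
            # Invalid messages never reach the business logic.
            on_error(raw, errors)
            continue
        handle(payload)
        handled += 1
    return handled
```

Passing the raw bytes to `on_error` keeps the original message available for logging or for forwarding to a separate topic, if that is how you choose to handle rejects.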

## Summary

By combining Kafka's robust message streaming capabilities with AsyncAPI's standardized documentation and tooling, developers can effectively validate the messages exchanged in a distributed system, which supports data integrity and system reliability.



## Next steps

Now that you have completed this tutorial, you can automate the validation and generation process in your continuous integration pipelines.

## Additional resources

<!---
Add Joy's tutorial on how to Create an AsyncAPI document for Kafka Messages using AsyncAPI.

Add Arya's Guide on message validation
--->