Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Kafka Connect support #105

Closed
see-quick opened this issue Nov 5, 2024 · 23 comments · Fixed by #117
Closed

Add Kafka Connect support #105

see-quick opened this issue Nov 5, 2024 · 23 comments · Fixed by #117
Labels
enhancement New feature or request needs-proposal
Milestone

Comments

@see-quick
Copy link
Member

see-quick commented Nov 5, 2024

At present, our test containers do not incorporate any client applications, such as producers or consumers. However, there are scenarios where it would be advantageous to instantiate clients within Docker on the localhost. This approach would enable the seamless use of Docker network aliases and obviate the necessity to expose any ports.

I propose that we can add [1] clients. At the start, we can consider just producer and consumer and then we can extend it more to streams etc. The proposed naming conventions are ProducerContainer and ConsumerContainer, which would allow users to specify multiple configurations through environment variables or builder parameters.

Alternative:

  1. I think a better way would be to just use our images (test-container-images) and call producer/consumer scripts as we do for Kafka/ZK.

[1] - https://github.com/strimzi/test-clients

@see-quick see-quick added enhancement New feature or request needs-triage labels Nov 5, 2024
@see-quick see-quick added this to the 0.109.0 milestone Nov 5, 2024
@im-konge
Copy link
Member

Triaged on 14.11.2024: This makes sense and should be implemented. But we should not use the test-clients, as it can create a cyclic dependency. So we should go with the scripts for producer/consumer.

@im-konge im-konge added help wanted Extra attention is needed and removed needs-triage labels Nov 14, 2024
@scholzj
Copy link
Member

scholzj commented Nov 14, 2024

This approach would enable the seamless use of Docker network aliases and obviate the necessity to expose any ports.

Why do you need a test container when you keep it running for itself?

@see-quick
Copy link
Member Author

see-quick commented Nov 15, 2024

This approach would enable the seamless use of Docker network aliases and obviate the necessity to expose any ports.

Using test containers for clients within the Docker network allows these clients to communicate easily with other services (Kafka, Keycloak etc.) using Docker aliases. Even if the test container seems to be "running by itself", it's part of the Docker network where it interacts with other containers So, we very much simplify overall networking.

@Frawless
Copy link
Member

This approach would enable the seamless use of Docker network aliases and obviate the necessity to expose any ports.

Using test containers for clients within the Docker network allows these clients to communicate easily with other services (Kafka, Keycloak etc.) using Docker aliases. Even if the test container seems to be "running by itself", it's part of the Docker network where it interacts with other containers So, we very much simplify overall networking.

I am not sure if I completely understand but as part of test-client ITs, I didn't have any problems with running clients code outside of testcontainer and connect it to StrimziTestcontainer Kafka cluster.

@see-quick
Copy link
Member Author

This approach would enable the seamless use of Docker network aliases and obviate the necessity to expose any ports.

Using test containers for clients within the Docker network allows these clients to communicate easily with other services (Kafka, Keycloak etc.) using Docker aliases. Even if the test container seems to be "running by itself", it's part of the Docker network where it interacts with other containers So, we very much simplify overall networking.

I am not sure if I completely understand but as part of test-client ITs, I didn't have any problems with running clients code outside of testcontainer and connect it to StrimziTestcontainer Kafka cluster.

You’re correct that running client code outside of Docker can work without issues in many scenarios. However, when it comes to more complex setups involving OAuth authentication, running clients inside Docker containers becomes easier to handle.

OAuth often requires that the client and server share the same OAuth issuer URI. This means the client needs to access the OAuth server (e.g., Keycloak) using the same network address that the Kafka broker uses. So for instance when you are running clients outside of Docker, you might need to manipulate your /etc/hosts file to ensure that the hostnames of the authorization server are resolved correctly.

@scholzj
Copy link
Member

scholzj commented Nov 15, 2024

This approach would enable the seamless use of Docker network aliases and obviate the necessity to expose any ports.

Using test containers for clients within the Docker network allows these clients to communicate easily with other services (Kafka, Keycloak etc.) using Docker aliases. Even if the test container seems to be "running by itself", it's part of the Docker network where it interacts with other containers So, we very much simplify overall networking.

I think that beats the purpose of the test container. The purpose is to run Kafka cluster that is used by your test. Not to run some complete Kafka infrastructure encapsulated into itself.

@scholzj
Copy link
Member

scholzj commented Nov 15, 2024

OAuth often requires that the client and server share the same OAuth issuer URI. This means the client needs to access the OAuth server (e.g., Keycloak) using the same network address that the Kafka broker uses. So for instance when you are running clients outside of Docker, you might need to manipulate your /etc/hosts file to ensure that the hostnames of the authorization server are resolved correctly.

The value of the OAuth support is that I can connect to it with a client from inside the unit/integration test. It is not a framewokr for infrastructure as a code. If you want to test that something works for you within a closed up Docker network, use Docker compose for example, but there is no need to integrate it into Java unit/integration tests.

@Frawless
Copy link
Member

You’re correct that running client code outside of Docker can work without issues in many scenarios. However, when it comes to more complex setups involving OAuth authentication, running clients inside Docker containers becomes easier to handle.

I thought the clients will be just for testing purposes of our test container (like another readiness/health check for the cluster in ITs), but it seems I misunderstood it right? The real reason is to provide client implementation to users that they could use?

@see-quick
Copy link
Member Author

I understand your perspective. If you don't see this (mentioned all above) as a benefit I am okay with closing this... :)

@Frawless
Copy link
Member

I will follow up on #113 (comment) here is this is the primary source of info for the issue. cc @im-konge @mimaison

I think the only suitable and useful solution would be to also implement testcontainer interface as part of test-clients instead reimplementing the clients in strimzi/test-container. As we already have images, we can use them in strimzi/test-container but it will open a window for chicken-egg problem with supported Kafka versions. That's the reason why it should be part of test-clients.

On the other hand, do we really need it? How many known users asked for it? As Lukas mentioned, is super-easy to setup GenericTestContainer (see https://github.com/strimzi/test-clients/blob/main/clients/src/test/java/io/strimzi/integration/KafkaClientWithRegistryIT.java#L48) so I think with clients images it could be something like:

        producer = new GenericContainer<>("quay.io/strimzi-test-clients/test-clients:latest-kafka-3.9.0")
            .withNetwork(Network.SHARED)
            .withExposedPorts(METRICS_EXPORTER_PORT)
            // Mount exporter here
            .withEnvs(Map.of("CLIENT_TYPE", "KafkaProducer",
            "BOOTSTRAP_SERVERS","kafka:9092",
            "TOPIC","my-topic",
            "ADDITIONAL_CONFIG","some-exporter-config=my-value\nsome-another-config=my-value-2"
            ))
            .waitingFor(Wait.forHttp("/metrics-exporter")
                .forStatusCode(200));
        producer.start();

and then just do whatever you want.

Do we really need wrapper around this? I don't think so. And if so, then just provide a class as part of test-clients for next release, nothing else. If we are going to release some maven artifacts as part of test-clients releases anyway, I think it won't harm, but currently you can do exactly what I share within the code snippet.

@mimaison
Copy link
Contributor

So I looked at strimzi/test-clients and started using it as @Frawless suggested. I found a small issue that prevents me from adding the reporter to the classpath. I opened strimzi/test-clients#116 to let users customize the classpath.

The last remaining issues are the Admin client and Connect:

  • For the Admin client it's tricky as in strimzi/test-clients it's only used to run a single command so we don't really have the time to hit the metrics endpoint before it's closed.
  • What do you recommend for Connect? Would it make sense to add a way to run it in strimzi/test-clients?

Apart from these 2 issues strimzi/test-clients works fine for my use cases, so we may be able to just close this issue.

@im-konge
Copy link
Member

I opened strimzi/test-clients#116 to let users customize the classpath.

Thanks, I will have a look at it :)

For the Admin client it's tricky as in strimzi/test-clients it's only used to run a single command so we don't really have the time to hit the metrics endpoint before it's closed.

Yeah we implemented the admin-client to be a user-friendly CLI tool that just encapsulates the KafkaAdmin client -> but only in areas that are useful for us in the tests.
So for your use case, what would you need exactly?

What do you recommend for Connect? Would it make sense to add a way to run it in strimzi/test-clients?

What is needed in order to "add a way to run it in strimzi/test-clients"?

@mimaison
Copy link
Contributor

Yeah we implemented the admin-client to be a user-friendly CLI tool that just encapsulates the KafkaAdmin client -> but only in areas that are useful for us in the tests.
So for your use case, what would you need exactly?

I'd need to keep the client running long enough so I can query its metrics endpoint. We could have it rerun a command a number of times for example. That would be quite similar to the producer client where we keep it running by making it send MESSAGE_COUNT records.

What is needed in order to "add a way to run it in strimzi/test-clients"?

I'd need an image that starts Kafka Connect. Then my test could start connectors via the Connect REST API, and finally check the metrics.

@mimaison
Copy link
Contributor

mimaison commented Nov 22, 2024

For context, check strimzi/metrics-reporter#67 to see the tests I implemented with strimzi/test-clients.

@im-konge
Copy link
Member

I'd need to keep the client running long enough so I can query its metrics endpoint. We could have it rerun a command a number of times for example. That would be quite similar to the producer client where we keep it running by making it send MESSAGE_COUNT records.

Okay, I will think about the best way how to cover this in the repo. Maybe we can have some option there to "watch" the Kafka topics or something. That would be maybe useful for some debugging and it can maybe be enough for your use-case. Or for example wait for Kafka topic creation (for your test it will not be created) for some time -> that should be also useful for those using the admin-client.

I'd need an image that starts Kafka Connect. Then my test could start connectors via the Connect REST API, and finally check the metrics.

FMPOV it doesn't suite the purpose of the test-clients repo, which should be for the producer/consumer - both Kafka and HTTP - and admin/streams clients - in some basic way. I understand how you mean it, but I think it would make the repository something that it should not be. Providing Connect image would be outside of the scope.

@Frawless
Copy link
Member

FMPOV it doesn't suite the purpose of the test-clients repo, which should be for the producer/consumer - both Kafka and HTTP - and admin/streams clients - in some basic way. I understand how you mean it, but I think it would make the repository something that it should not be. Providing Connect image would be outside of the scope.

Isn't it something that could/should be provided by strimzi/test-container ?

@see-quick
Copy link
Member Author

FMPOV it doesn't suite the purpose of the test-clients repo, which should be for the producer/consumer - both Kafka and HTTP - and admin/streams clients - in some basic way. I understand how you mean it, but I think it would make the repository something that it should not be. Providing Connect image would be outside of the scope.

Isn't it something that could/should be provided by strimzi/test-container?

Do you mean to encapsulate KafkaConnect/KMM2 scripts as a container?

@Frawless
Copy link
Member

Do you mean to encapsulate KafkaConnect/KMM2 scripts as a container?

yes

@see-quick
Copy link
Member Author

Do you mean to encapsulate KafkaConnect/KMM2 scripts as a container?

yes

Yeah, I was also thinking about adding KafkaConnect and KMM2 not sure what others think about it.

@mimaison
Copy link
Contributor

I opened a PR adding basic Kafka Connect support: #117

@see-quick see-quick modified the milestones: 0.109.0, 0.110.0 Nov 26, 2024
@ppatierno ppatierno changed the title Add Clients support Add Kafka Connect support Nov 28, 2024
@ppatierno
Copy link
Member

Triaged on 28/11/2024: agreed on having a proposal about having KafkaConnect support within test-container. @mimaison is going to work on it.

@mimaison
Copy link
Contributor

I opened a proposal: strimzi/proposals#141

@ppatierno
Copy link
Member

Triaged on 12/12/2024: the proposal was opened by @mimaison so we'll keep this and add the support.

@ppatierno ppatierno removed help wanted Extra attention is needed needs-triage labels Dec 12, 2024
@see-quick see-quick linked a pull request Jan 13, 2025 that will close this issue
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request needs-proposal
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants