[Bug]: Deadlock between DockerClientFactory and RyukResourceReaper with JUnit 5 parallel tests #9120

Open
pkwarren opened this issue Aug 17, 2024 · 6 comments
pkwarren commented Aug 17, 2024

Module

Core

Testcontainers version

1.20.1

Using the latest Testcontainers version?

Yes

Host OS

MacOS

Host Arch

arm64

Docker version

Client:
 Version:           27.1.1
 API version:       1.46
 Go version:        go1.21.12
 Git commit:        6312585
 Built:             Tue Jul 23 19:54:12 2024
 OS/Arch:           darwin/arm64
 Context:           desktop-linux

Server: Docker Desktop 4.33.0 (160616)
 Engine:
  Version:          27.1.1
  API version:      1.46 (minimum version 1.24)
  Go version:       go1.21.12
  Git commit:       cc13f95
  Built:            Tue Jul 23 19:57:14 2024
  OS/Arch:          linux/arm64
  Experimental:     false
 containerd:
  Version:          1.7.19
  GitCommit:        2bf793ef6dc9a18e00cb12efb64355c2c9d5eb41
 runc:
  Version:          1.1.13
  GitCommit:        v1.1.13-0-g58aa920
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0

What happened?

I'm running tests in parallel with JUnit 5. One test starts a static ComposeContainer with .withLocalCompose(true) and another starts a static KafkaContainer. This leads to a deadlock on startup: one thread acquires the lock on RyukResourceReaper and then blocks waiting for the lock in DockerClientFactory, while the other thread holds the DockerClientFactory lock and blocks waiting for the RyukResourceReaper lock.

Relevant log output

Kafka container thread:

"testcontainers-lifecycle-0" #35 [41731] daemon prio=5 os_prio=31 cpu=207.18ms elapsed=23.45s tid=0x000000012225ba00 nid=41731 waiting for monitor entry  [0x00000001735fa000]
   java.lang.Thread.State: BLOCKED (on object monitor)
	at org.testcontainers.utility.RyukResourceReaper.maybeStart(RyukResourceReaper.java:74)
	- waiting to lock <0x000000060201f118> (a org.testcontainers.utility.RyukResourceReaper)
	at org.testcontainers.utility.RyukResourceReaper.init(RyukResourceReaper.java:42)
	at org.testcontainers.DockerClientFactory.client(DockerClientFactory.java:232)
	- locked <0x000000060201ee60> (a [Ljava.lang.Object;)
	at org.testcontainers.DockerClientFactory$1.getDockerClient(DockerClientFactory.java:106)
	at com.github.dockerjava.api.DockerClientDelegate.authConfig(DockerClientDelegate.java:109)
	at org.testcontainers.containers.GenericContainer.start(GenericContainer.java:329)

Compose container thread:

"testcontainers-lifecycle-1" #37 [37891] daemon prio=5 os_prio=31 cpu=3.14ms elapsed=23.41s tid=0x0000000122254600 nid=37891 waiting for monitor entry  [0x0000000173a12000]
   java.lang.Thread.State: BLOCKED (on object monitor)
	at org.testcontainers.DockerClientFactory.client(DockerClientFactory.java:185)
	- waiting to lock <0x000000060201ee60> (a [Ljava.lang.Object;)
	at org.testcontainers.DockerClientFactory$1.getDockerClient(DockerClientFactory.java:106)
	at com.github.dockerjava.api.DockerClientDelegate.authConfig(DockerClientDelegate.java:109)
	at org.testcontainers.containers.GenericContainer.start(GenericContainer.java:329)
	at org.testcontainers.utility.RyukResourceReaper.maybeStart(RyukResourceReaper.java:78)
	- locked <0x000000060201f118> (a org.testcontainers.utility.RyukResourceReaper)
	at org.testcontainers.utility.RyukResourceReaper.registerLabelsFilterForCleanup(RyukResourceReaper.java:51)
	at org.testcontainers.containers.ComposeDelegate.registerContainersForShutdown(ComposeDelegate.java:247)
	at org.testcontainers.containers.ComposeContainer.start(ComposeContainer.java:125)
	- locked <0x000000060201f2a8> (a java.lang.Object)
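
The two traces above show a lock-order inversion: one thread holds the DockerClientFactory monitor and waits for the RyukResourceReaper monitor, while the other holds them in the opposite order. The following is a minimal, self-contained sketch of that pattern, not Testcontainers code. The class name LockOrderSketch and the fields clientLock and reaperLock are hypothetical stand-ins, and ReentrantLock.tryLock with a timeout replaces the synchronized blocks so the demo reports the failure instead of hanging.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

public class LockOrderSketch {

    // Hypothetical stand-ins for the two monitors seen in the thread dumps.
    static final ReentrantLock clientLock = new ReentrantLock();
    static final ReentrantLock reaperLock = new ReentrantLock();

    /** Runs two threads and returns how many of them obtained both locks. */
    static int run(boolean consistentOrder) throws InterruptedException {
        // With inverted ordering, the latches force both threads to hold their
        // first lock before either tries its second. With consistent ordering
        // the latch count is 0, so the latches are no-ops.
        int parties = consistentOrder ? 0 : 2;
        CountDownLatch holdingFirst = new CountDownLatch(parties);
        CountDownLatch attempted = new CountDownLatch(parties);
        AtomicInteger successes = new AtomicInteger();

        Thread a = worker(clientLock, reaperLock, holdingFirst, attempted, successes);
        Thread b = consistentOrder
            ? worker(clientLock, reaperLock, holdingFirst, attempted, successes)
            : worker(reaperLock, clientLock, holdingFirst, attempted, successes);
        a.start();
        b.start();
        a.join();
        b.join();
        return successes.get();
    }

    static Thread worker(ReentrantLock first, ReentrantLock second,
                         CountDownLatch holdingFirst, CountDownLatch attempted,
                         AtomicInteger successes) {
        return new Thread(() -> {
            first.lock();
            try {
                holdingFirst.countDown();
                holdingFirst.await();
                // A plain lock() here would block forever in the inverted case.
                if (second.tryLock(200, TimeUnit.MILLISECONDS)) {
                    successes.incrementAndGet();
                    second.unlock();
                }
                // Keep holding the first lock until the peer has also finished trying.
                attempted.countDown();
                attempted.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                first.unlock();
            }
        });
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("inverted order:   " + run(false) + " of 2 threads got both locks");
        System.out.println("consistent order: " + run(true) + " of 2 threads got both locks");
    }
}
```

In this sketch, neither thread gets its second lock when the acquisition order is inverted, and both succeed once they take the locks in the same order. Whether a consistent ordering is the right fix for Testcontainers itself is not something this thread establishes.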

Additional Information

No response

@eddumelendez
Member

Hi @pkwarren, can you please provide a project that reproduces the issue?

@pkwarren
Author

Here's an example repo showing the problem: https://github.com/pkwarren/testcontainers-issue-9120

@eddumelendez
Member

Thanks for sharing, @pkwarren. I made some changes because the docker-compose.yml file was not found and also had to set version, but can not reproduce the issue. Do you mind taking a look?

@pkwarren
Author

> the docker-compose.yml file was not found

It should be here: https://github.com/pkwarren/testcontainers-issue-9120/blob/main/docker-compose.yml

> also had to set version

I don't follow - where did a version need to be specified?

> can not reproduce the issue. Do you mind taking a look?

If you could provide more specifics on what you're doing and any errors you're seeing, I'd be happy to update the example project. For me, just running ./mvnw clean verify hangs. If you run jstack against the PID of the launched Maven Surefire process, you can see the deadlock.
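
As a complement to attaching jstack from outside, a JVM can also check itself for deadlocked threads through the standard java.lang.management API. The sketch below is an illustration only: the class DeadlockProbe, its helper methods, and the thread names demo-1 and demo-2 are hypothetical, and the demo deliberately deadlocks two daemon threads so there is something to report. Unlike jstack, this only sees threads in the JVM that runs it, so it would have to execute inside the forked test process.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;

public class DeadlockProbe {

    /** Returns one line per deadlocked thread in this JVM; empty if none are found. */
    static List<String> findDeadlocks() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        long[] ids = threads.findDeadlockedThreads(); // null when no deadlock is detected
        List<String> report = new ArrayList<>();
        if (ids == null) {
            return report;
        }
        for (ThreadInfo info : threads.getThreadInfo(ids)) {
            if (info == null) {
                continue; // the thread ended between the two calls
            }
            report.add(info.getThreadName() + " waits for " + info.getLockName()
                + " held by " + info.getLockOwnerName());
        }
        return report;
    }

    /** Starts a daemon thread that takes `first`, waits for its peer, then blocks on `second`. */
    static void startDaemon(String name, Object first, Object second, CountDownLatch bothHoldFirst) {
        Thread t = new Thread(() -> {
            synchronized (first) {
                bothHoldFirst.countDown();
                try {
                    bothHoldFirst.await();
                } catch (InterruptedException e) {
                    return;
                }
                synchronized (second) {
                    // Never reached in the demo: the peer holds `second`.
                }
            }
        }, name);
        t.setDaemon(true); // daemon threads do not keep the JVM alive
        t.start();
    }

    public static void main(String[] args) throws InterruptedException {
        Object lockA = new Object();
        Object lockB = new Object();
        CountDownLatch bothHoldFirst = new CountDownLatch(2);
        startDaemon("demo-1", lockA, lockB, bothHoldFirst);
        startDaemon("demo-2", lockB, lockA, bothHoldFirst);

        // Poll for up to about five seconds until both threads are blocked.
        List<String> report = findDeadlocks();
        for (int i = 0; i < 100 && report.isEmpty(); i++) {
            Thread.sleep(50);
            report = findDeadlocks();
        }
        report.forEach(System.out::println);
    }
}
```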

@eddumelendez
Member

> the docker-compose.yml file was not found

I had to change from ComposeContainer("docker-compose.yml") to ComposeContainer(new File("docker-compose.yml")).

> I don't follow - where did a version need to be specified?

I was talking about the version field in the docker-compose.yml file. But after running it again, I don't need it anymore.

I just wanted to make sure we have the same code to reproduce. After that, I just ran ./mvnw clean verify and everything executed successfully. I am also running on a Mac M1 Pro.

@pkwarren
Author

Pushed updates to fix the ComposeContainer constructor usage and switched the container in docker-compose.yml to be Kafka (in case we're running into a race condition and starting up nginx is too fast to repro the problem). Hopefully this will allow you to see the same behavior I'm seeing.

I'm on the latest version of Docker Desktop (v4.33.0), if it matters.
