[Camel 20291] Fixed many unit tests that failed due to threading issues; disabled the remaining broken unit tests. #12739

cziesman · 2024-01-10T18:17:39Z

Description

When running mvn install or mvn test, many Camel unit tests fail that are based on ContextTestSupport. These failures are due to threading issues that rarely occur when testing individual modules. CamelTestSupport, which is used for testing classes outside of Camel Core, uses ThreadLocal instances for the CamelContext, the ProducerTemplate, and the ConsumerTemplate. ContextTestSupport was updated to use ThreadLocal as well. In addition, the property camel.surefire.reuseForks in parent/pom.xml now defaults to false. Flaky unit tests were disabled along with unit tests that require Docker. Parallel execution of JMS tests was also disabled.

Target

[ X] I checked that the commit is targeting the correct branch (note that Camel 3 uses camel-3.x, whereas Camel 4 uses the main branch)

Tracking

[ X] If this is a large change, bug fix, or code improvement, I checked there is a JIRA issue filed for the change (usually before you start working on it).

Apache Camel coding standards and style

[ X] I checked that each commit in the pull request has a meaningful subject line and body.
[ X] I have run mvn clean install -DskipTests locally and I have committed all auto-generated changes

…ted reuseForks property to default to 'false'

…sts are failing and whether or not they are even useful

…emoved parallel execution from JMS tests

github-actions · 2024-01-10T18:17:52Z

🌟 Thank you for your contribution to the Apache Camel project! 🌟

🤖 CI automation will test this PR automatically.

🐫 Apache Camel Committers, please review the following items:

First-time contributors require MANUAL approval for the GitHub Actions to run
You can use the command /component-test (camel-)component-name1 (camel-)component-name2.. to request a test from the test bot.
You can label PRs using build-all, build-dependents, skip-tests and test-dependents to fine-tune the checks executed by this PR.
Build and test logs are available in the Summary page. Only Apache Camel committers have access to the summary.
⚠️ Be careful when sharing logs. Review their contents before sharing them publicly.

cziesman · 2024-01-10T18:35:49Z

Jira tickets will be created for each module with broken unit tests so that they can be addressed individually.

orpiske

Mass-disabling tests is the incorrect approach.

oscerd

Many of the reported ones are not failing for me. Also I do believe disabling them at all doesn't help. In the end we'll forget to fix them and we won't notice regressions, if any. So -1 for me.

cziesman · 2024-01-12T16:38:45Z

I have already created Jira tickets for every module with failing tests, so I don't believe that the failures will be forgotten. I disagree that disabling the tests is the wrong approach. I inquired about broken tests many weeks ago and was told this is a "known" problem, but nobody has been working to fix them. In my opinion, broken unit tests are worse than no tests at all. Also, many of the tests pass when building a single module, but they fail when building all of Camel at once. This requires a lot of investigation to determine the source of the problems, which is why I created Jira tickets for each broken module. Finally, many of the tests are integration tests, not unit tests, and simply should not be run at all by mvn install.

oscerd · 2024-01-12T16:51:51Z

There are multiple reasons to not merge this PR: the main one is disabling reuseForks. This will slow down the build on CI and the test won't run quick. This is a great limitation, as our source of truth is the CI build on Jenkins and GH action (with some flaky tests). You have your point of view at that's fine, but disabling everything and removing reuseFork is not what we expect to do. Running a full build locally with around 25k tests is time consuming and except particular scenario, doesn't make sense for local development. That's the reason why is -1 for me. We can focus on fixing the tests or improving the build, but not at the cost of slowing down tests.

cziesman · 2024-01-12T17:04:56Z

Many tests fail due to threading issues that are related to setting camel.surefire.reuseForks to true. I'm not proposing to disable forks permanently, just until the root causes of the test failures are found and fixed. Yes, the full build with tests runs more slowly and takes about 3.5 hours to complete on my MacBook Pro. But, because the tests are broken, nobody runs the full build with unit tests anyway. Can you recommend an alternative approach? Because I don't see a better one.

orpiske · 2024-01-12T17:11:34Z

Many of the reported ones are not failing for me. Also I do believe disabling them at all doesn't help. In the end we'll forget to fix them and we won't notice regressions, if any. So -1 for me.

-1 from me as well.

orpiske · 2024-01-12T17:25:23Z

Many tests fail due to threading issues that are related to setting camel.surefire.reuseForks to true. I'm not proposing to disable forks permanently, just until the root causes of the test failures are found and fixed. Yes, the full build with tests runs more slowly and takes about 3.5 hours to complete on my MacBook Pro. But, because the tests are broken, nobody runs the full build with unit tests anyway. Can you recommend an alternative approach? Because I don't see a better one.

Our tests are not perfect. We know that we (I, in particular) have been fixing them for quite a while. That said, to say that they are "broken" is a very big stretch.

The tests run on the ASF CI, in multiple platforms (x86, ppc64le, s390x)
The tests run on GitHub Actions
The tests are regularly run manually by me (on arm, x86, ppc64le, s390x) and other core contributors.

And, although we do have occasional failures, the incidence is very small.

So, what can be done to improve the reliability of the tests? Look for flaky tests using the dashboard from The ASF Develocity environment.

When finding a problem on a test, focus on the source of the problem: if there is a threading problem, try to reproduce it, provide traces and dumps for the community so we can fix or assist with the fix/review (it's very likely the fix will require internal knowledge of the core code).

oscerd · 2024-01-12T17:38:24Z

My recommended approach is trusting the CI build.

Craig Ziesman added 3 commits January 3, 2024 14:33

Added ThreadLocal to handle threading issues in bulk unit tests; upda…

eeb8731

…ted reuseForks property to default to 'false'

Disabled ForegroundTimeTaskTest until it can be determined why the te…

b1c2575

…sts are failing and whether or not they are even useful

disabled flaky unit tests; disabled unit tests that require Docker; r…

31f71c1

…emoved parallel execution from JMS tests

github-actions bot added components components-kafka core components-jms core-build-and-dependencies camel-4 labels Jan 10, 2024

Fixed merge conflicts

ac1cc59

cziesman changed the title ~~[Camel 20291] Fixed many unit tests that failed due to tthreading issues; disabled the remaining broken unit tests.~~ [Camel 20291] Fixed many unit tests that failed due to threading issues; disabled the remaining broken unit tests. Jan 10, 2024

orpiske requested changes Jan 12, 2024

View reviewed changes

oscerd requested changes Jan 12, 2024

View reviewed changes

cziesman closed this Jan 15, 2024

cziesman deleted the CAMEL-20291 branch January 15, 2024 17:09

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Camel 20291] Fixed many unit tests that failed due to threading issues; disabled the remaining broken unit tests. #12739

[Camel 20291] Fixed many unit tests that failed due to threading issues; disabled the remaining broken unit tests. #12739

cziesman commented Jan 10, 2024

github-actions bot commented Jan 10, 2024

cziesman commented Jan 10, 2024

orpiske left a comment

oscerd left a comment

cziesman commented Jan 12, 2024

oscerd commented Jan 12, 2024

cziesman commented Jan 12, 2024

orpiske commented Jan 12, 2024

orpiske commented Jan 12, 2024

oscerd commented Jan 12, 2024

[Camel 20291] Fixed many unit tests that failed due to threading issues; disabled the remaining broken unit tests. #12739

[Camel 20291] Fixed many unit tests that failed due to threading issues; disabled the remaining broken unit tests. #12739

Conversation

cziesman commented Jan 10, 2024

Description

Target

Tracking

Apache Camel coding standards and style

github-actions bot commented Jan 10, 2024

cziesman commented Jan 10, 2024

orpiske left a comment

Choose a reason for hiding this comment

oscerd left a comment

Choose a reason for hiding this comment

cziesman commented Jan 12, 2024

oscerd commented Jan 12, 2024

cziesman commented Jan 12, 2024

orpiske commented Jan 12, 2024

orpiske commented Jan 12, 2024

oscerd commented Jan 12, 2024