Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Camel 20291] Fixed many unit tests that failed due to threading issues; disabled the remaining broken unit tests. #12739

Closed
wants to merge 4 commits into from

Conversation

cziesman
Copy link
Contributor

Description

When running mvn install or mvn test, many Camel unit tests fail that are based on ContextTestSupport. These failures are due to threading issues that rarely occur when testing individual modules. CamelTestSupport, which is used for testing classes outside of Camel Core, uses ThreadLocal instances for the CamelContext, the ProducerTemplate, and the ConsumerTemplate. ContextTestSupport was updated to use ThreadLocal as well. In addition, the property camel.surefire.reuseForks in parent/pom.xml now defaults to false. Flaky unit tests were disabled along with unit tests that require Docker. Parallel execution of JMS tests was also disabled.

Target

  • [ X] I checked that the commit is targeting the correct branch (note that Camel 3 uses camel-3.x, whereas Camel 4 uses the main branch)

Tracking

  • [ X] If this is a large change, bug fix, or code improvement, I checked there is a JIRA issue filed for the change (usually before you start working on it).

Apache Camel coding standards and style

  • [ X] I checked that each commit in the pull request has a meaningful subject line and body.

  • [ X] I have run mvn clean install -DskipTests locally and I have committed all auto-generated changes

Craig Ziesman added 3 commits January 3, 2024 14:33
…ted reuseForks property to default to 'false'
…sts are failing and whether or not they are even useful
Copy link
Contributor

🌟 Thank you for your contribution to the Apache Camel project! 🌟

🤖 CI automation will test this PR automatically.

🐫 Apache Camel Committers, please review the following items:

  • First-time contributors require MANUAL approval for the GitHub Actions to run

  • You can use the command /component-test (camel-)component-name1 (camel-)component-name2.. to request a test from the test bot.

  • You can label PRs using build-all, build-dependents, skip-tests and test-dependents to fine-tune the checks executed by this PR.

  • Build and test logs are available in the Summary page. Only Apache Camel committers have access to the summary.

  • ⚠️ Be careful when sharing logs. Review their contents before sharing them publicly.

@cziesman cziesman changed the title [Camel 20291] Fixed many unit tests that failed due to tthreading issues; disabled the remaining broken unit tests. [Camel 20291] Fixed many unit tests that failed due to threading issues; disabled the remaining broken unit tests. Jan 10, 2024
@cziesman
Copy link
Contributor Author

Jira tickets will be created for each module with broken unit tests so that they can be addressed individually.

Copy link
Contributor

@orpiske orpiske left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mass-disabling tests is the incorrect approach.

Copy link
Contributor

@oscerd oscerd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Many of the reported ones are not failing for me. Also I do believe disabling them at all doesn't help. In the end we'll forget to fix them and we won't notice regressions, if any. So -1 for me.

@cziesman
Copy link
Contributor Author

I have already created Jira tickets for every module with failing tests, so I don't believe that the failures will be forgotten. I disagree that disabling the tests is the wrong approach. I inquired about broken tests many weeks ago and was told this is a "known" problem, but nobody has been working to fix them. In my opinion, broken unit tests are worse than no tests at all. Also, many of the tests pass when building a single module, but they fail when building all of Camel at once. This requires a lot of investigation to determine the source of the problems, which is why I created Jira tickets for each broken module. Finally, many of the tests are integration tests, not unit tests, and simply should not be run at all by mvn install.

@oscerd
Copy link
Contributor

oscerd commented Jan 12, 2024

There are multiple reasons to not merge this PR: the main one is disabling reuseForks. This will slow down the build on CI and the test won't run quick. This is a great limitation, as our source of truth is the CI build on Jenkins and GH action (with some flaky tests). You have your point of view at that's fine, but disabling everything and removing reuseFork is not what we expect to do. Running a full build locally with around 25k tests is time consuming and except particular scenario, doesn't make sense for local development. That's the reason why is -1 for me. We can focus on fixing the tests or improving the build, but not at the cost of slowing down tests.

@cziesman
Copy link
Contributor Author

Many tests fail due to threading issues that are related to setting camel.surefire.reuseForks to true. I'm not proposing to disable forks permanently, just until the root causes of the test failures are found and fixed. Yes, the full build with tests runs more slowly and takes about 3.5 hours to complete on my MacBook Pro. But, because the tests are broken, nobody runs the full build with unit tests anyway. Can you recommend an alternative approach? Because I don't see a better one.

@orpiske
Copy link
Contributor

orpiske commented Jan 12, 2024

Many of the reported ones are not failing for me. Also I do believe disabling them at all doesn't help. In the end we'll forget to fix them and we won't notice regressions, if any. So -1 for me.

-1 from me as well.

@orpiske
Copy link
Contributor

orpiske commented Jan 12, 2024

Many tests fail due to threading issues that are related to setting camel.surefire.reuseForks to true. I'm not proposing to disable forks permanently, just until the root causes of the test failures are found and fixed. Yes, the full build with tests runs more slowly and takes about 3.5 hours to complete on my MacBook Pro. But, because the tests are broken, nobody runs the full build with unit tests anyway. Can you recommend an alternative approach? Because I don't see a better one.

Our tests are not perfect. We know that we (I, in particular) have been fixing them for quite a while. That said, to say that they are "broken" is a very big stretch.

  • The tests run on the ASF CI, in multiple platforms (x86, ppc64le, s390x)
  • The tests run on GitHub Actions
  • The tests are regularly run manually by me (on arm, x86, ppc64le, s390x) and other core contributors.

And, although we do have occasional failures, the incidence is very small.

So, what can be done to improve the reliability of the tests? Look for flaky tests using the dashboard from The ASF Develocity environment.

When finding a problem on a test, focus on the source of the problem: if there is a threading problem, try to reproduce it, provide traces and dumps for the community so we can fix or assist with the fix/review (it's very likely the fix will require internal knowledge of the core code).

@oscerd
Copy link
Contributor

oscerd commented Jan 12, 2024

My recommended approach is trusting the CI build.

@cziesman cziesman closed this Jan 15, 2024
@cziesman cziesman deleted the CAMEL-20291 branch January 15, 2024 17:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants