Body is not thread safe when using a split #6947

Closed
steve-cdl opened this issue Jan 28, 2025 · 10 comments
Labels: bug (Something isn't working)

@steve-cdl

Bug description

Hi,

We are using Camel 4.9.0 and Quarkus 3.17.7.

Our app receives an XML input and within that XML is a set of codes.
We need to run XSLT transforms for each one of these codes, in parallel.

To do this we are using split with xtokenize (although tokenize and XPath also show this issue).

While doing this, we have found that if anything replaces the body during the split, this new body is picked up by the other splits / threads, causing duplication / errors.

This does not appear to occur if body is never changed.

For example using the following input -

<Books>
    <Book code="AA"/>
    <Book code="BB"/>
</Books>

And the following route -

<route streamCache="false">
    <from uri="direct:xtokenize-main-route"/>
    <split parallelProcessing="true">
        <xtokenize>//Books/Book</xtokenize>
        <log loggingLevel="INFO" logName="xml" message="body is: ${body}"/>
    </split>
</route>

Would result in the correct body being logged within the split.
"body is: <Book code="AA"/>"
"body is: <Book code="BB"/>"

However, if you replace the body within the split, for example by running an XSLT transform, the body of the second thread/split becomes the amended body of thread 1.

For example -

<route streamCache="false">
    <from uri="direct:xtokenize-main-route"/>
    <split parallelProcessing="true">
        <xtokenize>//Books/Book</xtokenize>
        <toD uri="xslt-saxon:file://path/to/example/stylesheet.xslt"/>
        <log loggingLevel="INFO" logName="xml" message="body is: ${body}"/>
    </split>
</route>

Thread 1 -
"body is: <Book code="AA"/>"
XSLT is run. Output of XSLT is <hello>world</hello>

Thread 2 -
"body is: <hello>world</hello>"
whereas the body should be the second book (<Book code="BB"/>).

When posting the same test again, the correct outcome occurs -

Thread 1 -
"body is: <Book code="AA"/>"
XSLT is run. Output of XSLT is <hello>World</hello>

Thread 2 -
"body is: <Book code="BB"/>"
XSLT is run. Output of XSLT is <hello>World</hello>

Adding an extra "book" / reason to split will trigger the issue again, for the new 'book'.

So this appears to happen the first time a new reason to split (book for example) is used.
(changing other parts of the input doesn't produce this behaviour, only adding additional 'books' / nodes that need to be split)

When testing this with one input for the split (so 1 book) the body is blank.
When reposting this test, the body is as expected.

We also found that setting streamCache to true produced this error, regardless of whether the body was being changed or not.

This doesn't appear to apply only to XSLT either. Changing the body in any way during the split seems to produce this issue.

So to summarise:

When using split, if the ${body} is ever changed during the split (or streamCache=true), the body provided to the subsequent threads of the split is replaced with the amended body from the first thread.

thread 1 = body is as expected
thread 2 = body is not as expected (either blank or amended body from thread 1)

This only occurs on the first test.
Reposting the same payload gives expected results.

Hopefully this is enough information for somebody to reproduce the bug.
If you need anything else, please let me know.
Thanks

@steve-cdl added the bug label Jan 28, 2025
@jamesnetherton
Contributor

Sounds like a generic Camel issue.

Thus probably better to report it here: https://issues.apache.org/jira/projects/CAMEL.

@davsclaus
Contributor

Yes, this is expected when using parallel processing: two threads can then operate on the same body at the same time. Saxon is not thread safe, so don't do that. Or use onPrepare to make a deep clone of the message body.
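
The deep-clone idea can be sketched in plain Java, outside of Camel. This is only an illustration under stated assumptions: `Body`, `deepCopy` and `processInParallel` are hypothetical names, and nothing here is Camel's `onPrepare` API. Each parallel task is handed its own copy before it starts, so replacing the body in one task cannot be observed by another.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class DeepCloneSplit {

    // Hypothetical mutable body; stands in for any payload that is not thread safe.
    static final class Body {
        private final StringBuilder content;

        Body(String content) { this.content = new StringBuilder(content); }

        Body deepCopy() { return new Body(content.toString()); }

        void replace(String newContent) {
            content.setLength(0);
            content.append(newContent);
        }

        String text() { return content.toString(); }
    }

    // Runs one task per part and returns the body each task saw on entry.
    static List<String> processInParallel(List<Body> parts) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (Body part : parts) {
                Body copy = part.deepCopy();               // copy before handing off to a thread
                futures.add(pool.submit(() -> {
                    String seen = copy.text();             // body as seen at the start of the task
                    copy.replace("<hello>world</hello>");  // mutates the private copy only
                    return seen;
                }));
            }
            List<String> seenBodies = new ArrayList<>();
            for (Future<String> future : futures) {
                seenBodies.add(future.get());              // collected in submission order
            }
            return seenBodies;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<Body> parts = List.of(new Body("<Book code=\"AA\"/>"), new Body("<Book code=\"BB\"/>"));
        System.out.println(processInParallel(parts));
    }
}
```

The original parts are left untouched, because every mutation lands on a per-task copy.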

@steve-cdl
Author

> Sounds like a generic Camel issue.
>
> Thus probably better to report it here: https://issues.apache.org/jira/projects/CAMEL.

Thanks for the response.

We were previously using Spring Boot and did not encounter this issue.
We are in the process of migrating to Quarkus as we have seen performance benefits, until we encountered this issue.

Therefore I do not believe this is a generic Camel issue, as we cannot reproduce it on Spring Boot.

Thanks

@steve-cdl
Author

steve-cdl commented Jan 29, 2025

> Yes, this is expected when using parallel processing: two threads can then operate on the same body at the same time. Saxon is not thread safe, so don't do that. Or use onPrepare to make a deep clone of the message body.

The documentation for Split says -

"Use a Splitter to break out the composite message into a series of individual messages, each containing data related to one item."

So my understanding is that when you split, at the start of each split the body is now the content of the split.
So using my earlier example -

If the input is

<Books>
    <Book code="AA"/>
    <Book code="BB"/>
</Books>

And I am splitting on Book.
At the beginning of every split I would expect the body to be the content of the split.

For example -

Split 1
body is <Book code="AA"/>

Split 2
body is <Book code="BB"/>
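
To make that expectation concrete, here is a small sketch that uses only the JDK's XML APIs, not Camel. `ExpectedSplit` and `bookCodes` are hypothetical names chosen for illustration; the XPath expression is the one from the route above. It lists the `code` attribute of each item the split should yield, in document order.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class ExpectedSplit {

    // Returns the code attribute of every Book element, one entry per expected split.
    static List<String> bookCodes(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList books = (NodeList) XPathFactory.newInstance()
                    .newXPath()
                    .evaluate("//Books/Book", doc, XPathConstants.NODESET);
            List<String> codes = new ArrayList<>();
            for (int i = 0; i < books.getLength(); i++) {
                codes.add(((Element) books.item(i)).getAttribute("code"));
            }
            return codes;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String input = "<Books><Book code=\"AA\"/><Book code=\"BB\"/></Books>";
        System.out.println(bookCodes(input));
    }
}
```

With the two-book input, that gives one entry per split, which is what I would expect each split body to correspond to.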

However, what I am finding is that if anything changes the body within the split, this has an effect on other threads.

For example, changing the body to some text -

<route streamCache="false">
    <from uri="direct:tokenize-body-main-route"/>
    <split parallelProcessing="true">
        <tokenize token="Book" xml="true"/>
        <log loggingLevel="INFO" logName="xml" message="body is: ${body}"/>
        <setBody><simple>I am the body now</simple></setBody>
    </split>
</route>

And posting a request of

<Books>
    <Book code="AA"/>
    <Book code="BB"/>
</Books>

gives the following log lines -

Split 1
body is: <Book code="AA"/>

Split 2

body is: <Books>
    <stuff>test</stuff>
    <Book code="AA"/>
    <Book code="BB"/>
</Books>

Note - the body of split 2 is the entire request payload and not the expected content of the 2nd split (book BB).

Also, reposting the exact same test will result in the expected behaviour

Split 1
body is: <Book code="AA"/>

Split 2
body is: <Book code="BB"/>

However, adding another 'Book' would show the error again.
But reposting that test would show the expected behaviour.

Hopefully this has helped explain it better, if you need more information please let me know.

Thanks

@jamesnetherton
Contributor

> We were previously using Spring Boot and did not encounter this issue.

With the same Camel version (4.9.0)?

@steve-cdl
Author

> > We were previously using Spring Boot and did not encounter this issue.
>
> With the same Camel version (4.9.0)?

Yes Spring Boot with Camel version 4.9.0 does not produce this issue.

Thanks

@steve-cdl
Author

Just in case it helps, to expand on my point about streamCache:

If you set streamCache="true", this will produce the error even if your route does not amend the body at all.

Route -

<route streamCache="true">
    <from uri="direct:tokenize-stream-main-route"/>
    <split parallelProcessing="true">
        <tokenize token="Book" xml="true"/>
        <log loggingLevel="INFO" logName="xml" message="body is: ${body}"/>
    </split>
</route>

input -

<Books>
    <Book code="AA"/>
</Books>

Result -

body is:

Expected result -

body is: <Book code="AA"/>

Repost the same test again

Result -

body is: <Book code="AA"/>

When using Spring Boot, we do not encounter this issue when streamCache is true.

@jamesnetherton
Contributor

Are you able to craft a demo application that reproduces the problem?

@steve-cdl
Author

Hi, I have found the cause of this issue.

While creating the demo application to share, I noticed I was not able to reproduce the issue locally.

Comparing my local application.properties with the one used by our app, I found that the following parameter was the cause -

camel.main.exchange-factory=pooled

Removing this from application.properties has solved the issue.
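
This thread does not establish how the pooled setting led to the symptoms. As a general illustration only, the sketch below shows one way a recycled object can expose a previous user's state. `Holder` and `HolderPool` are hypothetical classes, not Camel's exchange factory, and the `resetOnRelease` flag is an assumption made for the demo.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class PooledReuseDemo {

    // Hypothetical reusable message holder (not Camel's Exchange).
    static final class Holder {
        String body;
    }

    // Hypothetical pool; resetOnRelease decides whether state is cleared on return.
    static final class HolderPool {
        private final Deque<Holder> free = new ArrayDeque<>();
        private final boolean resetOnRelease;

        HolderPool(boolean resetOnRelease) { this.resetOnRelease = resetOnRelease; }

        Holder acquire() {
            Holder recycled = free.poll();   // null when the pool is empty
            return recycled != null ? recycled : new Holder();
        }

        void release(Holder holder) {
            if (resetOnRelease) {
                holder.body = null;          // clear state before the instance is reused
            }
            free.push(holder);
        }
    }

    // Returns the body a second user sees right after acquiring a recycled holder.
    static String bodySeenBySecondUser(boolean resetOnRelease) {
        HolderPool pool = new HolderPool(resetOnRelease);
        Holder first = pool.acquire();
        first.body = "<hello>world</hello>";
        pool.release(first);
        Holder second = pool.acquire();      // same instance that the first user had
        return second.body;
    }

    public static void main(String[] args) {
        System.out.println(bodySeenBySecondUser(false));
        System.out.println(bodySeenBySecondUser(true));
    }
}
```

Without a reset, the second user starts with the first user's body; with a reset, it starts with none.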

Appreciate all of your help with this.
Happy for this to be closed / deemed not a bug.

Thanks

@jamesnetherton
Contributor

Thanks for confirming 👍

@jamesnetherton closed this as not planned Feb 3, 2025
@github-actions bot added this to the No fix/wont't fix milestone Feb 3, 2025