Add full support for fetch size in JPA #696

jborgers · 2025-01-14T20:18:10Z

As a Java performance tuner, I like to have full fetch size support in JPA, just like Hibernate API offers with:
Query setFetchSize(int fetchSize)

Obviously, for thousands of rows, fetching all in one round trip in stead of 10 at a time is much faster.

The current way with a hint is lousy and difficult to use. Gavin agrees that is sucks;-)

jborgers · 2025-01-14T21:14:30Z

In my Java performance trainings, I explain the working of jdbc fetch size. And people then ask: why is it so difficult in JPA? And I don't have an answer..

gavinking · 2025-01-14T21:40:58Z

Correct me if I'm wrong, but is it the case that essentially every JDBC driver except for Oracle defaults to retrieving all rows (which is typically the right thing for JPA), but that Oracle has an absurdly low default fetch size of 10?

gavinking · 2025-01-14T21:51:51Z

Oh, from my experiments it looks like SQL Server defaults to 128.

gavinking · 2025-01-15T08:51:59Z

And people then ask: why is it so difficult in JPA?

So I guess my question in return is: why doesn't it work to just set this globally using hibernate.jdbc.fetch_size or whatever the equivalent thing in EclipseLink is? I guess I would just set this once to a very large number and then I would not usually need to change it at the level of an individual Query.

jborgers · 2025-01-15T14:27:03Z

Good question. The thing is, in case you don't need all the result rows, you are over-allocating memory and over-fetching rows. And with that setting, you do that for all the queries in your app/service. Say, you do pagination by the user of 25 rows on a page, this can be implemented in Hibernate nicely and efficiently with:

Query q = session.getNamedQuery("yourQuery");
 q.setFirstResult(getOffset());
 q.setMaxResults(getResultsPerPage());
 q.setFetchSize(getResultsPerPage());
 return q.list();

This will be quick. In case of thousands of rows, and a very large fetch size, the first page will take much longer to load and unnecessarily with wasteful processing, in case the user just chooses to see one page.

We have documented that here: too-many-returned-rows-or-roundtrips (feedback welcome)

gavinking · 2025-01-15T14:36:26Z

The thing is, in case you don't need all the result rows, you are over-allocating memory and over-fetching rows.

Well, no, that's not right.

JPA fully-instantiates the a List with the complete results of the query anyway. Setting a smaller JDBC fetch size isn't going to change the size of the result list. It's only going to increase the number of round trips. It can only make things worse, never better.
In JPA we control the size of the result list using setMaxResults(). We don't use the JDBC fetch size for that.

Essentially, in any multi-user system, we almost always want to avoid leaving resources open on the database side after we return from a round trip.

This will be quick. In case of thousands of rows, and a very large fetch size, the first page will take much longer to load and unnecessarily with wasteful processing, in case the user just chooses to see one page.

Nononononononnonono that's not how pagination works at all!

We use offset and limit or fetch. We don't leave a cursor open!

gavinking · 2025-01-15T14:46:13Z

feedback welcome

When you read rows from a ResultSet, under the hood JDBC will fetch 10 rows at a time

AFAICT, this is only true for Oracle, FTR.

Be aware! that setFetchSize and setMaxResults are still needed because internally Hibernate calls q.list() without these optimizations. Adding these optimizes for the only valid number of results: 1 and it protects against memory overload of wrong queries while still allowing the exception to be thrown with maxResults.

Nonononono that's just not right.

If you call uniqueResult() or getSingleResult(), and there is more than one object returned, that would be an error, and Hibernate would throw an exception, and you don't need to optimize that case, you need to fix your broken query!
Worse, if your root entity has collections that you're eagerly fetching using join fetch or whatever, setting the JDBC fetch size to 2 will make the query much slower, not faster!

You do not, ever need to set a fetch size when calling uniqueResult() or getSingleResult(), and if you do you can only possibly harm performance, never improve it.

However, setting the fetch size for each query individually is the better solution

No, that's bad advice except in very special cases.

The only case I can think of where this advice is correct is if you're using ScrollableResults or getResultStream(), but my strong advice is to steer well away from those unless you really know what you're doing.

gavinking · 2025-01-15T14:51:14Z

Look, if were so important to futz about directly with the fetch size for every query, there's no way that JPA would have made it through 20 years of existence without us adding it and without anyone even opening a feature request for it until now.

I'm not saying "no" to the feature, because, well, Hibernate has it obviously, and even though it's not much used, I'm sure the need for it arises occasionally. (Apparently we use it in Hibernate Search for batch processing entities, for example.)

But it sounds like all you need is to set hibernate.jdbc.fetch_size to some sensible number (much bigger than 10).

jborgers · 2025-01-15T15:22:21Z

Thanks for the feedback! Clarifying, I will update the documentation.

For uniqueResult() or getSingleResult() we had a case of a broken query which caused an out of memory error because of the q.list() call, and we didn't get to see the hibernate exception. This is something we don't like in production. But I agree that setting fetch size is also not a proper solution.

Wouldn't the implementation be improved by using setMaxResults()?

gavinking · 2025-01-15T15:46:09Z

We had a case of a broken query which caused an out of memory error because of the q.list() call, and we didn't get to see the hibernate exception

It would have to be a pretty massive result set before you would get an OOME!

And how is setting the JDBC fetch size going to help with that? Hibernate is still going to eagerly read all the rows from the database, because Hibernate doesn't know until after it's materialized the whole object graph that you have more than one root entity in it.

Wouldn't the implementation be improved by using setMaxResults()?

That would break any query which fetches a collection.

In principle we could append an unnecessary limit clause if the query doesn't fetch a collection, even though you've just told us that the query returns one root entity. But, well, no. Yuck, Let's not do that. :-)

If you want to call setMaxResults(1), you can, I can't stop you, but it's much better to test your queries properly at development time instead of letting them do arbitrary things which might blow up in production.

gavinking · 2025-01-15T15:50:09Z

Look, honestly, it really looks to me like you're overthinking this.

People get into trouble when they think they can do something better/different to what has already been working perfectly well in millions of programs for more than two decades.

jborgers · 2025-01-15T17:24:48Z

Okay, thanks for your remarks.
So, basically you are saying:

Setting fetch size makes sense especially for Oracle because it's default is so ridiculously low: 10. A more sensible value would be 128 (Sql Server) or higher.
Since typically (in a multi-user system) you create and return the whole list (or limited/maxed) of results of a query to the user, and close the cursor, there is no need for a fetch size smaller than all (or limited/maxed) query result rows to save data-fetching or memory (case: pagination by the user)
Easy solution would be to just set hibernate.jdbc.fetch_size to a high value
Only in exceptional cases like batch processing with huge numbers of rows, it may be useful to use ScrollableResults or getResultStream() and set fetch size per query to control and optimize in detail.
Hibernate will not save you from broken queries, make sure to test properly. In case of the risk of too many result rows, even when it is supposed to be 1, use setMaxResults() to safeguard.

Do you agree?

gavinking · 2025-01-15T17:33:54Z

Yep, exactly.

(None of that means we shouldn't provide a Query.setFetchSize() method in JPA; I'm only saying why it's rarely used.)

jborgers · 2025-01-15T17:35:05Z

Ah, that's a quick response :-) I just updated it.

jborgers · 2025-01-15T19:15:42Z

Note that DB2 jdbc driver seems to have a default fetch size of 32.

gavinking · 2025-01-16T15:44:27Z

Note that DB2 jdbc driver seems to have a default fetch size of 32.

How did you determine this?

On Db2 LUW, Statement.getFetchSize(), PreparedStatement.getFetchSize(), and ResultSet.getFetchSize() all return 0 for me.

jborgers · 2025-01-16T16:31:08Z

Good point, I am actually confused. I found 32 here: https://community.jaspersoft.com/knowledgebase/best-practices/jdbc-fetch-size-affecting-adhoc-sample-data/ And https://venkatsadasivam.com/2009/02/01/jdbc-performance-tuning-with-optimal-fetch-size/ But a good source says unlimited: https://vladmihalcea.com/resultset-statement-fetching-with-jdbc-and-hibernate/ And in recent performance tests we did with db2 on mainframe, setting it to 200 improved the query performance significantly. Op do 16 jan 2025, 16:45 schreef Gavin King ***@***.***>:

…

Note that DB2 jdbc driver seems to have a default fetch size of 32. How did you determine this? On Db2 LUW, Statement.getFetchSize(), PreparedStatement.getFetchSize(), and ResultSet.getFetchSize() all return 0 for me. — Reply to this email directly, view it on GitHub <#696 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AF3TVWYOTA5NXHG5XEZLMDT2K7HXLAVCNFSM6AAAAABVFWJZHWVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDKOJWGA3DSMJUGI> . You are receiving this because you authored the thread.Message ID: ***@***.***>

jborgers · 2025-01-16T18:11:24Z

And this advice seems strange if the default fetch size is unlimited: https://www.ibm.com/docs/en/iis/11.7?topic=optimization-fetch-size-values Op do 16 jan 2025, 17:30 schreef Jeroen Borgers ***@***.***>:

…

Good point, I am actually confused. I found 32 here: https://community.jaspersoft.com/knowledgebase/best-practices/jdbc-fetch-size-affecting-adhoc-sample-data/ And https://venkatsadasivam.com/2009/02/01/jdbc-performance-tuning-with-optimal-fetch-size/ But a good source says unlimited: https://vladmihalcea.com/resultset-statement-fetching-with-jdbc-and-hibernate/ And in recent performance tests we did with db2 on mainframe, setting it to 200 improved the query performance significantly. Op do 16 jan 2025, 16:45 schreef Gavin King ***@***.***>: > Note that DB2 jdbc driver seems to have a default fetch size of 32. > > How did you determine this? > > On Db2 LUW, Statement.getFetchSize(), PreparedStatement.getFetchSize(), > and ResultSet.getFetchSize() all return 0 for me. > > — > Reply to this email directly, view it on GitHub > <#696 (comment)>, > or unsubscribe > <https://github.com/notifications/unsubscribe-auth/AF3TVWYOTA5NXHG5XEZLMDT2K7HXLAVCNFSM6AAAAABVFWJZHWVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDKOJWGA3DSMJUGI> > . > You are receiving this because you authored the thread.Message ID: > ***@***.***> >

gavinking · 2025-01-16T20:51:33Z

I found 32 here:

I mean, these are just some blogs from > 10 years ago. People post wrong information on blogs all the time.

db2 on mainframe

Yeah, well, again, I'm not sure, because that's a different product to Db2 LUW, so perhaps the defaults are different.

And this advice seems strange if the default fetch size is unlimited:

That's the documentation for InfoSphere Information Server you linked to. I have no idea what InfoSphere Information Server is, but I don't think it's Db2.

I think I'm going to just assume that the Db2 JDBC driver isn't lying to me and that the default is indeed 0 i.e. unlimited at least on LUW.

jborgers · 2025-01-17T14:09:31Z

That sounds like a reasonable assumption :-)

DB2 on mainframe may be different indeed; with the app and db2 on the same LPAR, roundtrips are much cheaper than usual, on Linux.

AleksNo · 2025-01-17T21:57:27Z

Maybe it is a good idea to collect very common options like fetchsize, batchsize, batchtype (IN, JOIN, Subquery, EXISTS etc.), cache size etc. and make them a part of the API.

jborgers mentioned this issue Jan 16, 2025

Update doc on jdbc fetch size jborgers/PMD-jPinpoint-rules#475

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add full support for fetch size in JPA #696

Add full support for fetch size in JPA #696

jborgers commented Jan 14, 2025 •

edited

Loading

jborgers commented Jan 14, 2025 •

edited

Loading

gavinking commented Jan 14, 2025

gavinking commented Jan 14, 2025

gavinking commented Jan 15, 2025

jborgers commented Jan 15, 2025 •

edited

Loading

gavinking commented Jan 15, 2025 •

edited

Loading

gavinking commented Jan 15, 2025

gavinking commented Jan 15, 2025

jborgers commented Jan 15, 2025 •

edited

Loading

gavinking commented Jan 15, 2025 •

edited

Loading

gavinking commented Jan 15, 2025

jborgers commented Jan 15, 2025 •

edited

Loading

gavinking commented Jan 15, 2025

jborgers commented Jan 15, 2025

jborgers commented Jan 15, 2025

gavinking commented Jan 16, 2025

jborgers commented Jan 16, 2025 via email

jborgers commented Jan 16, 2025 via email

gavinking commented Jan 16, 2025

jborgers commented Jan 17, 2025

AleksNo commented Jan 17, 2025 •

edited

Loading

Add full support for fetch size in JPA #696

Add full support for fetch size in JPA #696

Comments

jborgers commented Jan 14, 2025 • edited Loading

jborgers commented Jan 14, 2025 • edited Loading

gavinking commented Jan 14, 2025

gavinking commented Jan 14, 2025

gavinking commented Jan 15, 2025

jborgers commented Jan 15, 2025 • edited Loading

gavinking commented Jan 15, 2025 • edited Loading

gavinking commented Jan 15, 2025

gavinking commented Jan 15, 2025

jborgers commented Jan 15, 2025 • edited Loading

gavinking commented Jan 15, 2025 • edited Loading

gavinking commented Jan 15, 2025

jborgers commented Jan 15, 2025 • edited Loading

gavinking commented Jan 15, 2025

jborgers commented Jan 15, 2025

jborgers commented Jan 15, 2025

gavinking commented Jan 16, 2025

jborgers commented Jan 16, 2025 via email

jborgers commented Jan 16, 2025 via email

gavinking commented Jan 16, 2025

jborgers commented Jan 17, 2025

AleksNo commented Jan 17, 2025 • edited Loading

jborgers commented Jan 14, 2025 •

edited

Loading

jborgers commented Jan 14, 2025 •

edited

Loading

jborgers commented Jan 15, 2025 •

edited

Loading

gavinking commented Jan 15, 2025 •

edited

Loading

jborgers commented Jan 15, 2025 •

edited

Loading

gavinking commented Jan 15, 2025 •

edited

Loading

jborgers commented Jan 15, 2025 •

edited

Loading

AleksNo commented Jan 17, 2025 •

edited

Loading