ST_Explode performance issue #1209

Open
ebocher opened this issue Sep 1, 2021 · 6 comments


@ebocher
Member

ebocher commented Sep 1, 2021

It seems that the ST_Explode function is badly implemented.

In debug mode, the test https://github.com/orbisgis/h2gis/blob/master/h2gis-functions/src/test/java/org/h2gis/functions/spatial/SpatialFunctionTest.java#L126 shows that the function is visited 3 times instead of once.

I don't know whether H2GIS 1.5 has the same bug.

@nicolas-f @SPalominos @gpetit @j3r3m1 @ELSW56

@j3r3m1
Contributor

j3r3m1 commented Sep 2, 2021

Sounds like quite a lot of time saved if this gets solved! =)

@nicolas-f
Member

Yes, it is visited multiple times because H2 needs to get the field names. The optimisation has already been done: the function effectively iterates over the source table only once.

@nicolas-f
Member

nicolas-f commented Sep 2, 2021

http://www.h2database.com/html/features.html#user_defined_functions

A function that returns a result set can be used like a table. 
However, in this case the function is called at least twice: 
first while parsing the statement to collect the column names (with parameters set to null where not known at compile time). 
And then, while executing the statement to get the data (maybe multiple times if this is a join).
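
To make the quoted behaviour concrete, here is a minimal sketch in plain Java with no H2 dependency. Everything in it is hypothetical and only simulates the engine: `LazyTable`, `TwoPhaseCallSketch`, `explode` and `query` are illustration names, not H2 or H2GIS APIs, and the "geometries" are just comma-separated strings. It shows a function being called twice (once with a null parameter for the column names, once for the data) while the source rows are only read during the second pass, because the rows are produced lazily.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for a table function result: column names plus a lazy row source.
final class LazyTable {
    final List<String> columns;
    final Iterable<String> rows;

    LazyTable(List<String> columns, Iterable<String> rows) {
        this.columns = columns;
        this.rows = rows;
    }
}

final class TwoPhaseCallSketch {
    int calls = 0;          // how many times the simulated engine invoked the function
    int sourceRowsRead = 0; // how many source rows were actually consumed

    // Hypothetical table function: one output row per comma-separated part.
    LazyTable explode(List<String> source) {
        calls++;
        List<String> columns = Collections.singletonList("PART");
        if (source == null) {
            // Parse pass: the parameter is unknown, so only the column names are returned.
            return new LazyTable(columns, Collections.<String>emptyList());
        }
        // Execute pass: nothing is read until the caller iterates over the rows.
        Iterable<String> rows = () -> source.stream()
                .peek(s -> sourceRowsRead++)
                .flatMap(s -> Arrays.stream(s.split(",")))
                .iterator();
        return new LazyTable(columns, rows);
    }

    // Simulates one SQL query: a parse pass followed by an execute pass.
    int query(List<String> source) {
        LazyTable meta = explode(null);
        if (!meta.columns.contains("PART")) {
            throw new IllegalStateException("unexpected column list");
        }
        LazyTable data = explode(source);
        int outputRows = 0;
        for (String part : data.rows) {
            outputRows++;
        }
        return outputRows;
    }

    public static void main(String[] args) {
        TwoPhaseCallSketch sketch = new TwoPhaseCallSketch();
        int outputRows = sketch.query(Arrays.asList("a,b", "c"));
        // prints: calls=2 sourceRowsRead=2 outputRows=3
        System.out.println("calls=" + sketch.calls
                + " sourceRowsRead=" + sketch.sourceRowsRead
                + " outputRows=" + outputRows);
    }
}
```

The point of the lazy row source is that extra calls made only for metadata stay cheap; it does not reduce the number of calls itself.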

@ebocher
Member Author

ebocher commented Sep 2, 2021

Thanks @nicolas-f
The iteration is done each time H2 calls the function. So in this case (ST_Explode), the parseRow() method is called 3 times.
I don't know if it's an H2 limitation, but I'd like parseRow() to be called only once per SQL query.

@ebocher
Member Author

ebocher commented Sep 6, 2021

@katzyn
We use table functions extensively to process large geometry tables. Two examples are ST_Explode, which explodes a collection of geometries into single geometries, and ST_MakeGrid, which builds a square grid on a geometry's bounding box.
I'm looking for a way or a workaround to visit a table function only once. I understand that the function is called at least twice, but I'd like to know whether it's possible to get the same behaviour as an aggregate function, which offers init and getResult methods.
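
To illustrate the kind of lifecycle meant here, a rough sketch in plain Java follows. The `TableFunctionSketch` interface is purely hypothetical and does not exist in H2; it only borrows the `init` and `getResult` names from the aggregate idea, and `columnNames`, `ExplodeSketch` and `LifecycleDemo` are made-up names. The assumption is that the engine would ask for column names separately, so the data pass runs exactly once per query.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

// Hypothetical contract, not an H2 interface.
interface TableFunctionSketch {
    List<String> columnNames();     // metadata only, must not touch the data
    void init(List<String> source); // called once per query, before rows are read
    Iterator<String> getResult();   // called once per query to stream the rows
}

// Toy "explode": one output row per comma-separated part of each source value.
final class ExplodeSketch implements TableFunctionSketch {
    int resultPasses = 0; // counts how many times the data pass was started
    private List<String> source = Collections.emptyList();

    @Override
    public List<String> columnNames() {
        return Collections.singletonList("PART");
    }

    @Override
    public void init(List<String> source) {
        this.source = source;
    }

    @Override
    public Iterator<String> getResult() {
        resultPasses++;
        return source.stream()
                .flatMap(s -> Arrays.stream(s.split(",")))
                .iterator();
    }
}

final class LifecycleDemo {
    // Simulated engine: metadata at parse time, then a single init/getResult pair.
    static int runQuery(TableFunctionSketch fn, List<String> source) {
        fn.columnNames();
        fn.init(source);
        int rows = 0;
        for (Iterator<String> it = fn.getResult(); it.hasNext(); it.next()) {
            rows++;
        }
        return rows;
    }

    public static void main(String[] args) {
        ExplodeSketch fn = new ExplodeSketch();
        int rows = runQuery(fn, Arrays.asList("a,b", "c"));
        System.out.println("rows=" + rows + " resultPasses=" + fn.resultPasses);
    }
}
```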

@katzyn

katzyn commented Sep 6, 2021

External table-valued functions in H2 have always had various defective-by-design implementations, and fixes for some issues broke other things. They need a brand-new API designed in some reasonable way.
