Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JOSS review - performances #318

Open
Failxxx opened this issue Feb 3, 2025 · 7 comments
Open

JOSS review - performances #318

Failxxx opened this issue Feb 3, 2025 · 7 comments

Comments

@Failxxx
Copy link

Failxxx commented Feb 3, 2025

Hello,
We can see on the landing page of your project a graph showing the execution time of "parsing a volVectorField with 200k cells" using foamlib compared to pyfoam. Can you please tell us more about it?

  • How did you measure execution time?
  • What are the specifications of the machine on which you did the measurement?
  • Can you please provide instructions in order to reproduce it?

Sincerely,
Félix.

@Failxxx
Copy link
Author

Failxxx commented Feb 3, 2025

Related to: openjournals/joss-reviews/issues/7633

@gerlero
Copy link
Owner

gerlero commented Feb 4, 2025

We can see on the landing page of your project a graph showing the execution time of "parsing a volVectorField with 200k cells" using foamlib compared to pyfoam. Can you please tell us more about it?

@Failxxx sure. Thanks for asking!

How did you measure execution time?

I used Python's timeit as shown below.

What are the specifications of the machine on which you did the measurement?

Apple MacBook Air (M1, 2020) with 8 GB of RAM.

Can you please provide instructions in order to reproduce it?

Of course. Here's the script to reproduce the measurements:

#!/usr/bin/env python3

import timeit

from foamlib import FoamFieldFile
from PyFoam.RunDictionary.ParsedParameterFile import ParsedParameterFile

FoamFieldFile("U").internal_field = [[0.0, 0.0, 0.0]] * 200_000

print(min(timeit.repeat(lambda: FoamFieldFile("U").internal_field, number=1)))
print(min(timeit.repeat(lambda: ParsedParameterFile("U")["internalField"], number=1)))

@Failxxx
Copy link
Author

Failxxx commented Feb 7, 2025

Thank you very much.
I ran the script on my computer (AMD Ryzen™ 7 5825U with Radeon™ Graphics × 16, 2.0 to 4.5Ghz, 16GB RAM) :

(joss) felix@ThinkPad-Felix:~/Documents/joss$ python3 foamlib-test.py 
0.39394231400001445
4.032019166999817
(joss) felix@ThinkPad-Felix:~/Documents/joss$ 

I measured a speedup of ~10 (4.03/0.39), which is already impressive!

So I would recommend to mention the specifications of the computer that lead to the result you mention on the landing page of the project to avoid any misunderstanding.

What do you think @akashdhruv, @AndreWeiner and @paugier?

@AndreWeiner
Copy link

Personally, I think the speed comparison is not needed at all. Nobody (hopefully) writes large simulation data in ascii format. I don't remember the last time I ran any simulation with ascii output. A big advantage of foamlib over PyFoam and similar libraries is the ability to write fields in both binary and ascii; all the rest is secondary, in my opinion.
However, I agree that if performance benchmarks are shown, there should be a precise description of how the results were obtained.
Cheers

@gerlero
Copy link
Owner

gerlero commented Feb 7, 2025

@Failxxx @AndreWeiner thanks! I know the discussion on this is still ongoing, but here go some proposed changes anyway:

Let me know if these work for you, or if you'd rather have me get rid of the comparison or augment it in any way (e.g. add a bar for parsing a binary file?).

@paugier
Copy link

paugier commented Mar 3, 2025

It would also be useful to provide in the repo the code used for the benchmark.

For the record, fluidsimfoam is a bit faster (15%, at least here) for this benchmark:

import timeit

from foamlib import FoamFieldFile

from fluidsimfoam.foam_input_files import read_field_file

FoamFieldFile("U").internal_field = [[0.0, 0.0, 0.0]] * 200_000

print(min(timeit.repeat(lambda: FoamFieldFile("U").internal_field, number=1)))
print(min(timeit.repeat(lambda: read_field_file("U").get_array(), number=1)))

I agree that this is not a very representative benchmark because such data are usually saved in binary format.

@gerlero
Copy link
Owner

gerlero commented Mar 3, 2025

@paugier Thanks for the review!

It would also be useful to provide in the repo the code used for the benchmark.

Let me know if this works for you.

For the record, fluidsimfoam is a bit faster (15%, at least here) for this benchmark:

It is indeed ~13% faster on my computer too. Honestly, I didn't know about that package (even though I did searches for similar stuff before setting out to write the parser)... I've looked at the code and it seems like you're using Lark to parse, right? TBH I experimented a bit with Lark too a few months back, but then decided to keep using pyparsing and just rewrite the performance-critical parts using regular expressions. Looks like there's still some performance gains for foamlib's parsing on the table.

Seems like it would be interesting to find a way to standardize on and share a parsing grammar/library between fluidsimfoam and this package (and potentially, any other OpenFOAM-related Python packages) in the future.

This was referenced Mar 3, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants