Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Caching improvements #3989

Open
wants to merge 27 commits into
base: master
Choose a base branch
from
Open

Caching improvements #3989

wants to merge 27 commits into from

Conversation

connorjward
Copy link
Contributor

@connorjward connorjward commented Jan 23, 2025

Needs firedrakeproject/fiat#132

  • Disk cache generated code
  • Only generate loopy code on one rank and broadcast to others
  • Use PYOP2_SPMD_STRICT in CI, which includes some extra checks
  • Various unsafe hashing bug fixes

Thanks to @pbrubeck for flagging this. Apparently these changes make a big difference.

* Disk cache generated code
* Only generate loopy code on one rank and broadcast to others
@connorjward connorjward requested a review from ksagiyam January 23, 2025 16:21
Copy link

github-actions bot commented Jan 23, 2025

TestsPassed ✅Skipped ⏭️Failed ❌
Firedrake real8124 ran7407 passed716 skipped1 failed

Copy link

github-actions bot commented Jan 23, 2025

TestsPassed ✅Skipped ⏭️Failed ❌
Firedrake complex8111 ran6562 passed1549 skipped0 failed

pyop2/global_kernel.py Outdated Show resolved Hide resolved
pyop2/global_kernel.py Outdated Show resolved Hide resolved
pyop2/caching.py Show resolved Hide resolved
firedrake/tsfc_interface.py Outdated Show resolved Hide resolved
JHopeCollins
JHopeCollins previously approved these changes Feb 3, 2025
Copy link
Member

@JHopeCollins JHopeCollins left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couple of suggestions to make the comments clearer in the future, but seems sensible to me.

It does seem to prevent some of the errors we were seeing in asQ (or at least it prevents non-determinism between ranks from causing errors).

scripts/firedrake-run-split-tests Outdated Show resolved Hide resolved
firedrake/functionspaceimpl.py Show resolved Hide resolved
firedrake/functionspaceimpl.py Outdated Show resolved Hide resolved
pyop2/caching.py Show resolved Hide resolved
Comment on lines 318 to 326
def _make_interface_key(interface):
if interface:
# Passing interface here is a small hack done in patch.py. What
# really matters for caching is what is used in the 'dont_split' kwarg.
assert isinstance(interface, functools.partial)
assert interface.func is KernelBuilder
return tuple(map(hash_expr, interface.keywords["dont_split"]))
else:
return None
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Calling hash_expr element by element will not produce useful cache key here. If we want to hash dont_split, we should compute the indices of the elements of dont_split in (sub)form.coefficients().

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I don't really understand what you mean. Why isn't it useful? It might be overkill but I don't think that it is necessarily wrong.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given a form with form.coefficients() = (f, g, h, l), say, what distinguishes dont_split = (f,) and dont_split = (g,) is the index of f and g in (f, g, h, l); i.e. (0,) and (1,), so we can not get useful cache key by just looking at the individual coefficients.

For instance, if we run:

mesh = UnitIntervalMesh(1)
V = FunctionSpace(mesh, "CG", 1)
f = Function(V, count=1000)
g = Function(V, count=2000)
print(hash_expr(f))
print(hash_expr(g))

we get:

877b3c5df999b747710667a087dece19123b8045b21ba174c593b75c3c87ea1168b2fec9f8ce0634834ac3a3aea52b25ad9038474b9ef8319deb63b73f4de2d0
877b3c5df999b747710667a087dece19123b8045b21ba174c593b75c3c87ea1168b2fec9f8ce0634834ac3a3aea52b25ad9038474b9ef8319deb63b73f4de2d0
``

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah you're quite right. I will fix that tomorrow.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have fixed this.

@@ -87,6 +88,7 @@ jobs:
--install defcon \
--install gadopt \
--install asQ \
--package-branch fiat connorjward/fix-quadrature-hash \
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
--package-branch fiat connorjward/fix-quadrature-hash \

Co-authored-by: Josh Hope-Collins <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants