-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Poisson solver #32
Poisson solver #32
Conversation
To get good performance (at scale) you need a preconditioner. The preconditioner is typically based on an actual matrix, e.g. a low order discretisation. I'm working on an implementation of this here 3decomp/poissbox#20, once I've got the preconditioner sorted I'll port it over to x3d2. The high-order Laplacian is working and a star-based preconditioner is implemented, but it's not currently effective. I don't think we want to implement CG ourselves though! Note that the "serial" aspect of the linked PR is only due to the TDMA implementation I'm using to keep things simple, the CG and AMG preconditioner are parallel, and the solver can be run using the low-order discretisation in parallel. |
For consistency of the discretisation the Laplacian operator should be built by combining the div and grad operators. |
The overall structure looks good. I like the simple calling code in the solve. Seems unlikely that the CG solver won't require some backend-specific stuff (so might need its own class) but the generic procedure can handle that case well, and can extend to other options if needed. |
At the current stage the codebase is only able to pick between two dummy functions, poisson_fft or poisson_cg at runtime via the function pointers. I think poisson_cg can then issue more sophisticated calls to PETSc for example. If you think this looks good I'm happy to leave poisson_cg as is so that you can port your stuff when ready. My plan was to complete only the fft solver in this PR so that we can test the codebase in full, but just wanted to implement a way to pick between two methods we'll definitely have in the codebase soon. |
This is very interesting, I wasn't expecting this at all. But I think the gradient and divergence subroutines in solver is not the exact ones you need, right? Because they switch between velocity grid and pressure grid. I think the iterative solver will require both gradient and divergence starting from the pressure grid and ending in the pressure grid. If so we'll need a new set of tdsops to support this operation and add two new grad and div subroutines later on. |
Actually we can access the backend stuff from the solver, all methods in solver do it already. For example a naive CG implementation would only require a laplacian and vector squares I guess.
But if we're doing the CG stuff via PETSc, then its very likely that we'll need a class to keep PETSc related complexity away from backends too. |
The pattern of execution is |
ecdb773
to
bcbaf18
Compare
The only remaining part is the reorder functions operate on complex arrays. I think I can get them working tomorrow. |
adbc5ab
to
a251fa9
Compare
This fixes #55 |
Just to let you know when reviewing, all the reorder and reshape functions in complex.f90 will be removed when we switch to cuFFTMp. Only thing I want to add before we merge is a periodic call to curl and divergence functions to report enstrophy and divergence of velocity field as the simulation goes on. We need a simple inner product function in the backends to finalise that. |
Can you mark these with a comment like "TODO remove after cuFFTMp" and create an issue for it? |
A call for last round of review folks! The input settings in the main file is arranged so that it produces the result I shared in the slack channel. Basically a TGV from 0 to 20 on a 256^3 grid. The output subroutine is not very good but for the time being useful to see if there is anything going bad with the TGV, because it prints the enstrophy. I suppose this will evolve a lot into more useful output stuff later.
They won't be being called once the cuFFTMp stuff is in and then like any redundant function they'll need to be removed. However we might want to compare performance of the two approaches so for a short while both the current FFT solver and cuFFTMp may coexist too. I think it'll be more clear with the cuFFTMp PR so not going ahead with TODO comments for now. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall excellent. Needs comments + possible name changes for clarity & consistency. Might want tests but this PR is already large. Should be an issue if we decide not to put tests in here.
|
||
end function init | ||
|
||
subroutine fft_forward_cuda(self, f) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would be good to have working tests for these before trying to implement cuFFTMP.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@semi-h implement in this PR or create an issue?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, cuFFTMp provides 3D forward and inverse FFT transforms, and our functions here will only carry out the calls to cuFFTMp library so probably there is no value testing them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the forward/backward FFTs sure, no need to test individually. But we're currently not unit testing the Poisson solver at all. Since we have a working implementation, I think it would be easiest to implement a correct test of this working solver, then apply the same test to cuFFTMP.
If you don't already have one kicking around, the analytical test case I usually use for Poisson solve is solving
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few minor comments for consideration. Otherwise I agree with Jamie's points.
Adding for reference (I think we've discussed this in meetings) the direct evaluation of the Laplacian using high-order methods for 2nd derivatives on the pressure grid seems to work well (as a standalone solver). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with Paul and Jamie's comments, just added a couple minor ones.
Last commit fixed most issues raised. There are a few comments about the reshape and complex_reorder functions, I admit that their naming scheme is terrible and there is a lot to do make them better, but there is a very good chance that cuFFTMp will work fine and they'll all be removed from the codebase entirely very soon. Because we expect their lifespan to be very short, I don't want to invest time to think about a proper naming scheme, adding comments, change the way we set threads/blocks for calling them, or testing them and willing to merge the PR as is. If we decide to keep them longer for any reason I'll be happy to implement all the suggestions. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is nearly done! Looks like there are a few minor things still needing addressed, and a proper test of the Poisson solver implemented.
|
||
end function init | ||
|
||
subroutine fft_forward_cuda(self, f) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the forward/backward FFTs sure, no need to test individually. But we're currently not unit testing the Poisson solver at all. Since we have a working implementation, I think it would be easiest to implement a correct test of this working solver, then apply the same test to cuFFTMP.
If you don't already have one kicking around, the analytical test case I usually use for Poisson solve is solving
Its a good idea to have a unit test for the Poisson solver, but I don't want to implement it here as the PR is already large. Opened an issue to deal with it later on #74. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, I'm happy with the TGV showing the Poisson solver works at this stage.
Poisson solver is the final part we need for running simulations with x3d2. The plan is to have an FFT based direct solver and use it whenever we can, and also have a CG based iterative solver. So I came up with this structure to have both methdos and decide which one to run at run time.
Few important considerations so that the structure makes more sense:
Closes #55