This repository has been archived by the owner on Nov 1, 2024. It is now read-only.

Initial Release #1

Closed
wants to merge 27 commits into from
Conversation

avik-pal
Member

@avik-pal avik-pal commented Nov 17, 2023

  • Basic Operations
    • Broadcasting
      • Type Stability?
    • Reduction Operations
    • Unary Operations
  • Linear Algebra
    • QR
    • LU
    • Cholesky
    • Direct Ldiv
    • Batched Matrix Multiplication
  • CUDA
    • QR
      • Batched Solve?
      • Square
      • Long Rectangle
    • LU
      • Batched Solve?
    • Cholesky
    • Direct Ldiv
      • Square
      • Long Rectangle
      • Wide Rectangle
    • Batched Matrix Multiply
  • Tests
  • Aqua
  • Compatibility with LinearSolve.jl
    • Krylov GMRES gives incomplete results
  • Basic Usage example with NonlinearSolve.jl

@avik-pal avik-pal force-pushed the ap/basic_impl branch 2 times, most recently from ce02274 to ecd53e7 Compare November 18, 2023 08:18
@avik-pal
Member Author

For LinearSolve.jl, LU and QR should just work, with a fallback for \ that loops over each batch. To make this efficient for GPU arrays, we need to call the batched solvers directly. Surprisingly, CUBLAS and CUSOLVER have APIs for direct batched linear solves but no APIs that reuse a batched QR (or similar factorization) to do the linear solve efficiently.

Currently I assume that the batch sizes of A and b match, but this is easy to generalize. @ChrisRackauckas for the case you mentioned, with 1 A and N bs, we can handle it on our end by using trsm instead of trsv. So all users need to do is:

LinearProblem(BatchedArray(rand(4, 4, 1)), BatchedArray(rand(4, 16)))  # Single `A` but 16 `b`s
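
A rough sketch of the two cases on plain CPU arrays, using only the standard LinearAlgebra library. The helpers `batched_solve_loop` and `shared_solve` are hypothetical names for illustration and are not part of BatchedArrays.jl or LinearSolve.jl. The sketch assumes square systems and a `(n, n, nbatch)` layout for `A`.

```julia
using LinearAlgebra

# Hypothetical helper: the generic fallback, one factorization and one
# vector solve per batch. Assumes A is (n, n, nbatch) and b is (n, nbatch).
function batched_solve_loop(A::AbstractArray{T,3}, b::AbstractMatrix{T}) where {T}
    n, m, nbatch = size(A)
    @assert n == m "this sketch only handles square systems"
    @assert size(b) == (n, nbatch) "batch sizes of A and b must match"
    x = similar(b)
    for k in 1:nbatch
        x[:, k] = lu(A[:, :, k]) \ b[:, k]
    end
    return x
end

# Hypothetical helper: a single A shared by every right-hand side.
# Factorize once, then solve against the whole matrix of b's in one call
# (a matrix right-hand side instead of N separate vector solves).
function shared_solve(A::AbstractArray{T,3}, b::AbstractMatrix{T}) where {T}
    @assert size(A, 3) == 1 "expected exactly one A"
    F = lu(A[:, :, 1])
    return F \ b
end

A1 = reshape(rand(4, 4) + 4I, 4, 4, 1)  # one well-conditioned A
b = rand(4, 16)                         # 16 right-hand sides

x_shared = shared_solve(A1, b)
x_loop = batched_solve_loop(repeat(A1, 1, 1, 16), b)
```

Both calls should agree up to rounding. The shared version does one factorization instead of 16, which is the saving the single-`A` case is after.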

@avik-pal avik-pal force-pushed the ap/basic_impl branch 2 times, most recently from a8477e1 to 95cbb1c Compare November 21, 2023 01:09

@avik-pal avik-pal force-pushed the ap/basic_impl branch 3 times, most recently from 64b5930 to 6704c42 Compare November 21, 2023 02:35
@avik-pal
Member Author

avik-pal commented Nov 28, 2023

SimpleNonlinearSolve.jl example usage

using BatchedArrays, SimpleNonlinearSolve

u0 = BatchedArray(rand(3, 5))

prob1 = NonlinearProblem((u, p) -> u .^ 2 .- p, u0, 2.0)

solve(prob1, SimpleBroyden())

solve(prob1, SimpleDFSane())

solve(prob1, SimpleLimitedMemoryBroyden(; threshold = 2))

solve(prob1, SimpleNewtonRaphson())

solve(prob1, SimpleKlement())

solve(prob1, SimpleHalley())

I am leaving out trust region (TR) for now since there is a potential correctness issue that needs careful investigation. In short, branching is almost impossible to handle nicely if there are conditional computations inside the branch.
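
One possible reading of the branching problem, sketched on plain arrays: a condition such as "did the trial step reduce the residual?" yields one Bool per batch rather than a single Bool, so an ordinary `if` cannot pick one path for the whole array. The `masked_step` helper and its acceptance rule are hypothetical and only illustrate the masking workaround. They are not the trust-region logic from SimpleNonlinearSolve.jl.

```julia
# Same residual as the example above; each column of `u` is one batch.
residual(u, p) = u .^ 2 .- p

# Hypothetical step rule: keep the trial point only in batches where it
# lowers the squared residual norm, and keep the old point elsewhere.
function masked_step(u::AbstractMatrix, trial::AbstractMatrix, p)
    r_old = vec(sum(abs2, residual(u, p); dims = 1))      # one value per batch
    r_new = vec(sum(abs2, residual(trial, p); dims = 1))
    accept = r_new .< r_old                               # Bool per batch
    # Both outcomes are available and a mask selects per column. Any extra
    # work that should only run "inside the branch" would still run for
    # every batch, which is where it gets awkward.
    u_next = ifelse.(reshape(accept, 1, :), trial, u)
    return u_next, accept
end

u = [1.0 1.0; 1.0 1.0]
trial = [1.4 0.5; 1.4 0.5]   # better for batch 1, worse for batch 2
u_next, accept = masked_step(u, trial, 2.0)
```

Here the first batch takes the trial point while the second keeps its old value, so the two batches effectively follow different branches in the same call.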

Fun part: methods that use the Jacobian will be much faster with BatchedArrays, since we can automatically color and propagate all the batch duals together. So the current SimpleNewtonRaphson with BatchedArrays is faster than the pre-1.0 BatchedSimpleNewtonRaphson.
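
To illustrate why coloring helps, here is a dependency-free sketch. It assumes the function acts on each batch (column) independently, so the full Jacobian is block diagonal and the same state index can be seeded in every batch at once. Note the swap: it uses central finite differences in place of dual numbers, and `batched_jacobian` is a hypothetical helper, not the BatchedArrays.jl implementation.

```julia
# Acts independently on each column (batch) of u, as in the example above.
f(u, p) = u .^ 2 .- p

# Hypothetical helper: returns J with J[j, i, k] ≈ ∂f(u)[j, k] / ∂u[i, k].
# Only n sweeps are needed, regardless of the number of batches.
function batched_jacobian(f, u::AbstractMatrix, p; h = 1e-6)
    n, nbatch = size(u)
    J = zeros(eltype(u), n, n, nbatch)
    for i in 1:n                          # one "color" per state index
        du = zeros(eltype(u), n, nbatch)
        du[i, :] .= h                     # seed row i in every batch together
        J[:, i, :] .= (f(u .+ du, p) .- f(u .- du, p)) ./ (2h)
    end
    return J
end

u = rand(3, 5)
J = batched_jacobian(f, u, 2.0)   # 3 sweeps instead of 15 for the 15x15 Jacobian
```

For this `f`, each block `J[:, :, k]` should be approximately diagonal with entries `2 .* u[:, k]`.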

@avik-pal avik-pal closed this Mar 13, 2024