Reorganize sinkhorn_unbalanced, improve convergence checks, and fix GPU issues #80

Merged 13 commits on May 28, 2021

devmotion (Member) commented May 27, 2021

This PR has a similar motivation to #79 and tries to fix some spurious test errors. However, I took the opportunity to rewrite large parts of the sinkhorn_unbalanced implementation and to add more tests.

The PR

  • adds atol and rtol options for a more fine-grained convergence analysis and deprecates tol, similar to the other PR,
  • changes the convergence check from 0.5 * (err_a + err_b), where err_a and err_b are the relative distances in the inf-norm between the last two iterates of a and b (same as in POT), to a 2-norm check equivalent to isapprox(vcat(a, b), vcat(a_old, b_old); atol, rtol) but implemented more efficiently,
  • adds a convergence_check argument that controls how often convergence is checked,
  • renames max_iter to maxiter for consistency with sinkhorn,
  • deprecates the keyword arguments proxdiv_F1 and proxdiv_F2 and uses multiple dispatch instead,
  • optimizes the implementation of sinkhorn_unbalanced and, in particular, removes all unnecessary allocations and branches,
  • extends the tests.
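
To illustrate the switch from proxdiv keyword arguments to multiple dispatch, here is a minimal sketch. The helper name proxdiv and its argument order are hypothetical and not the package's internal API. It assumes that a real number is read as the strength λ of a KL marginal penalty, for which the standard scaling update is (p ./ s) .^ (λ / (λ + ε)), and that any other argument is a user-supplied callable.

```julia
# Hypothetical helper, NOT the package's internal API.
# A real number is interpreted as the strength λ of a KL marginal penalty.
proxdiv(λ::Real, s, p, ε) = (p ./ s) .^ (λ / (λ + ε))

# Anything else is assumed to be a callable with signature f(s, p, ε).
proxdiv(f, s, p, ε) = f(s, p, ε)

s = [0.5, 2.0]   # current value of K * b (illustrative numbers)
p = [1.0, 1.0]   # marginal weights
ε = 0.1

a_kl = proxdiv(0.4, s, p, ε)                       # KL penalty with λ = 0.4
a_custom = proxdiv((s, p, ε) -> p ./ s, s, p, ε)   # custom callable (balanced update)
```

With this pattern a single positional argument covers both cases, so no separate keyword argument is needed to swap in a custom operator.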

Of course, one could use a different norm in the convergence check - as mentioned in the other PR, all standard norms are equivalent. However, in contrast to the marginal check in sinkhorn, where one deals with probability measures, there is no intuitive motivation for using the 1-norm for the scaling factors. Moreover, it seems a bit surprising to weight the errors of a and b equally even though they usually have different dimensions.
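
The two criteria can be compared in a small CPU-only sketch. The function names old_error and stacked_isapprox are made up for illustration, and the normalization in old_error is an assumption about what "relative distance" means here. The second function relies only on the documented definition of isapprox for arrays, norm(x - y) <= max(atol, rtol * max(norm(x), norm(y))), and evaluates it for the stacked vectors without calling vcat.

```julia
using LinearAlgebra

# Old-style criterion (assumed normalization): mean of the relative
# inf-norm changes of a and b, each weighted equally.
function old_error(a, a_old, b, b_old)
    err_a = norm(a - a_old, Inf) / max(norm(a, Inf), norm(a_old, Inf))
    err_b = norm(b - b_old, Inf) / max(norm(b, Inf), norm(b_old, Inf))
    return 0.5 * (err_a + err_b)
end

# New-style criterion: 2-norm isapprox on the stacked vectors (a, b),
# accumulated from squared norms so that no concatenated array is built.
function stacked_isapprox(a, a_old, b, b_old; atol=0.0, rtol=sqrt(eps()))
    sqdiff = sum(abs2(x - y) for (x, y) in zip(a, a_old)) +
             sum(abs2(x - y) for (x, y) in zip(b, b_old))
    sqnew = sum(abs2, a) + sum(abs2, b)
    sqold = sum(abs2, a_old) + sum(abs2, b_old)
    return sqrt(sqdiff) <= max(atol, rtol * sqrt(max(sqnew, sqold)))
end

a, a_old = [1.0, 2.0], [1.0, 2.001]
b, b_old = [0.5, 0.5, 0.5], [0.5, 0.5, 0.5]

loose = stacked_isapprox(a, a_old, b, b_old; rtol=1e-2)
strict = stacked_isapprox(a, a_old, b, b_old; rtol=1e-6)
```

In the stacked form every entry of a and b contributes equally to the error, so the longer vector is no longer down-weighted the way it is when two per-vector errors are averaged.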

Edit: I just checked, and the rewrite also fixes GPU issues. Currently, one can't use sinkhorn_unbalanced with CUDA, but with this PR it is possible to just run

julia> using CUDA, OptimalTransport

julia> M, N = 200, 250;

julia> mu = cu(fill(1/N, M));

julia> nu = cu(fill(1/N, N));

julia> C = cu(((x, y) -> abs2(x - y)).(rand(M), rand(1, N)));

julia> lambda1 = 0.4f0;

julia> lambda2 = 0.5f0;

julia> eps = 0.01f0;

julia> sinkhorn_unbalanced(mu, nu, C, lambda1, lambda2, eps)


coveralls commented May 27, 2021

Pull Request Test Coverage Report for Build 883952337

Warning: This coverage report may be inaccurate.

This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.

Details

  • 52 of 52 (100.0%) changed or added relevant lines in 1 file are covered.
  • 27 unchanged lines in 1 file lost coverage.
  • Overall coverage increased (+2.4%) to 88.364%

Files with Coverage Reduction    New Missed Lines    %
src/OptimalTransport.jl          27                  88.36%

Totals
Change from base Build 883949332: 2.4%
Covered Lines: 243
Relevant Lines: 275

devmotion changed the title from "Reorganize sinkhorn_unbalanced and improve convergence checks" to "Reorganize sinkhorn_unbalanced, improve convergence checks, and fix GPU issues" on May 27, 2021
src/OptimalTransport.jl (outdated)
- function sinkhorn_unbalanced2(μ, ν, C, λ1, λ2, ε; plan=nothing, kwargs...)
+ function sinkhorn_unbalanced2(
+     μ, ν, C, λ1_or_proxdivF1, λ2_or_proxdivF2, ε; plan=nothing, kwargs...
+ )
Member

Shouldn't we rename to sinkhorn_unbalanced_cost?

Member Author

Maybe. I still think the main problem is not the names of the functions but the number of functions: they all do the same thing, but for slightly different problems (many even for the same problem) and with different algorithms. So the natural approach would be to dispatch on both the problem and the algorithm, which would also solve the problem that #66 tries to address but doesn't fix in a general and extensible way.

Member Author

In any case, I would suggest that both renaming and reorganization of functions should be done in a separate PR.

Member

Yeah, you are right. It's better that we decide on the naming first and then submit it in another PR.
"I still think the main problem is not the name of the functions but the amount of functions - they are all doing the same thing but for slightly different problems (many even for the same) and different algorithms." Why is this a problem?

Member Author

I think it's a problem because, if we have a different function for every combination of problem and algorithm (e.g. sinkhorn, sinkhorn_stabilized, sinkhorn_stabilized_epsscaling, sinkhorn_unbalanced, etc.),

  • the API is unstructured and difficult for users to navigate,
  • it is very difficult to compose functionality (e.g. if I want to use epsilon scaling with the unbalanced Sinkhorn algorithm, I have to write a new function instead of just composing epsilon scaling with the unbalanced algorithm),
  • we completely neglect multiple dispatch, which is arguably the biggest feature of Julia.
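
As a toy illustration of dispatching on both the problem and the algorithm, consider the following sketch. All type and function names (Balanced, Unbalanced, Sinkhorn, EpsScaling, solve_ot) are hypothetical and do not exist in the package; the methods only return descriptive strings, since the point is the composition pattern rather than the numerics.

```julia
# Hypothetical problem types (not part of OptimalTransport.jl).
abstract type AbstractProblem end
struct Balanced <: AbstractProblem end
struct Unbalanced{T<:Real} <: AbstractProblem
    λ1::T
    λ2::T
end

# Hypothetical algorithm types; EpsScaling wraps any other algorithm.
abstract type AbstractAlgorithm end
struct Sinkhorn <: AbstractAlgorithm end
struct EpsScaling{A<:AbstractAlgorithm} <: AbstractAlgorithm
    inner::A
    steps::Int
end

describe(::Balanced) = "balanced"
describe(::Unbalanced) = "unbalanced"

# One generic entry point; behaviour is selected by the argument types.
solve_ot(p::AbstractProblem, ::Sinkhorn) = "Sinkhorn for the $(describe(p)) problem"
function solve_ot(p::AbstractProblem, alg::EpsScaling)
    return "eps scaling ($(alg.steps) steps) of " * solve_ot(p, alg.inner)
end

result = solve_ot(Unbalanced(0.4, 0.5), EpsScaling(Sinkhorn(), 5))
```

Here epsilon scaling composes with the unbalanced problem without a dedicated function for that combination, because EpsScaling only delegates to whatever method exists for its inner algorithm.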

codecov-commenter commented May 27, 2021

Codecov Report

Merging #80 (31dc3f6) into master (dfcc088) will increase coverage by 2.42%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master      #80      +/-   ##
==========================================
+ Coverage   85.93%   88.36%   +2.42%     
==========================================
  Files           1        1              
  Lines         256      275      +19     
==========================================
+ Hits          220      243      +23     
+ Misses         36       32       -4     
Impacted Files             Coverage Δ
src/OptimalTransport.jl    88.36% <100.00%> (+2.42%) ⬆️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

devmotion merged commit 2eee9f4 into master on May 28, 2021
devmotion deleted the dw/sinkhorn_unbalanced branch on May 28, 2021 07:32