Reorganize `sinkhorn_unbalanced`, improve convergence checks, and fix GPU issues #80

devmotion · 2021-05-27T18:57:55Z

This PR has a similar motivation to #79 and tries to fix some spurious test errors. I also took the opportunity to rewrite large parts of the `sinkhorn_unbalanced` implementation and to add more tests.

The PR

- adds `atol` and `rtol` options for a more fine-grained convergence analysis and deprecates `tol`, similar to the other PR,
- changes the convergence check from `0.5 * (err_a + err_b)`, where `err_a` and `err_b` are the relative distances w.r.t. the inf-norm between the last two iterates of `a` and `b` (same as in POT), to `isapprox(vcat(a, b), vcat(a_old, b_old); atol, rtol)` (implemented more efficiently), which is based on the 2-norm,
- adds a `convergence_check` argument to adjust how often convergence is checked,
- renames `max_iter` to `maxiter` to be consistent with `sinkhorn`,
- deprecates the keyword arguments `proxdiv_F1` and `proxdiv_F2` and instead uses multiple dispatch,
- optimizes the implementation of `sinkhorn_unbalanced` and in particular gets rid of all unnecessary allocations and branches,
- extends the tests.
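To illustrate the new convergence check: Julia's `isapprox` for arrays tests `norm(x - y) <= max(atol, rtol * max(norm(x), norm(y)))`, and the 2-norm of a concatenated vector can be accumulated block by block, so no `vcat` is needed. The following is only a minimal sketch of that idea; the helper name `scalings_converged` is made up for illustration and is not necessarily how the PR implements it.

```julia
# Hypothetical helper, for illustration only.
# Equivalent (up to floating-point rounding) to
#   isapprox(vcat(a, b), vcat(a_old, b_old); atol, rtol)
# but without allocating the concatenated vectors.
function scalings_converged(a, b, a_old, b_old; atol, rtol)
    # squared 2-norm of the difference of the concatenated iterates
    sqdiff = sum(abs2(x - y) for (x, y) in zip(a, a_old)) +
             sum(abs2(x - y) for (x, y) in zip(b, b_old))
    # squared 2-norms of the new and the old concatenated iterates
    sqnew = sum(abs2, a) + sum(abs2, b)
    sqold = sum(abs2, a_old) + sum(abs2, b_old)
    return sqrt(sqdiff) <= max(atol, rtol * sqrt(max(sqnew, sqold)))
end

a, b = [1.0, 2.0, 3.0], [4.0, 5.0]
a_old, b_old = a .+ 1e-9, b .- 1e-9
scalings_converged(a, b, a_old, b_old; atol=1e-6, rtol=0.0)
```

The generator-based sums are written for CPU arrays; on GPU arrays one would presumably use broadcasting or `mapreduce` instead to avoid scalar indexing.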

Of course, one could use a different norm in the convergence check; as mentioned in the other PR, all standard norms are equivalent. However, in contrast to the marginal check in `sinkhorn`, where one deals with probability measures, there is no intuitive motivation for using the 1-norm for the scaling factors. Moreover, it seems a bit surprising to weight the errors of `a` and `b` equally even though they usually have different dimensions.
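As for replacing the `proxdiv_F1` and `proxdiv_F2` keyword arguments with multiple dispatch, the idea can be sketched roughly as follows. This is a simplified illustration, not the code in this PR: the function name `unbalanced_sketch` is hypothetical, the loop runs a fixed number of iterations without any convergence check, and the KL proximal-division operator is assumed to have the standard form `(p / s)^(λ / (ε + λ))`.

```julia
# Hypothetical sketch; `unbalanced_sketch` is not part of OptimalTransport.jl.

# Method for real penalty weights: build the proximal-division operators of
# the KL penalties (assumed form) and forward them to the generic method.
function unbalanced_sketch(μ, ν, C, λ1::Real, λ2::Real, ε; kwargs...)
    proxdivF1(s, p, ε) = (p / s)^(λ1 / (ε + λ1))
    proxdivF2(s, p, ε) = (p / s)^(λ2 / (ε + λ2))
    return unbalanced_sketch(μ, ν, C, proxdivF1, proxdivF2, ε; kwargs...)
end

# Generic method: any callables `(s, p, ε) -> scaling` are accepted.
function unbalanced_sketch(μ, ν, C, proxdivF1, proxdivF2, ε; maxiter=100)
    K = exp.(-C ./ ε)                  # Gibbs kernel
    a = ones(eltype(K), length(μ))
    b = ones(eltype(K), length(ν))
    for _ in 1:maxiter                 # fixed iteration count, no convergence check
        a .= proxdivF1.(K * b, μ, ε)
        b .= proxdivF2.(K' * a, ν, ε)
    end
    return a .* K .* b'                # transport plan
end

μ = fill(0.5, 2)
ν = fill(1 / 3, 3)
C = [0.0 1.0 4.0; 1.0 0.0 1.0]
γ = unbalanced_sketch(μ, ν, C, 0.4, 0.5, 0.1)
```

Because the method with `λ1::Real, λ2::Real` is more specific, plain numbers select the KL case, while passing two functions goes straight to the generic method; no keyword arguments are needed to switch between the two.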

Edit: I just checked, and the rewrite also fixes GPU issues. Currently, one can't use `sinkhorn_unbalanced` with CUDA, but with this PR it is possible to just run:

julia> using CUDA, OptimalTransport

julia> M, N = 200, 250;

julia> mu = cu(fill(1/N, M));

julia> nu = cu(fill(1/N, N));

julia> C = cu(((x, y) -> abs2(x - y)).(rand(M), rand(1, N)));

julia> lambda1 = 0.4f0;

julia> lambda2 = 0.5f0;

julia> eps = 0.01f0;

julia> sinkhorn_unbalanced(mu, nu, C, lambda1, lambda2, eps)

coveralls · 2021-05-27T19:03:07Z

Pull Request Test Coverage Report for Build 883952337

Warning: This coverage report may be inaccurate.

This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.

For more information on this, see Tracking coverage changes with pull request builds.
To avoid this issue with future PRs, see these Recommended CI Configurations.
For a quick fix, rebase this PR at GitHub. Your next report should be accurate.

Details

52 of 52 (100.0%) changed or added relevant lines in 1 file are covered.
27 unchanged lines in 1 file lost coverage.
Overall coverage increased (+2.4%) to 88.364%

Files with Coverage Reduction	New Missed Lines	%
src/OptimalTransport.jl	27	88.36%

Totals
Change from base Build 883949332:	2.4%
Covered Lines:	243
Relevant Lines:	275

💛 - Coveralls

davibarreira · 2021-05-27T20:17:14Z

src/OptimalTransport.jl

-function sinkhorn_unbalanced2(μ, ν, C, λ1, λ2, ε; plan=nothing, kwargs...)
+function sinkhorn_unbalanced2(
+    μ, ν, C, λ1_or_proxdivF1, λ2_or_proxdivF2, ε; plan=nothing, kwargs...
+)


Shouldn't we rename to sinkhorn_unbalanced_cost?

Maybe. I still think the main problem is not the names of the functions but the number of functions: they all do the same thing, but for slightly different problems (many even for the same problem) and with different algorithms. So the natural approach would be to dispatch on both the problem and the algorithm, which would also solve the problem that #66 tries to address but doesn't fix in a general and extensible way.

In any case, I would suggest that both renaming and reorganization of functions should be done in a separate PR.

Yeah, you are right. It's better that we decide on the naming first and then submit it in another PR.

"I still think the main problem is not the names of the functions but the number of functions: they all do the same thing, but for slightly different problems (many even for the same problem) and with different algorithms." Why is this a problem?

I think it's a problem because, if we have a different function for every combination of problem and algorithm (e.g. `sinkhorn`, `sinkhorn_stabilized`, `sinkhorn_stabilized_epsscaling`, `sinkhorn_unbalanced`, etc.),

- the API is unstructured and difficult to navigate for users,
- it is very difficult to compose functionality (e.g. if I would like to use epsilon scaling with the unbalanced Sinkhorn algorithm, I have to write a new function instead of just composing epsilon scaling with the unbalanced algorithm),
- we completely neglect multiple dispatch, which arguably is the biggest feature of Julia.
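To make the composition argument concrete, here is a rough sketch of what dispatching on both the problem and the algorithm could look like. All names (`OTProblem`, `Balanced`, `Unbalanced`, `Sinkhorn`, `EpsScaling`, `solve`, `with_eps`, `transport_plan`) are hypothetical and only illustrate the idea; the solver is an unstabilized toy with a fixed number of iterations, and the unbalanced case assumes the same KL penalty weight `λ` for both marginals.

```julia
# Hypothetical design sketch; none of these names exist in OptimalTransport.jl.
abstract type OTProblem end

struct Balanced{V,M} <: OTProblem
    μ::V
    ν::V
    C::M
end

struct Unbalanced{V,M,T} <: OTProblem
    μ::V
    ν::V
    C::M
    λ::T   # weight of the KL marginal penalties (same for both marginals)
end

# Exponent of the scaling update: 1 for the balanced problem,
# λ / (λ + ε) for KL-penalized marginals.
update_exponent(::Balanced, ε) = one(ε)
update_exponent(p::Unbalanced, ε) = p.λ / (p.λ + ε)

struct Sinkhorn
    ε::Float64
    maxiter::Int
end
with_eps(alg::Sinkhorn, ε) = Sinkhorn(ε, alg.maxiter)

# Wraps an inner algorithm and runs it with ε * factor^k for k = steps, ..., 0.
struct EpsScaling{A}
    inner::A
    factor::Float64
    steps::Int
end

# One solver for every problem type; `g` is an optional warm start (dual potential).
function solve(p::OTProblem, alg::Sinkhorn; g=zero(p.ν))
    ε = alg.ε
    K = exp.(-p.C ./ ε)
    ρ = update_exponent(p, ε)
    a = ones(length(p.μ))
    b = exp.(g ./ ε)
    for _ in 1:alg.maxiter
        a = (p.μ ./ (K * b)) .^ ρ
        b = (p.ν ./ (K' * a)) .^ ρ
    end
    return (f=ε .* log.(a), g=ε .* log.(b))
end

# Epsilon scaling composes with any problem the inner algorithm can solve.
function solve(p::OTProblem, alg::EpsScaling)
    sol = (f=zero(p.μ), g=zero(p.ν))
    for k in alg.steps:-1:0
        ε = alg.inner.ε * alg.factor^k
        sol = solve(p, with_eps(alg.inner, ε); g=sol.g)
    end
    return sol
end

transport_plan(p, sol, ε) = exp.((sol.f .+ sol.g' .- p.C) ./ ε)

μ, ν = fill(0.5, 2), fill(1 / 3, 3)
C = [0.0 1.0 4.0; 1.0 0.0 1.0]
alg = EpsScaling(Sinkhorn(1.0, 200), 2.0, 2)
γ_bal = transport_plan(Balanced(μ, ν, C), solve(Balanced(μ, ν, C), alg), 1.0)
γ_unb = transport_plan(Unbalanced(μ, ν, C, 0.5), solve(Unbalanced(μ, ν, C, 0.5), alg), 1.0)
```

With such a design, epsilon scaling for the unbalanced problem needs no dedicated function: the `EpsScaling` method is written once and works for every problem type that the inner algorithm supports.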

codecov-commenter · 2021-05-27T22:43:45Z

Codecov Report

Merging #80 (31dc3f6) into master (dfcc088) will increase coverage by 2.42%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master      #80      +/-   ##
==========================================
+ Coverage   85.93%   88.36%   +2.42%     
==========================================
  Files           1        1              
  Lines         256      275      +19     
==========================================
+ Hits          220      243      +23     
+ Misses         36       32       -4

Impacted Files	Coverage Δ
src/OptimalTransport.jl	`88.36% <100.00%> (+2.42%)`	⬆️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

devmotion added 3 commits May 27, 2021 20:26

Reorganize sinkhorn_unbalanced and improve convergence checks

4aa431c

Extend tests

55a6edd

Bump version

7b5e2cf

devmotion requested review from zsteve and davibarreira May 27, 2021 18:58

github-actions bot reviewed May 27, 2021

test/runtests.jl (outdated)

Fix format

b50289c

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

devmotion changed the title ~~Reorganize sinkhorn_unbalanced and improve convergence checks~~ Reorganize sinkhorn_unbalanced, improve convergence checks, and fix GPU issues May 27, 2021

davibarreira approved these changes May 27, 2021

devmotion and others added 5 commits May 28, 2021 00:33

Remove redundant information

cb14d9e

Explain rtol and atol more clearly

ebee41d

Merge branch 'dw/sinkhorn_unbalanced' of github.com:zsteve/OptimalTransport.jl into dw/sinkhorn_unbalanced

6acb85e

Use term entropically regularized OT

eabbaad

Merge branch 'master' into dw/sinkhorn_unbalanced

85f1c30

devmotion and others added 4 commits May 28, 2021 02:01

Merge branch 'master' into dw/sinkhorn_unbalanced

c587490

Update Project.toml

980e9ab

Fix docstring

9585ccd

More docstring fixes

31dc3f6

zsteve approved these changes May 28, 2021

devmotion merged commit 2eee9f4 into master May 28, 2021

devmotion deleted the dw/sinkhorn_unbalanced branch May 28, 2021 07:32
