
decoding trees? #3

Open · xinyue96 opened this issue May 6, 2021 · 1 comment

xinyue96 commented May 6, 2021

The paper says that a top-down greedy approach is used to decode the binary tree from the embedding.
However, from the code it looks like a bottom-up approach based on the angle similarity matrix is still used. Is that correct?

Say I want to obtain a hierarchical clustering of the nodes given the final Poincaré embeddings: is single-linkage agglomerative clustering on the angle similarity matrix between the normalized embedding points equivalent to the tree-decoding algorithm implemented in this code?
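For concreteness, the procedure I have in mind is roughly the sketch below (my own illustration using scipy, not code from this repo; the helper name `single_linkage_from_angles` is made up):

```python
import torch
from scipy.cluster.hierarchy import linkage
from scipy.spatial.distance import squareform

def single_linkage_from_angles(embeddings: torch.Tensor):
    # Normalize so that dot products become cosine (angle) similarities.
    normalized = embeddings / embeddings.norm(dim=-1, keepdim=True)
    sim = normalized @ normalized.t()
    # Convert similarities to distances for scipy's agglomerative clustering.
    dist = (1.0 - sim).clamp(min=0.0)
    dist.fill_diagonal_(0.0)
    # Single-linkage clustering on the condensed distance matrix.
    return linkage(squareform(dist.cpu().numpy(), checks=False), method="single")
```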

albertfgu (Contributor) commented

Hi,

The paper's theory is based on a bottom-up decoding approach using hyperbolic LCAs (Algorithm 1), but it also mentions that a top-down greedy approximation with angle similarity matrices can be used (Section 5). Although the bottom-up approach is slower in theory, we optimized it substantially, so in practice it is fast even on the largest graphs we work with.
The decoding code is invoked here:

parents = sl_from_embeddings(leaves_embeddings, sim_fn)

More specifically, Algorithm 1 is equivalent to single-linkage clustering where the similarities are given by hyperbolic LCA depths (the angle similarity matrix was a heuristic for the other, top-down approximation). These hyperbolic LCA depths are also monotonic in the plain dot product of the (Euclidean, unnormalized) embeddings, so we can use the simpler dot-product similarity function:

sim_fn = lambda x, y: torch.sum(x * y, dim=-1)

So your intuition is essentially right, except that you want the unnormalized (not normalized) dot-product similarity matrix.
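To make the equivalence concrete, here is a rough, unoptimized sketch of that single-linkage decode (my illustration only; the helper name `naive_single_linkage_decode` is made up, and the repo's `sl_from_embeddings` is implemented differently for speed):

```python
import torch
from scipy.cluster.hierarchy import linkage
from scipy.spatial.distance import squareform

def naive_single_linkage_decode(leaves_embeddings: torch.Tensor):
    # Unnormalized Euclidean dot products, monotonic in hyperbolic LCA depth.
    sim = leaves_embeddings @ leaves_embeddings.t()
    # scipy merges the smallest distances first, while we want to merge the
    # most similar pairs first, so flip the sign (the shift keeps distances >= 0).
    dist = sim.max() - sim
    dist.fill_diagonal_(0.0)
    return linkage(squareform(dist.cpu().numpy(), checks=False), method="single")
```

Since single linkage depends only on the ordering of pairwise distances, any monotone transformation of the similarities (negation plus a shift here) yields the same merge order, which is why the plain dot product suffices.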

Hope this helps!
