
decoding trees? #3

Open · xinyue96 opened this issue May 6, 2021 · 1 comment

xinyue96 commented May 6, 2021

The paper says that a top-down greedy approach is used to decode the binary tree from the embedding.
However, from the code it looks like a bottom-up approach based on the angle similarity matrix is still used. Is that correct?

Say I want to obtain a hierarchical clustering of the nodes given the final Poincaré embeddings: is single-linkage agglomerative clustering on the angle similarity matrix between the normalized embedding points equivalent to the tree-decoding algorithm implemented in this code?
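For concreteness, the procedure I have in mind is roughly the sketch below (my own illustration using scipy, not code from this repo; the helper name `single_linkage_from_angles` is made up):

```python
import torch
from scipy.cluster.hierarchy import linkage
from scipy.spatial.distance import squareform

def single_linkage_from_angles(embeddings: torch.Tensor):
    # Normalize so that dot products become cosine (angle) similarities.
    normalized = embeddings / embeddings.norm(dim=-1, keepdim=True)
    sim = normalized @ normalized.t()
    # Convert similarities to distances for scipy's agglomerative clustering.
    dist = (1.0 - sim).clamp(min=0.0)
    dist.fill_diagonal_(0.0)
    # Single-linkage clustering on the condensed distance matrix.
    return linkage(squareform(dist.cpu().numpy(), checks=False), method="single")
```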

albertfgu (Contributor) commented

Hi,

The paper's theory is based on a bottom-up decoding approach using hyperbolic LCAs (Algorithm 1), but it also mentions that a top-down greedy approximation with angle similarity matrices can be used (Section 5). Although the bottom-up approach is slower in theory, we optimized it substantially, so in practice it is fast even on the largest graphs we work with.
The decoding code is invoked here:

parents = sl_from_embeddings(leaves_embeddings, sim_fn)

More specifically, Algorithm 1 is equivalent to single-linkage clustering where the similarities are given by hyperbolic LCA depths (the angle similarity matrix was a heuristic for the other, top-down approximation). These hyperbolic LCA depths are also monotonic in the plain dot product of the (Euclidean, unnormalized) embeddings, so we can use the simpler dot-product similarity function:

sim_fn = lambda x, y: torch.sum(x * y, dim=-1)

So your intuition is essentially right, except that you want the unnormalized (not normalized) dot-product similarity matrix.
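To make the equivalence concrete, here is a rough, unoptimized sketch of that single-linkage decode (my illustration only; the helper name `naive_single_linkage_decode` is made up, and the repo's `sl_from_embeddings` is implemented differently for speed):

```python
import torch
from scipy.cluster.hierarchy import linkage
from scipy.spatial.distance import squareform

def naive_single_linkage_decode(leaves_embeddings: torch.Tensor):
    # Unnormalized Euclidean dot products, monotonic in hyperbolic LCA depth.
    sim = leaves_embeddings @ leaves_embeddings.t()
    # scipy merges the smallest distances first, while we want to merge the
    # most similar pairs first, so flip the sign (the shift keeps distances >= 0).
    dist = sim.max() - sim
    dist.fill_diagonal_(0.0)
    return linkage(squareform(dist.cpu().numpy(), checks=False), method="single")
```

Since single linkage depends only on the ordering of pairwise distances, any monotone transformation of the similarities (negation plus a shift here) yields the same merge order, which is why the plain dot product suffices.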

Hope this helps!
