I'm working on a node classification problem which estimates the conditional probability $P(N=i \mid G, q)$ where:
$G$ represents graph data (including node matrix);
$q \in \mathbb{R}^n$ is an external query parameter affecting the classification probabilities (it could be seen as a global graph attribute or just contextual information).
This is an inductive learning setup where each training instance consists of a temporal, fully-connected graph $G$ with $n$ nodes. The tensor dimensions are:
Node attributes: (N, C, T) where N=nodes, C=channels, T=time steps
Query parameter: (P, N=1, F, T=1) where P=number of query points, F=query feature dimension
P varies during both training and inference: it is usually < 10 during training but can reach thousands during inference.
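For concreteness, a minimal sketch of these shapes (all sizes below are placeholders, not my real data):

```python
import torch
from torch_geometric.utils import dense_to_sparse

# Placeholder sizes for illustration only
N, C, T = 50, 16, 12   # nodes, channels, time steps
P, F = 8, 4            # query points, query feature dimension

x = torch.randn(N, C, T)       # node attributes
q = torch.randn(P, 1, F, 1)    # query parameter, broadcastable over nodes and time

# fully-connected graph without self-loops, edge_index of shape (2, N*(N-1))
adj = torch.ones(N, N) - torch.eye(N)
edge_index, _ = dense_to_sparse(adj)
```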
I am wondering how I can incorporate $q$ into my model in an efficient way.
So far, I tried to create P artificial copies of the graph $G$ (including adjusting `edge_index`) and concatenate q to x as recommended here, obtaining a new x of shape (P*N, C+F, T) after concatenation. While broadcasting the copies doesn't occupy additional memory by itself, memory usage blew up during the pass through the GATv2Conv layer (probably due to the multiplied number of edges and the associated computations). Therefore, this is an option when P is small but is prohibitive when P goes into the thousands. Interestingly, during testing it turned out that concatenating q to x before the graph layer was beneficial compared to concatenating it after.
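Roughly, what I did looks like this (a sketch with placeholder sizes; the GATv2Conv pass itself is omitted):

```python
import torch

# Placeholder sizes; edge_index stands in for the real fully-connected graph
N, C, T, F, P = 50, 16, 12, 4, 8
x = torch.randn(N, C, T)
q = torch.randn(P, F)                          # one query vector per query point
adj = torch.ones(N, N) - torch.eye(N)
edge_index = adj.nonzero().t()                 # (2, N*(N-1)), no self-loops
E = edge_index.size(1)

# Replicate node features P times -> (P*N, C, T)
x_rep = x.unsqueeze(0).expand(P, N, C, T).reshape(P * N, C, T)

# Broadcast each query over "its" copy of the graph and over time -> (P*N, F, T)
q_rep = q.view(P, 1, F, 1).expand(P, N, F, T).reshape(P * N, F, T)

# Concatenate along the channel dimension -> (P*N, C+F, T)
x_cat = torch.cat([x_rep, q_rep], dim=1)

# Offset edge_index per copy so the P graphs form one big disconnected batch -> (2, P*E)
offsets = (torch.arange(P) * N).repeat_interleave(E)
edge_index_rep = edge_index.repeat(1, P) + offsets
```

The message passing then runs over P*E edges, which is where the memory explodes for large P.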
Alternatively, I'm considering whether representing it as a `HeteroData` object, with $q$ forming a set of P nodes of a second type, could be a solution here. My understanding is that this will add P*N additional edges as well, but still significantly fewer than with the explicit copying described above. Could this be the way to go for this type of problem?
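Something along these lines (again just a sketch with placeholder names and sizes; the time dimension is folded into the channels here only for simplicity):

```python
import torch
from torch_geometric.data import HeteroData

# Placeholder sizes
N, C, T, F, P = 50, 16, 12, 4, 8

data = HeteroData()
data['graph_node'].x = torch.randn(N, C * T)   # time folded into channels just for this sketch
data['query'].x = torch.randn(P, F)

# Original fully-connected edges between graph nodes (no self-loops)
adj = torch.ones(N, N) - torch.eye(N)
data['graph_node', 'to', 'graph_node'].edge_index = adj.nonzero().t()

# P*N bipartite edges: every query node sends a message to every graph node
src = torch.arange(P).repeat_interleave(N)
dst = torch.arange(N).repeat(P)
data['query', 'informs', 'graph_node'].edge_index = torch.stack([src, dst])
```

As far as I understand, this could then be wrapped with `to_hetero()` or `HeteroConv`, with GATv2Conv handling the bipartite ('query', 'informs', 'graph_node') relation via tuple `in_channels`, but I'm not sure whether that is the intended pattern here.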
Or maybe I'm just overthinking this and there is a better way to perform this computation efficiently, besides simply letting the graph layer work on x alone and blending it with (embedded) q afterwards?
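For reference, the fallback I mean is something like this (a placeholder module, not my actual model), where the GNN runs once on x and the per-query cost only appears in the output head:

```python
import torch
import torch.nn as nn

class LateFusionHead(nn.Module):
    """Blend per-query embeddings with node embeddings after the GNN (placeholder sketch)."""
    def __init__(self, node_dim, query_dim, hidden_dim):
        super().__init__()
        self.query_mlp = nn.Sequential(nn.Linear(query_dim, hidden_dim), nn.ReLU())
        self.score = nn.Linear(node_dim + hidden_dim, 1)

    def forward(self, h, q):
        # h: (N, node_dim) node embeddings from a single GNN pass (independent of P)
        # q: (P, query_dim) query parameters
        q_emb = self.query_mlp(q)                              # (P, hidden_dim)
        h_exp = h.unsqueeze(0).expand(q_emb.size(0), -1, -1)   # (P, N, node_dim), view only
        q_exp = q_emb.unsqueeze(1).expand(-1, h.size(0), -1)   # (P, N, hidden_dim), view only
        return self.score(torch.cat([h_exp, q_exp], dim=-1)).squeeze(-1)  # (P, N) logits
```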