Hi,

After going through this code and the RNNLogic paper, we have a few questions about the implementation uploaded here. In what follows, the "old code" refers to the code inside the codes folder of the repo, which contains the implementation of RNNLogic, and the "new code" refers to the code inside the RNNLogic+ folder, which contains implementations of both RNNLogic and RNNLogic+.
The old code trains a separate model for each relation, which implies that all EM iterations for one relation finish before the next relation begins. There is a common rule generator for all relations, but a separate predictor is trained for each relation serving as a rule head. The new code, on the other hand, trains on all relations together for both the RNNLogic and RNNLogic+ parts of the model: there is a single rule generator as well as a single predictor shared across all relations, and the training batches intersperse relations. Could you please confirm whether these two approaches are equivalent?
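To make the scheduling difference we are describing concrete, here is a minimal, self-contained sketch of the two training loops as we understand them; all names below are our own placeholders, not identifiers from either codebase:

```python
# Hypothetical sketch of the two training schedules (names are ours, not from the repo).
relations = ["born_in", "lives_in"]        # toy relation set
em_iterations = 3

def e_step(relation_batch):                # placeholder: train the predictor, score rules
    print(f"  E-step on {relation_batch}")

def m_step(relation_batch):                # placeholder: update the rule generator
    print(f"  M-step on {relation_batch}")

# Old code: EM runs to completion for one relation before the next relation starts,
# with a separate predictor per relation (not shown here).
for r in relations:
    print(f"Relation {r}")
    for _ in range(em_iterations):
        e_step([r])
        m_step([r])

# New code: every EM iteration sees batches with interspersed relations,
# and a single predictor is shared across all relations.
for _ in range(em_iterations):
    e_step(relations)
    m_step(relations)
```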
Different optimization objectives are used to train the RNNLogic predictor in the old and new code. The old code appears to use a form of negative sampling (lines 634-644 in model_rnnlogic.py), whereas the new code uses a different objective (lines 86-96 in trainer.py). Since the loss functions differ, can we expect the same behavior from both versions of the code?
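For reference, this is the generic shape of the two loss styles we are contrasting. It reflects only our reading of the two files, and the exact formulations there may well differ:

```python
# Hedged sketch of the two loss styles in question (generic forms only).
import torch
import torch.nn.functional as F

scores = torch.randn(4, 100)          # toy predictor scores: (batch, num_candidate_entities)
target = torch.randint(0, 100, (4,))  # index of the true tail entity for each query

# Negative-sampling style (what the old code appears to use): push the positive
# score up and a handful of sampled negative scores down.
pos = scores.gather(1, target.unsqueeze(1))          # (batch, 1) positive scores
neg_idx = torch.randint(0, 100, (4, 8))              # 8 random negatives per query
neg = scores.gather(1, neg_idx)                      # (batch, 8) negative scores
ns_loss = -(F.logsigmoid(pos).mean() + F.logsigmoid(-neg).mean())

# Full-softmax cross-entropy (our reading of the new code): normalize over all
# candidate entities and maximize the likelihood of the true answer.
ce_loss = F.cross_entropy(scores, target)

print(ns_loss.item(), ce_loss.item())
```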
In the old code, several passes are made over the training data while training the predictor, after which the top-k rules with the best scores under the H(rule) metric are passed to the generator, and the generator is trained to generate those rules. In the RNNLogic implementation in the new code, on the other hand, only one pass is made over the training data to train the predictor in a given EM iteration, and H(rule) is computed for all rules and used to train the generator on all of them. Could you please confirm whether these approaches are equivalent?
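A toy illustration of the rule-selection difference we are asking about (the H values below are made up, and the actual H(rule) computation in the repo is of course more involved):

```python
# Made-up rule scores, purely for illustration.
rule_scores = {"r1(x,z) ^ r2(z,y)": 0.9,
               "r3(x,z) ^ r4(z,y)": 0.4,
               "r5(x,y)":           0.1}

# Old code (our reading): after several predictor passes, keep only the top-k rules
# and train the generator to generate exactly those.
k = 2
top_k_rules = sorted(rule_scores, key=rule_scores.get, reverse=True)[:k]
print("generator targets (old):", top_k_rules)

# New code (our reading): one predictor pass per EM iteration, and the generator is
# trained on all rules, weighted by their H(rule) values.
weighted_rules = list(rule_scores.items())
print("generator targets (new):", weighted_rules)
```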
Could you also explain the significance of the pseudo-groundings used in the old code? The paper does not discuss the advantage of using them, and the new code does not use them either. Are they simply a heuristic for producing a score when, for a given query head and rule, no groundings are found using the rules produced by the generator? Do you have results showing how the presence or absence of pseudo-groundings affects the final performance of the model?
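To clarify what we mean by "no groundings found", here is a toy example; the actual pseudo-grounding mechanism in the old code may work quite differently:

```python
# Toy knowledge graph: (head entity, relation) -> list of tail entities.
edges = {("alice", "born_in"): ["paris"], ("paris", "located_in"): ["france"]}

def ground(rule_body, head_entity):
    # Walk the rule body relation by relation, collecting reachable entities.
    frontier = [head_entity]
    for relation in rule_body:
        frontier = [t for h in frontier for t in edges.get((h, relation), [])]
    return frontier

print(ground(["born_in", "located_in"], "alice"))  # ['france'] -> the rule has a grounding
print(ground(["born_in", "capital_of"], "alice"))  # []         -> no grounding; is this the
                                                   # case where pseudo-groundings supply a score?
```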
The code for RNNLogic+ with embeddings (which produces the best results reported in the paper) is not present in the GitHub repository: neither the RotatE scores nor the knowledge graph embeddings used in the RNNLogic+ scoring function have been implemented. Will you be releasing the code for this version of the model in the future?
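For context, this is our reading of the combined scoring described in the paper for RNNLogic+ with embeddings, i.e. a rule-based score plus a weighted RotatE term. All names and the exact form below are assumptions on our part:

```python
# Hedged sketch: score(t) = score_rules(t) + eta * score_RotatE(h, r, t).
import math
import torch

dim, num_entities, eta, gamma = 8, 50, 0.5, 6.0
ent_re = torch.randn(num_entities, dim)       # real part of entity embeddings
ent_im = torch.randn(num_entities, dim)       # imaginary part of entity embeddings
rel_phase = torch.rand(dim) * 2 * math.pi     # relation as a rotation in the complex plane

def rotate_score(h_idx):
    # RotatE: gamma - ||h o r - t||, where r is a unit-modulus complex rotation.
    hr_re = ent_re[h_idx] * torch.cos(rel_phase) - ent_im[h_idx] * torch.sin(rel_phase)
    hr_im = ent_re[h_idx] * torch.sin(rel_phase) + ent_im[h_idx] * torch.cos(rel_phase)
    dist = torch.sqrt((hr_re - ent_re) ** 2 + (hr_im - ent_im) ** 2).sum(dim=-1)
    return gamma - dist                       # one score per candidate tail entity

rule_scores = torch.randn(num_entities)       # placeholder for the RNNLogic+ rule-based score
combined = rule_scores + eta * rotate_score(h_idx=0)
print(combined.shape)                         # torch.Size([50])
```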
Could you also provide the specifications of the environment in which you performed your experiments, along with training times? Our GPUs run out of memory when we try to run the old code with pseudo-groundings turned on. It would also be helpful if you could share trained models (dumped to a pickle, or even the final .pth files in the workspace folder generated after a run of the old code), or the final set of high-quality rules obtained after training RNNLogic on the datasets mentioned in the paper with optimal hyperparameters.
We are looking forward to your response to the queries mentioned above.
Thanks in advance!
Regards,
Ananjan Nandi
Navdeep Kaur