I read the paper and have a question about how the reward is assigned to the extractor. The reasoner gets a reward of 1 if it reaches the correct target entity, and the intermediate rewards are 0. You mention that the extractor receives its reward from the reasoner step by step. But how can the reasoner give the extractor a reward at each step when the reasoner itself only receives a reward at the final step?
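To make the question concrete, here is a minimal sketch of the reward structure as I understand it. The function names, the path representation, and the discount factor `gamma` are my own assumptions for illustration and are not taken from the paper. The second function shows one standard way a terminal-only reward could still yield a per-step signal (discounted returns, as in REINFORCE); I do not know whether this is what the paper actually does.

```python
from typing import List


def sparse_rewards(path: List[str], target: str) -> List[float]:
    """Reasoner rewards as described above: 0 at every step,
    1 at the final step only if the path ends on the target entity."""
    rewards = [0.0] * len(path)
    if path and path[-1] == target:
        rewards[-1] = 1.0
    return rewards


def discounted_returns(rewards: List[float], gamma: float = 0.9) -> List[float]:
    """Hypothetical per-step signal: the discounted return G_t = r_t + gamma * G_{t+1}.
    This is an assumed mechanism, not one confirmed by the paper."""
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step accumulates the discounted future reward.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


if __name__ == "__main__":
    r = sparse_rewards(["e0", "e1", "e2"], target="e2")
    print(r)
    print(discounted_returns(r, gamma=0.5))
```

If the extractor is trained on something like these per-step returns, that would explain the step-wise reward. If it is something else, could you point me to the relevant part of the paper or code?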