According to BERT's architecture, the pre-training loss is the sum of the mean masked LM likelihood and the mean next sentence prediction likelihood. Does this implementation include the next sentence prediction loss when calculating the loss? Does the use of the '[SEP]' tag have any effect on the training loss?
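For context, here is a minimal PyTorch sketch of the combined objective described above, i.e. summing a mean masked-LM cross-entropy and a mean next-sentence-prediction cross-entropy. The tensor names and shapes are hypothetical stand-ins, not taken from this repository's code.

```python
import torch
import torch.nn.functional as F

# Hypothetical shapes: batch of 2 sequences, length 8, BERT-base vocab size.
batch_size, seq_len, vocab_size = 2, 8, 30522

# Stand-ins for model outputs; a real model would produce these from the
# encoder's final hidden states and the pooled [CLS] representation.
mlm_logits = torch.randn(batch_size, seq_len, vocab_size)  # per-token vocab scores
nsp_logits = torch.randn(batch_size, 2)                    # is-next / not-next scores

# Labels: -100 marks non-masked positions so cross_entropy ignores them.
mlm_labels = torch.full((batch_size, seq_len), -100)
mlm_labels[:, 3] = torch.randint(0, vocab_size, (batch_size,))  # one masked token each
nsp_labels = torch.randint(0, 2, (batch_size,))

# Mean masked-LM loss over masked positions only.
mlm_loss = F.cross_entropy(
    mlm_logits.view(-1, vocab_size), mlm_labels.view(-1), ignore_index=-100
)
# Mean next-sentence-prediction loss over the batch.
nsp_loss = F.cross_entropy(nsp_logits, nsp_labels)

# BERT's pre-training objective: sum of the two mean losses.
total_loss = mlm_loss + nsp_loss
print(total_loss)
```

Whether this particular implementation adds the NSP term is exactly what the question asks; the sketch only illustrates the objective from the BERT paper.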