(My apologies for originally posting a similar question under Issues, which I have since closed)
I am trying to train a grammatical error correction model with transformers: the inputs are English sentences, some of which contain grammatical errors, and the output is a corrected version of each sentence if it contains errors, or simply the same sentence as the input if it has none. The problem I'm facing is that the greedy strategy in the decoder produces outputs whose word tokens and vocabulary are completely different from the inputs, since the (_n_+1)th word prediction is conditioned on the _n_th word prediction, and prediction errors compound as the output sentence is generated one word at a time.
It seems that I would want to base the _n_th prediction on the _n_th input token, not on the decoder's previous prediction. Could someone please point me to how I can disable the greedy decoder in this case? I have looked at sequence_decoders.py and recurrent_modules.py, but I am not sure what exactly to modify. Many, MANY thanks in advance!!
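To make the question concrete, here is a rough sketch of the two behaviours as I understand them, in plain PyTorch. This is **not** this library's API: the `model(src_ids, decoder_input)` call, its signature, and the helper names are hypothetical and for illustration only. The first function is free-running greedy decoding, which feeds each prediction back in; the second conditions every position on given tokens (the gold targets during training, or the copied input sentence in my case), which I believe is usually called teacher forcing.

```python
import torch
import torch.nn as nn

# Illustration only -- a toy encoder-decoder "model" is assumed to be a
# callable returning per-position vocabulary logits of shape
# (batch, decoder_len, vocab). The signature is hypothetical.

def greedy_decode(model, src_ids, bos_id, max_len):
    """Free-running decoding: each step feeds the model's own previous
    prediction back in, so early mistakes compound over the sentence."""
    generated = torch.full((src_ids.size(0), 1), bos_id, dtype=torch.long)
    for _ in range(max_len):
        logits = model(src_ids, generated)                 # (batch, cur_len, vocab)
        next_id = logits[:, -1, :].argmax(dim=-1, keepdim=True)
        generated = torch.cat([generated, next_id], dim=1)
    return generated[:, 1:]                                # drop the BOS column

def teacher_forced_step(model, src_ids, tgt_ids, bos_id, pad_id):
    """Teacher-forced step: the decoder is conditioned on the provided
    tokens at every position, so prediction n never depends on the
    model's own prediction n-1."""
    bos = torch.full((tgt_ids.size(0), 1), bos_id, dtype=torch.long)
    decoder_input = torch.cat([bos, tgt_ids[:, :-1]], dim=1)   # shift right
    logits = model(src_ids, decoder_input)                     # (batch, tgt_len, vocab)
    loss = nn.functional.cross_entropy(
        logits.reshape(-1, logits.size(-1)),
        tgt_ids.reshape(-1),
        ignore_index=pad_id,
    )
    return logits, loss
```

The second function is roughly the behaviour I would like from the decoder here, rather than the first.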