
Bot has a nasty habit of talking in third person #50

Open
leahcornelius opened this issue Sep 10, 2020 · 1 comment

Comments

@leahcornelius

Hello all,
I would like to start by thanking the DialoGPT team for open-sourcing this - it has greatly aided my work. I have been playing with the decoder provided by polakowo while working on a long-range memory solution for pretrained generative transformers, so I have been getting my head around the system. It works well overall, but it often either repeats what I say until I clear the chat history, or it talks about itself in the third person. Here is an example of both.
Third person issue:

Chloe, 
So, you're a cat?

Leo Cornelius,
lol no

Chloe, 
Well, I guess so.

Leo Cornelius, 
are you a cat?

Chloe, 
She shrugs

Leo Cornelius, 
xD

Chloe, 
She shrugs

Leo Cornelius,
Do you like snow?

Chloe, 
she shrugs

Leo Cornelius,
do you ever do anything other than shrug?

Chloe,
she shrugs

repeating issue:

Chloe,
It is, my son.

Leo Cornelius, 
your son?!

Chloe, 
Your son is my son

Leo Cornelius,
i have a son?

Chloe, 
Your son has a son?

Leo Cornelius, 
maybe...

Chloe, 
you have a son?

Leo Cornelius,
no

Chloe,
no you must

Leo Cornelius,
well i dont

Chloe, 
no you must

Leo Cornelius,
Do you have a son?

Chloe,
no son

Leo Cornelius, 
do you want one?

Chloe,
no he wants my son

Leo Cornelius, 
who?

Chloe, 
who wants a son

(apologies for the voluminous data)
I do understand that a well-known issue with transformers is that they tend to imitate the person they are talking to, and this has been discussed in detail in #45. I have tried those methods, but all they achieve is stopping the bot from repeating the exact same thing over and over; it just modifies the repetition slightly ("i am" becomes "they are", "they want", etc.). I was wondering if you have experimented with nucleus filtering and have recommended top-k, top-p, and temperature values? I am currently using 50, 0.95, and 0.7 (respectively).
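For context, here is a minimal plain-Python sketch (not the polakowo decoder itself) of how these three knobs interact when choosing the next token: temperature reshapes the logit distribution, then top-k and top-p jointly restrict which tokens sampling may pick. The function name and list-based representation are my own, for illustration only.

```python
import math

def filter_next_token_probs(logits, top_k=50, top_p=0.95, temperature=0.7):
    """Return a renormalized {token_id: prob} dict after temperature,
    top-k, and top-p (nucleus) filtering. Defaults mirror the values
    quoted above; they are illustrative, not recommendations."""
    # Temperature: values < 1 sharpen the distribution before filtering.
    scaled = [l / temperature for l in logits]
    # Numerically stable softmax.
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Walk tokens in descending probability, keeping at most top_k and
    # stopping once their cumulative mass reaches top_p.
    order = sorted(range(len(probs)), key=lambda i: -probs[i])
    kept, cumulative = [], 0.0
    for i in order[:top_k]:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    # Renormalize over the surviving tokens before sampling.
    mass = sum(probs[i] for i in kept)
    return {i: probs[i] / mass for i in kept}
```

A lower top-p or top-k makes the output more conservative (and more repetitive); a higher temperature does the opposite, which is why these three are usually tuned together.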

Regarding the third-person issue: do you know of this issue? Any suggestions for fixing it? Would it be possible to modify the decoder to apply a penalty for talking in the third person? I've seen that there is a repetition penalty in the Hugging Face transformers decoder and was wondering if something similar (but for person) could be used?
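To illustrate the idea, here is a rough sketch of such a "person penalty" applied directly to the next-token logits, using the same sign convention as the CTRL-style repetition penalty (positive logits are divided by the penalty, negative ones multiplied, so the token always becomes less likely). The function name, trigger list, and penalty value are all hypothetical; in practice you would map the trigger words through your tokenizer's vocabulary.

```python
def apply_person_penalty(logits, vocab, triggers=("she", "he", "her", "his"),
                         penalty=1.5):
    """Down-weight third-person trigger tokens before sampling.

    `vocab` maps token strings to ids. `triggers` and `penalty` are
    illustrative values, not anything shipped with DialoGPT."""
    out = list(logits)
    for word in triggers:
        token_id = vocab.get(word)
        if token_id is not None:
            # Divide positive logits, multiply negative ones, so the
            # penalized token is always made less probable.
            if out[token_id] > 0:
                out[token_id] /= penalty
            else:
                out[token_id] *= penalty
    return out
```

This would run once per decoding step, on the logits for the next token, before the top-k/top-p filtering and sampling.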

Again, apologies for the voluminous data and the large issue. I thought putting it all in one issue would be better than lots of smaller ones.
Thanks, Leo Cornelius

@dreasysnail
Contributor

Hi, thanks for your interest. For your reference, I usually use top-k, top-p, and temperature values of 10, 0.9, and 0.9. I haven't really compared all configurations, but this seems robust to me. As for the third-person issue, I personally have not seen many generation cases talking in the third person. The more prominent behavior, though, is that the model tends to use the first person when greedy/beam decoding is employed. Have you used your own data to fine-tune the model? If so, it may be a dataset bias. If that does not solve your problem, I think what you suggested makes sense -- penalizing the common trigger words for third person. I would be curious about whether this simple strategy helps :)
