Logic behind score() in bart_score.py #47

zwxu064 · 2024-12-09T03:49:19Z

Hi, I am wondering how well this model handles logical reasoning. Suppose my source and target are:
source: ["I have one sister and one brother"]
target: [["I have no siblings", "I have 2 siblings", "I have three siblings", "my parents have 3 kids"]]

The BARTScores are: [[-3.561114549636841], [-3.4354922771453857], [-3.130859136581421], [-4.85980749130249]]

This corresponds to the quality ranking "I have three siblings" > "I have 2 siblings" > "I have no siblings" > "my parents have 3 kids".
Clearly, this is wrong.

Thanks.
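
For reference, the ranking implied by the scores reported above can be reproduced with a short sketch. It only sorts the numbers quoted in this comment and does not call the model; the variable names are illustrative, and it assumes a higher (less negative) score means the candidate is preferred.

```python
# Candidates and scores copied from the report above (no model is loaded here).
candidates = [
    "I have no siblings",
    "I have 2 siblings",
    "I have three siblings",
    "my parents have 3 kids",
]
scores = [
    -3.561114549636841,
    -3.4354922771453857,
    -3.130859136581421,
    -4.85980749130249,
]

# Assumption: a higher (less negative) score means the candidate is preferred.
ranking = sorted(zip(candidates, scores), key=lambda pair: pair[1], reverse=True)

for text, score in ranking:
    print(f"{score:.3f}  {text}")
```

The candidate that contradicts the source least obviously ("I have 2 siblings") lands second, while the false "I have three siblings" lands first.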

yyy-Apple · 2024-12-13T02:07:41Z

I reproduced this with the model trained on parabank2, although the numbers are slightly different from yours. When I change "I have 2 siblings" to "I have two siblings", the model scores that candidate much higher than the others. This suggests that the model did not see many numerals during training and is therefore unfamiliar with them.

Regarding the example "my parents have 3 kids": this requires a higher level of reasoning and cannot be considered a direct paraphrase of "I have one sister and one brother", even though both refer to the same fact. The model may therefore struggle with it.
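
If the digit-versus-word sensitivity matters for your use case, one possible workaround is to spell out small numerals before scoring. The sketch below is only an illustration of that idea: `spell_out_small_numbers` is a hypothetical helper, not part of bart_score.py, and it has not been validated against the model.

```python
import re

# Hypothetical lookup table: only the integers 0-10 are spelled out.
_SMALL_NUMBERS = {
    "0": "zero", "1": "one", "2": "two", "3": "three", "4": "four",
    "5": "five", "6": "six", "7": "seven", "8": "eight", "9": "nine",
    "10": "ten",
}

# Match standalone integers, skipping either side of a decimal such as "3.5".
_INTEGER = re.compile(r"(?<!\d\.)\b\d+\b(?!\.\d)")


def spell_out_small_numbers(text: str) -> str:
    """Replace standalone integers 0-10 with English words; leave others alone."""
    return _INTEGER.sub(
        lambda match: _SMALL_NUMBERS.get(match.group(), match.group()), text
    )


targets = ["I have 2 siblings", "my parents have 3 kids"]
normalized = [spell_out_small_numbers(t) for t in targets]
print(normalized)
```

The normalized strings can then be passed to the scorer in place of the originals. This only addresses the surface form of numbers; it does not help with the reasoning gap in the "my parents have 3 kids" example.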

zwxu064 · 2024-12-14T06:15:44Z

I agree. That means BARTScore is not intrinsically good at reasoning: even when the source context is given, the score for a target sentence sometimes disagrees with its actual meaning. For example, "2" and "two" mean the same thing, but only "two" is recognised, because no Arabic numerals are used in the source context.
