Best usage of Keyterms (Proper Names) #653

Open
johnml1135 opened this issue Feb 11, 2025 · 0 comments

This issue is to research how best to incorporate Proper Names (and keyterms more broadly) into the AI drafting flow. Currently, all Proper Names (type PN) are included in training with the same weight, no matter how much or how little other training data is being used. A few points of research could be:

A better Keyterm metric

We need an easy way to measure how effective different means of adding keyterms are and how well the keyterms are used in the output. These metrics should account for:

  • Whether the proper keyterms are used in the proper places
  • Whether the BLEU score has been degraded
  • If possible, different surface forms of the same word

This metric could be assessed alongside BLEU and called Keyterm Accuracy. It would be:

  • KeytermAccuracy = (The number of keyterm instances that occur in the correct verse, the correct number of times) / (The total number of keyterm instances)
  • Extra occurrences of a keyterm, where it should not be repeated, should penalize the metric.
  • If the keyterm does not appear exactly, a Levenshtein distance should be applied: if a word of "sufficient closeness" appears, it should be considered a proper match.
  • If there are multiple translations of the same term, any one of the proper translations occurring in the verse should be counted as a proper match.
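
To make the proposal concrete, here is a minimal Python sketch of one possible reading of Keyterm Accuracy. Everything in it is an assumption for illustration, not an existing API: the names `KeytermInstance`, `keyterm_accuracy` and `is_close`, the single-word tokenization, the 0.25 normalized-distance threshold standing in for "sufficient closeness", and the choice to give a verse zero credit whenever the observed count differs from the expected count (which is how over-use is penalized here).

```python
import re
from dataclasses import dataclass
from typing import Dict, List, Tuple


def levenshtein(a: str, b: str) -> int:
    """Dynamic-programming edit distance between two strings."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,                       # deletion
                current[j - 1] + 1,                    # insertion
                previous[j - 1] + (char_a != char_b),  # substitution
            ))
        previous = current
    return previous[-1]


@dataclass(frozen=True)
class KeytermInstance:
    """Hypothetical record: one keyterm expected in one verse."""
    verse_ref: str                # e.g. "GEN 12:1"
    renderings: Tuple[str, ...]   # accepted target-language translations
    expected_count: int           # how often the term should occur


def is_close(token: str, rendering: str, max_ratio: float) -> bool:
    """True if the normalized edit distance is within the threshold."""
    distance = levenshtein(token.lower(), rendering.lower())
    return distance / max(len(token), len(rendering), 1) <= max_ratio


def keyterm_accuracy(instances: List[KeytermInstance],
                     draft: Dict[str, str],
                     max_ratio: float = 0.25) -> float:
    """Share of expected keyterm instances rendered the correct number of times."""
    total = sum(inst.expected_count for inst in instances)
    if total == 0:
        return 0.0
    correct = 0
    for inst in instances:
        tokens = re.findall(r"\w+", draft.get(inst.verse_ref, ""))
        # A token counts if it is close enough to ANY accepted rendering.
        observed = sum(
            any(is_close(tok, r, max_ratio) for r in inst.renderings)
            for tok in tokens
        )
        # Too few or too many occurrences both earn no credit.
        if observed == inst.expected_count:
            correct += inst.expected_count
    return correct / total


instances = [
    KeytermInstance("GEN 12:1", ("Abram",), 1),
    KeytermInstance("GEN 12:5", ("Sarai", "Saray"), 1),
    KeytermInstance("GEN 12:10", ("Egypt",), 1),
]
draft = {
    "GEN 12:1": "The LORD said to Abrama, go",     # near match (distance 1)
    "GEN 12:5": "Abram took Saray his wife",       # alternate rendering
    "GEN 12:10": "Egypt Egypt was where he went",  # duplicated keyterm
}
score = keyterm_accuracy(instances, draft)
```

In this toy draft, the first two verses are credited and the duplicated keyterm in the third is not, so two of the three instances count. A softer penalty for duplicates, or matching on multi-word renderings, would need a different scoring rule than the one assumed here.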

Potential improvements:

Ways to improve the usage of keyterms without degrading the BLEU and chrF++ scores include:

  • Guided decoding and similar methods: Guided Decoding #178
  • "Enhanced" guided decoding: Guided decoding with "enhanced" keyterms #652
  • Only including the PNs for the books we are running inference on
  • Only including the PNs for the books we are running inference on that are not already in the training data
  • Including more than just PNs: can we include more terms, and does that improve the results?
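
The two PN-filtering options above could be sketched as a simple set filter. This is only an illustration under stated assumptions: the function `select_proper_names`, the `term_books` mapping (term to the set of book IDs it occurs in), and the idea of treating a term as "already in the training data" when it occurs in any training book are all hypothetical and not taken from the existing code.

```python
from typing import Dict, Iterable, List, Set


def select_proper_names(term_books: Dict[str, Set[str]],
                        inference_books: Iterable[str],
                        training_books: Iterable[str] = ()) -> List[str]:
    """Pick the PNs worth adding for a given drafting job.

    With no training_books, keeps every PN that occurs in a book being
    drafted. With training_books, additionally drops PNs that already
    occur in a book used for training.
    """
    inference = set(inference_books)
    training = set(training_books)
    selected = []
    for term, books in term_books.items():
        if not books & inference:
            continue  # never occurs in a book we will draft
        if books & training:
            continue  # assumed to be covered by the training data
        selected.append(term)
    return sorted(selected)


# Hypothetical occurrence data, for illustration only.
term_books = {
    "Abram": {"GEN"},
    "Moses": {"EXO", "MAT"},
    "Paul": {"ACT"},
    "Herod": {"MAT"},
}
for_inference_only = select_proper_names(term_books, ["MAT"])
unseen_only = select_proper_names(term_books, ["MAT"], training_books=["EXO"])
```

Here `for_inference_only` keeps both PNs that occur in MAT, while `unseen_only` drops "Moses" because it also occurs in a training book. Whether occurrence in a training book is a good enough proxy for "the model has learned this name" is itself one of the open questions.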