Transformer matches can be too conservative #36

Open
RichJackson opened this issue Jun 10, 2024 · 0 comments

@RichJackson
Collaborator

Original comment from @EFord36

[The extendString() method in kazu/steps/ner/opsin.py reworks entity matches to compensate for Transformer model matches that tend to cover only part of an entity with a longer name, which perhaps indicates a need for more generic logic for handling Transformer model matches.]

@RichJackson my gut says maybe this logic should be wrapped up in the TransformersModelForTokenClassificationNerStep, like we have the NonContiguousEntitySplitter, so we don't get the match wrong to start with and then need to fix it here, much later in the pipeline. It would also mean either having opsin used to some extent by all users, or making it pretty flexibly configurable, or writing our own logic to decide when extending is reasonable.

What do you think?
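
To make the "writing our own logic" option a bit more concrete, here is a rough sketch of what a generic extension step might look like. This is not kazu's API and not what extendString() actually does: `extend_span`, `maybe_extend`, and the `is_valid` callback are hypothetical names, and the whitespace-boundary rule is just an assumption for illustration. The idea is to widen a partial match to the surrounding token and only keep the wider span if some pluggable check (opsin, a dictionary lookup, anything) accepts it.

```python
from typing import Callable, Tuple

# Hypothetical helper, not part of kazu: punctuation we assume should not
# end an entity when it was only picked up by the extension.
TRAILING_PUNCT = ".,;:"


def extend_span(text: str, start: int, end: int) -> Tuple[int, int]:
    """Widen the half-open span [start, end) to whitespace boundaries."""
    new_start, new_end = start, end
    # Walk left until the previous character is whitespace (or text start).
    while new_start > 0 and not text[new_start - 1].isspace():
        new_start -= 1
    # Walk right until the next character is whitespace (or text end).
    while new_end < len(text) and not text[new_end].isspace():
        new_end += 1
    # Drop sentence punctuation that the extension swept in.
    while new_end > end and text[new_end - 1] in TRAILING_PUNCT:
        new_end -= 1
    return new_start, new_end


def maybe_extend(
    text: str,
    start: int,
    end: int,
    is_valid: Callable[[str], bool],
) -> Tuple[int, int]:
    """Return the widened span only if `is_valid` accepts its text."""
    new_start, new_end = extend_span(text, start, end)
    if (new_start, new_end) != (start, end) and is_valid(text[new_start:new_end]):
        return new_start, new_end
    # Otherwise keep the model's original match untouched.
    return start, end


text = "We added 2-chloroethanol to the flask."
# Suppose the model only tagged "chloro" (characters 11 to 17).
span = maybe_extend(text, 11, 17, is_valid=lambda s: s.endswith("ol"))
print(text[span[0]:span[1]])
```

Keeping the validity check as a callback is one way to avoid forcing opsin on every user: the NER step would own the extension, and each deployment could decide what counts as a reasonable extension.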
