Training a relation extraction model with span categorization - MemoryError #12974

I think you're running into this problem because the spancat component starts out randomly initialized (untrained) and can produce nonsense, like annotating every single n-gram as an entity. That flood of candidate spans overwhelms the relation extraction component that follows it.

Instead, try training spancat separately first until its performance is reasonably good, and then use `source` to include the trained tok2vec and spancat in the relation extraction config, similar to this example: https://spacy.io/usage/training#annotating-components. If spancat relies on tok2vec, you'll need to list both tok2vec and spancat as annotating components in this pipeline.

You can experiment with whether it works better to continue training spancat
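As a rough sketch of what the relation extraction config could look like, assuming the separately trained pipeline was saved to a hypothetical path `training/spancat/model-best` and that the relation component is registered under the factory name `relation_extractor` (as in the spaCy relation extraction tutorial project; adjust to whatever your own code registers):

```ini
[nlp]
lang = "en"
pipeline = ["tok2vec","spancat","relation_extractor"]

# Reuse the already trained components instead of initializing them randomly.
# The path is a placeholder for wherever your spancat pipeline was saved.
[components.tok2vec]
source = "training/spancat/model-best"

[components.spancat]
source = "training/spancat/model-best"

# Assumed factory name; the model settings for this component are omitted here.
[components.relation_extractor]
factory = "relation_extractor"

[training]
# Run these components on the training docs so that the relation extractor
# sees the spans predicted by spancat.
annotating_components = ["tok2vec","spancat"]
# Optional: keep the sourced components fixed. Remove this line to continue
# training them together with the relation extractor.
frozen_components = ["tok2vec","spancat"]
```

Whether `frozen_components` helps is something to compare empirically; with it removed, spancat keeps being updated during relation extraction training.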

Answer selected by Racana
Labels: training (Training and updating models), feat / spancat (Feature: Span Categorizer), feat / rel (Feature: Relation Extractor)