Update NER model with huggingface transformer #12914
-
I'm working on training a `ner` (named entity recognition) model using a Hugging Face transformer. Now I'm trying to continue training the model with another dataset. From what I've learned (from here), I need to keep the `transformer` layer frozen and just focus on updating the `ner` part, which I've also done (see #11547). Training with the frozen transformer works fine, but now I want to update NER with another dataset that I have. In other words, I want to update NER regularly as soon as I have a new dataset to train it on.

[screenshot of the training config]

Should I only swap in the new datasets and continue training with the config shown in the screenshot, or do I have to build the pipeline again for each training set?
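(For reference, the freezing described in #11547 typically comes down to a couple of lines in the training config; this is a rough sketch, not the exact config from the screenshot:)

```ini
[training]
# Update only the ner component; keep the transformer weights fixed.
frozen_components = ["transformer"]
# The frozen transformer still has to run during training so that the
# ner listener receives its predictions.
annotating_components = ["transformer"]
```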
-
Hi @shahryary, for the training run with your dataset A this config is fine. If you just swap out the dataset for the second run, though, your model will learn from scratch. Instead, source both the `transformer` and the `ner` from the pipeline that was trained on A. From then on you can use the same config, as long as you always source from the latest model you trained. In practice, the whole `[components.ner]` block would become just one line: `source = "./current_model"` (where `./current_model` is the location of the model trained with the previous dataset). The relevant sections might look like the sketch below.
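Assuming the pipeline trained on A was saved to `./current_model` and you keep the transformer frozen as before, the config sections for the second run could look like this (a sketch, not a complete config):

```ini
[components.transformer]
# Reuse the transformer weights from the previously trained pipeline.
source = "./current_model"

[components.ner]
# Continue training the ner component from the same pipeline.
source = "./current_model"

[training]
# Keep the transformer frozen, as in the original setup.
frozen_components = ["transformer"]
# A frozen transformer still needs to run during training so that the
# ner listener receives its predictions.
annotating_components = ["transformer"]
```

You'd then train as usual, e.g. `python -m spacy train config.cfg --output ./new_model --paths.train train_B.spacy --paths.dev dev_B.spacy` (the file names here are placeholders), and point `source` at `./new_model/model-best` for the next round.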
Note that catastrophic forgetting might become an issue if you train with datasets with diverging distributions over time. There are some resources on this related to spaCy: e.g. a blog post, and several forum posts. Basically, if you train on A, then B, then C, and so on, and the datasets vary between them, at some point you might find that the model starts performing worse on the original data (e.g. data A). This happens because the weights of the NER model are gradually overwritten by the later datasets. To remedy this, at least continue tracking the performance of the newly trained models on the older datasets (the test set of A, etc.). If you notice regressions, consider mixing in data from the older datasets when training on a new one, just to ensure that signal keeps getting reinforced and backpropped.
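One simple way to do that tracking is to evaluate each newly trained model on the held-out portions of the earlier datasets, for instance (paths are placeholders):

```bash
# Check the model trained on dataset B against the test set of dataset A
# to watch for catastrophic forgetting.
python -m spacy evaluate ./new_model/model-best test_A.spacy --output metrics_A.json
```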