Release 0.2.0 Dev Branch #48

Draft
wants to merge 11 commits into
base: master

@ljvmiranda921 ljvmiranda921 commented Jan 4, 2025

This is a development branch for the Release 0.2.0 models.
Resolves #47

  • Train models
  • Write changelog
  • Make a version release on GitHub
  • Update pointers in the main library
  • Update Hugging Face names in the repository

Current changelog

  • Included trainable lemmatizer in the pipeline: instead of a rule-based lemmatizer, we now use the neural edit-tree lemmatizer.
  • Trained on UD-NewsCrawl: this is a major update, as we now train our parser, tagger, and morphologizer components on the larger UD-NewsCrawl treebank. Our training dataset has grown from 150+ to 15,000! From this point forward, we will use the UD-TRG and UD-Ugnayan treebanks as test sets (as intended).
  • Better evaluations: aside from evaluating our dependency parser and POS tagger on UD-TRG and UD-Ugnayan, we now also use Universal NER (Mayhew et al., 2023) as the test set for the NER component.
  • Improved base model for tl_calamancy_trf: based on internal evaluations, we now use mDeBERTa-v3 (base) as the source of context-sensitive vectors for tl_calamancy_trf.
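
For readers unfamiliar with the edit-tree idea behind the new lemmatizer, here is a minimal, simplified sketch in plain Python. It is not the code used in this branch or in the underlying library. The names `Substitute`, `Match`, `build_tree`, and `apply_tree` are hypothetical and exist only for this illustration. The sketch derives a tree of edits from one (form, lemma) pair and reuses it on another word with the same pattern. In the actual pipeline, a trained model chooses which learned tree to apply to each token, and that part is not shown here.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import Optional, Union


@dataclass(frozen=True)
class Substitute:
    """Leaf node: replace the exact string `old` with `new`."""
    old: str
    new: str


@dataclass(frozen=True)
class Match:
    """Interior node: keep the middle of the form, edit its prefix and suffix."""
    prefix_len: int
    suffix_len: int
    prefix_tree: Optional["Tree"]
    suffix_tree: Optional["Tree"]


Tree = Union[Substitute, Match]


def build_tree(form: str, lemma: str) -> Optional[Tree]:
    """Derive an edit tree that rewrites `form` into `lemma` (hypothetical helper)."""
    if not form and not lemma:
        return None  # nothing to edit on this side
    matcher = SequenceMatcher(None, form, lemma, autojunk=False)
    m = matcher.find_longest_match(0, len(form), 0, len(lemma))
    if m.size == 0:
        # No shared substring: fall back to a literal replacement.
        return Substitute(form, lemma)
    form_end = m.a + m.size
    return Match(
        prefix_len=m.a,
        suffix_len=len(form) - form_end,
        prefix_tree=build_tree(form[:m.a], lemma[:m.b]),
        suffix_tree=build_tree(form[form_end:], lemma[m.b + m.size:]),
    )


def apply_tree(tree: Optional[Tree], form: str) -> Optional[str]:
    """Apply an edit tree to `form`; return None if the tree does not fit."""
    if tree is None:
        return ""
    if isinstance(tree, Substitute):
        return tree.new if form == tree.old else None
    if tree.prefix_len + tree.suffix_len > len(form):
        return None
    end = len(form) - tree.suffix_len
    prefix = apply_tree(tree.prefix_tree, form[:tree.prefix_len])
    suffix = apply_tree(tree.suffix_tree, form[end:])
    if prefix is None or suffix is None:
        return None
    return prefix + form[tree.prefix_len:end] + suffix


# Learn a tree from one Tagalog verb with the "um" infix, then reuse it.
tree = build_tree("kumain", "kain")
print(apply_tree(tree, "bumili"))  # bili
print(apply_tree(tree, "kain"))    # None (the tree does not fit this form)
```

The point of the sketch is that a tree learned from one word generalizes to other words with the same affix pattern, which a fixed lookup table cannot do.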

@ljvmiranda921 ljvmiranda921 marked this pull request as draft January 4, 2025 20:39
@ljvmiranda921 ljvmiranda921 self-assigned this Jan 7, 2025
@ljvmiranda921 ljvmiranda921 force-pushed the master branch 4 times, most recently from bb90712 to 85cb149 Compare January 15, 2025 21:34
Development

Successfully merging this pull request may close these issues.

Update models for 0.2.0
1 participant