Open Access Reference corpus #756
Labels
documentation
More documentation is needed or there are errors in the documentation
help wanted
Extra attention is needed
The current guidelines for new synsets state that the lemma must have at least 100 occurrences in Sketch Engine's TenTen corpus.
https://github.com/globalwordnet/english-wordnet/blob/master/NEW_SYNSETS.md
This corpus is only accessible to paying Sketch Engine customers, so it does not really fit with our open-source goals. We should update the guideline to point to an open access corpus, such as the American National Corpus.
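To make the criterion concrete, here is a minimal sketch of how the 100-occurrence check could be run against any plain-text corpus. The function names and the tokenization regex are hypothetical, not part of the guidelines, and the sketch counts surface forms rather than true lemmas, so a real check would need a lemmatizer on top of it.

```python
import re
from collections import Counter

# Threshold taken from the current guidelines; everything else is an assumption.
MIN_OCCURRENCES = 100

# Hypothetical tokenizer: lowercase ASCII words, allowing internal ' or -.
TOKEN_RE = re.compile(r"[a-z]+(?:['-][a-z]+)*")


def count_tokens(lines):
    """Count lowercased word tokens across an iterable of text lines."""
    counts = Counter()
    for line in lines:
        counts.update(TOKEN_RE.findall(line.lower()))
    return counts


def meets_threshold(word, counts, minimum=MIN_OCCURRENCES):
    """Return True if the surface form occurs at least `minimum` times."""
    return counts[word.lower()] >= minimum


if __name__ == "__main__":
    # Tiny stand-in for a corpus; in practice, pass an open file object.
    sample = ["The cat sat on the mat.", "Another cat arrived."]
    counts = count_tokens(sample)
    print(meets_threshold("cat", counts, minimum=2))
```

Because `count_tokens` accepts any iterable of lines, the same check would work on whichever open corpus is chosen, as long as it can be read as plain text.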
Any suggestions?