Replies: 1 comment
-
This question duplicates issue #756. No single reference corpus is used, although COCA is our preferred corpus in most cases. Most of our terms, including the synset you highlighted, were created by the Princeton team for Princeton WordNet. I am not sure exactly where they sourced their botanical names. This synset looks quite similar to A Dictionary of American Plant Names, but it does not match modern botany sources.
-
I encountered the word "gosmore", which I had never heard before. (Admittedly, I have not heard of most of the words in WN.) Although it is present in WN, Collins online, and Merriam-Webster online, it does not exist in Wiktionary. That made me wonder how words in WN are selected for inclusion. I read in one of the linked papers that a word requires at least 100 examples in the corpus to be included.
How is the corpus for OEWN constructed? How often is it updated, and when is the search and tagging operation run? (There are probably industry-standard terms for this that I don't know yet. But once you have assembled a corpus, there must be some software process that extracts the unique words, the occurrences of those words, and so on.)
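To make the kind of process I have in mind concrete, here is a minimal sketch in Python. It is only an illustration of counting word occurrences and applying a frequency cutoff, not a description of how OEWN or Princeton WordNet actually selects words. The function names `count_words` and `candidates`, the simple tokenizer, and the toy corpus are all my own assumptions; the default cutoff of 100 comes from the paper I mentioned above.

```python
import re
from collections import Counter


def count_words(texts):
    """Count lowercase word tokens across an iterable of strings.

    The tokenizer is a deliberately naive assumption: runs of ASCII
    letters, optionally followed by an apostrophe and more letters.
    """
    counts = Counter()
    for text in texts:
        counts.update(re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower()))
    return counts


def candidates(counts, min_occurrences=100):
    """Return, in alphabetical order, words seen at least min_occurrences times."""
    return sorted(word for word, n in counts.items() if n >= min_occurrences)


# Hypothetical two-sentence corpus, just to exercise the functions.
toy_corpus = ["The gosmore grows in meadows.", "Gosmore is a weed."]
counts = count_words(toy_corpus)

# With a cutoff of 2, only "gosmore" qualifies in this toy corpus.
print(candidates(counts, min_occurrences=2))
```

A real pipeline would presumably also need lemmatization and part-of-speech tagging, which this sketch leaves out.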
Btw, the Google N-gram for "gosmore" is interesting. It almost looks like an EKG heartbeat printout!
Thanks,