frac:corpus? #6

Open
chiarcos opened this issue May 11, 2022 · 2 comments

Comments

@chiarcos
Contributor

chiarcos commented May 11, 2022

(a) It has recently been suggested to merge frac:corpus and frac:locus into a single property. How should it be named? Originally, this was dc:source.
(b) At the moment, frac:corpus is not obligatory for a frac:Observation. Can we assert that exactly one corpus is required?

@chiarcos
Contributor Author

The consensus now is to provide both frac:corpus and frac:locus for attestations, the former pointing to the source data as a whole, the latter pointing to the specific location.
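
As a rough illustration of this consensus, the sketch below models one attestation as plain (subject, predicate, object) triples and checks that it carries a frac:corpus and a frac:locus. The `ex:` identifiers, the helper names, and the "exactly one corpus" rule are assumptions made for the example; the thread asks whether that cardinality should hold but does not settle it.

```python
# Triples as (subject, predicate, object) tuples written with prefixed names.
# All "ex:" identifiers are hypothetical.
triples = [
    ("ex:attestation1", "rdf:type", "frac:Attestation"),
    ("ex:attestation1", "frac:corpus", "ex:someEdition"),        # source data as a whole
    ("ex:attestation1", "frac:locus", "ex:someEdition#page12"),  # specific location
]


def objects(triples, subject, predicate):
    """Collect all objects for a given subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def check_attestation(triples, subject):
    """Return a list of problems; an empty list means the assumed rule holds."""
    problems = []
    corpora = objects(triples, subject, "frac:corpus")
    loci = objects(triples, subject, "frac:locus")
    if len(corpora) != 1:  # assumed cardinality, see question (b)
        problems.append(f"expected exactly one frac:corpus, found {len(corpora)}")
    if not loci:
        problems.append("missing frac:locus")
    return problems


print(check_attestation(triples, "ex:attestation1"))
```

With real data, the same check could be expressed as a SHACL or OWL cardinality constraint; the plain-Python version only makes the intended rule concrete.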

@chiarcos
Contributor Author

An open (recurring) discussion point is the naming of frac:corpus and frac:Corpus, because it has led to misunderstandings in the past. The current draft states explicitly and repeatedly that our understanding of corpus is not limited to NLP corpora[1], but this seems to be hard to communicate. An alternative solution is to abandon the notion of frac:Corpus and instead operate with members of the dct:DCMIType class (see https://www.dublincore.org/specifications/dublin-core/dcmi-terms/#section-7). Then frac:corpus can safely be superseded by dct:source, and the type of source is made clear by the DCMIType member (dcmit:Collection, dcmit:Dataset, dcmit:Text, dcmit:Image, dcmit:MovingImage, ...).

[1] Definition of frac:Corpus: "represents any type of linguistic data or collection thereof, in structured or unstructured format." Definition of frac:corpus: "the data in which that Observation has been made. This can be, for example, a corpus or a text represented by its access URL, a book represented by its bibliographical metadata, etc." Notes: "non-empty collection of texts, in electronic or other form. (Note that a single text can constitute a corpus.)"
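
The alternative described above could look roughly like the following sketch: frac:corpus is rewritten to dct:source, and each source is expected to be typed with a DCMIType member. The `ex:` identifiers and the helper names `migrate` and `untyped_sources` are hypothetical, and the `dcmit:` prefix simply follows the spelling used in this comment.

```python
# Members of the DCMI Type Vocabulary, written with the "dcmit:" prefix
# used in this thread.
DCMI_TYPES = {
    "dcmit:" + name
    for name in (
        "Collection", "Dataset", "Event", "Image", "InteractiveResource",
        "MovingImage", "PhysicalObject", "Service", "Software", "Sound",
        "StillImage", "Text",
    )
}


def migrate(triples):
    """Replace the frac:corpus predicate with dct:source; keep everything else."""
    return [(s, "dct:source" if p == "frac:corpus" else p, o) for s, p, o in triples]


def untyped_sources(triples):
    """Return dct:source targets that lack an rdf:type from DCMI_TYPES."""
    typed = {s for s, p, o in triples if p == "rdf:type" and o in DCMI_TYPES}
    sources = {o for s, p, o in triples if p == "dct:source"}
    return sorted(sources - typed)


# Hypothetical data: one observation whose source is a single text.
data = [
    ("ex:obs1", "frac:corpus", "ex:someBook"),
    ("ex:someBook", "rdf:type", "dcmit:Text"),
]

migrated = migrate(data)
print(migrated)
print(untyped_sources(migrated))
```

Under this reading, the kind of source (collection, dataset, single text, image) is carried by the DCMIType statement rather than by a dedicated frac:Corpus class.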

chiarcos added a commit that referenced this issue Jul 6, 2023
revise Frequency section to close #5, #6