Add provenance property to track the domain data is harvested from #84

Open
AlasdairGray opened this issue Nov 8, 2021 · 2 comments
Labels
enhancement New feature or request

Comments

@AlasdairGray

From @egonw

[For] each graph [have] a domain triple giving just the domain, e.g. for a MassBank MolecularEntity the graph would have ?g somePredicate: "massbank.eu"

From @albangaignard

You could use

  • prov:wasDerivedFrom, but that needs an IRI as it is an object property
  • prov:atLocation with the URL of the location
@AlasdairGray

We could use either

  • dcterms:source: "A related resource from which the described resource is derived."
  • pav:providedBy: "The original provider of the encoded information (e.g. PubMed, UniProt, Science Commons)."

I think my preference would be pav:providedBy with the hostname as the value.
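As a rough illustration of that preference, the sketch below derives the hostname from a harvest URL and emits one N-Triples style statement attaching it to a graph with pav:providedBy. It is only a sketch under assumptions: the graph IRI, the source URL, and the helper names `harvest_hostname` and `provided_by_triple` are hypothetical, and the hostname is written as a plain string literal.

```python
from urllib.parse import urlparse

# Full IRI of pav:providedBy in the PAV ontology namespace.
PAV_PROVIDED_BY = "http://purl.org/pav/providedBy"


def harvest_hostname(source_url: str) -> str:
    """Return the hostname part of the URL the data was harvested from."""
    host = urlparse(source_url).hostname
    if host is None:
        raise ValueError(f"no hostname in {source_url!r}")
    return host


def provided_by_triple(graph_iri: str, source_url: str) -> str:
    """Build one statement: <graph> pav:providedBy "hostname" ."""
    host = harvest_hostname(source_url)
    return f'<{graph_iri}> <{PAV_PROVIDED_BY}> "{host}" .'


# Hypothetical graph IRI and harvest URL, for illustration only.
print(provided_by_triple("https://example.org/g1", "https://massbank.eu/some/record"))
```

Using a literal sidesteps the need for an IRI that prov:wasDerivedFrom would impose, at the cost of the value not being dereferenceable.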

@AlasdairGray

Note that when datasets come from the same site, e.g. BioSamples and ChEMBL are both provided by https://ebi.ac.uk/, they will share the same hostname value and so cannot be told apart by this property alone.
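A small sketch of that collision, assuming the provenance value is just the URL hostname. The dataset URLs and the helper name `group_by_hostname` are hypothetical and used only for illustration.

```python
from collections import defaultdict
from urllib.parse import urlparse


def group_by_hostname(sources: dict[str, str]) -> dict[str, list[str]]:
    """Group dataset names by the hostname of their harvest URL."""
    groups = defaultdict(list)
    for name, url in sources.items():
        groups[urlparse(url).hostname].append(name)
    return dict(groups)


# Illustrative URLs only; the paths are assumptions, not real endpoints.
sources = {
    "BioSamples": "https://ebi.ac.uk/biosamples/example",
    "ChEMBL": "https://ebi.ac.uk/chembl/example",
    "MassBank": "https://massbank.eu/example",
}

groups = group_by_hostname(sources)
for host, names in groups.items():
    print(host, names)
```

Both EBI datasets land under the single key `ebi.ac.uk`, so distinguishing them would need something finer than the hostname, such as a path prefix.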
