Introduce roles for files related to calculations #523

Open
wants to merge 6 commits into
base: develop


merkys
Member

@merkys merkys commented Jun 12, 2024

During workshop discussions, @giovannipizzi suggested introducing a way to specify how files are related to calculations entries. This PR introduces a meta.role field that specifies whether an attached file is an input or an output of the calculation in question.
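
To make the proposal concrete, here is a minimal sketch of how a client might read such a field. It assumes the role sits in the meta object of each resource identifier under the files relationship and takes the values "input" and "output"; the entry ids and the helper name `files_by_role` are hypothetical and not taken from the PR diff.

```python
# Hypothetical calculations entry. The ids and the exact placement of
# "role" inside each resource identifier's meta are assumptions.
calculation = {
    "type": "calculations",
    "id": "calc-1",
    "relationships": {
        "files": {
            "data": [
                {"type": "files", "id": "input-file-1", "meta": {"role": "input"}},
                {"type": "files", "id": "output-file-1", "meta": {"role": "output"}},
                {"type": "files", "id": "output-file-2", "meta": {"role": "output"}},
            ]
        }
    },
}


def files_by_role(entry, role):
    """Return ids of related files whose meta.role equals `role`, in listed order."""
    linkage = entry.get("relationships", {}).get("files", {}).get("data", [])
    return [
        item["id"]
        for item in linkage
        if item.get("meta", {}).get("role") == role
    ]


inputs = files_by_role(calculation, "input")
outputs = files_by_role(calculation, "output")
```

Files without a meta object or without a role are simply skipped, so entries that predate the field would still be handled.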

There was also a suggestion to order the related output files so that automated analysis software can more easily identify the code and/or calculation type used. Since it is impossible to define a deterministic order for every code in use, the suggestion is to use an arbitrary order in which the output files most likely to contain the identifying information come first. This way, analysis software would encounter the identifying output files earlier and could stop before reading all of the output files.
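
The early-stop behaviour could look roughly like the sketch below. The file names, file contents, the marker string, and the function `identify_code` are all made up for illustration; real analysis software would use its own detection logic.

```python
def identify_code(output_files, detectors):
    """Scan files in the given order and stop at the first recognised marker.

    Returns (code_name, files_read); code_name is None if nothing matched.
    """
    files_read = 0
    for _name, text in output_files:
        files_read += 1
        for code_name, marker in detectors.items():
            if marker in text:
                return code_name, files_read
    return None, files_read


# Hypothetical output files, listed with the most identifying file first.
output_files = [
    ("main.log", "Program EXAMPLECODE v1.0 starting\n"),
    ("charge_density.dat", "0.1 0.2 0.3\n"),
    ("wavefunctions.dat", "0.5 0.5\n"),
]

# Hypothetical marker that identifies the (made-up) code.
detectors = {"examplecode": "Program EXAMPLECODE"}

code_name, files_read = identify_code(output_files, detectors)
```

With the identifying file listed first, the scan stops after a single file; with the same files in reverse order it would have to read all three.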

blokhin
blokhin previously approved these changes Jun 12, 2024
Co-authored-by: Antanas Vaitkus <[email protected]>
merkys and others added 2 commits June 12, 2024 16:19
@rartino
Contributor

rartino commented Jun 12, 2024

My first reaction is: doesn't this belong in data rather than as metadata? In the sense that calculations records the provenance of something, the "inputs", the "outputs" and the "process/workflow" seem to be the core data in such an entry?

Edit: I noticed that I had swapped metadata and data in my comment. Sorry for sounding confused.

@merkys
Member Author

merkys commented Jun 12, 2024

> My first reaction is: doesn't this belong in metadata rather than as data? In the sense calculations records the provenance of something, the "inputs", the "outputs" and the "process/workflow" seems the core data in such an entry?

My suggestion is to define the input/output division in the metadata of relationships. Since we use relationships to refer to other entries, I think the meta of relationships is the most appropriate place to store this metadata.

@rartino
Contributor

rartino commented Jun 12, 2024

JSON:API's meta of relationships is not something that exists in OPTIMADE outside of the JSON:API response format, so we have to define this "place" for storing the machine-readable role in section 3.7. And if we do that, why not encode it in the relationship name in JSON:API instead of down inside the meta object?

From a "least surprise" perspective, it makes more sense to me that there is an "input" and an "output" relationship that contains the relationship linkage to inputs and outputs.
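
As a rough illustration of this alternative, the sketch below uses relationships named "input" and "output" instead of a role stored in meta. The relationship names follow the wording of the comment; the ids and the helper `related_ids` are hypothetical.

```python
# Hypothetical calculations entry using role-named relationships.
calculation_named = {
    "type": "calculations",
    "id": "calc-1",
    "relationships": {
        "input": {"data": [{"type": "files", "id": "input-file-1"}]},
        "output": {
            "data": [
                {"type": "files", "id": "output-file-1"},
                {"type": "files", "id": "output-file-2"},
            ]
        },
    },
}


def related_ids(entry, relationship_name):
    """Return the ids linked under the given relationship name."""
    linkage = entry.get("relationships", {}).get(relationship_name, {}).get("data", [])
    return [item["id"] for item in linkage]


input_ids = related_ids(calculation_named, "input")
output_ids = related_ids(calculation_named, "output")
```

Here the role is read with a plain key lookup, with no need to inspect the meta object of each linked item.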

@merkys
Member Author

merkys commented Jun 13, 2024

We discussed this proposal during the workshop and identified a drawback: there is no way to query on relationship metadata, neither now nor after #524 is merged. A possible solution was put forward: use named relationships (currently, relationships are named after entry types). However, this could:

  1. interfere with the common property-entrytype namespace
  2. break the useful guarantee that relationships under the same name are of the same entry type
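
The second point can be illustrated with a small check. This is only a sketch: both example entries, their ids, and the function `types_per_relationship` are hypothetical.

```python
def types_per_relationship(entry):
    """Map each relationship name to the set of entry types linked under it."""
    return {
        name: {item["type"] for item in rel.get("data", [])}
        for name, rel in entry.get("relationships", {}).items()
    }


# Relationships named after entry types: one type per name.
by_type = {
    "type": "calculations",
    "id": "calc-1",
    "relationships": {
        "files": {"data": [{"type": "files", "id": "output-file-1"}]},
        "structures": {"data": [{"type": "structures", "id": "struct-1"}]},
    },
}

# Relationships named after roles: one name may mix entry types.
by_role = {
    "type": "calculations",
    "id": "calc-1",
    "relationships": {
        "input": {
            "data": [
                {"type": "structures", "id": "struct-1"},
                {"type": "files", "id": "input-file-1"},
            ]
        }
    },
}

types_by_type = types_per_relationship(by_type)
types_by_role = types_per_relationship(by_role)
```

In the first entry every relationship name maps to exactly one entry type, while in the second the "input" name covers both files and structures, so a client can no longer infer the entry type from the relationship name.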


4 participants