Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor codemeta validation code #43

Closed
wants to merge 1 commit into from

Conversation

hjonin
Copy link
Contributor

@hjonin hjonin commented May 7, 2024

Relates to #29.

This PR tries to improve the codemeta validation code, especially removing some hacks (e.g. coming from multi-version handling, like https://github.com/codemeta/codemeta-generator/pull/34/files/c2d17bd25f6924ea4a5fdab50a72250773b47d15#r1574302506).

After reviewing what's been done in other libraries (e.g. codemetar, eossr, caltech), I've noticed that:

  1. There was no equivalent in terms of codemeta field by field validation.
  2. These libraries validated the JSON version or the compacted version, or the framed version in the case of codemetar.

Thus, validating the expanded version did not seem like the right way to do it contrary to what was stated in #29, as it would introduce a lot of complexity to handle the complex structure of that version.

However, what I could think of to answer #29, was to compact the codemeta document into our internal context (which contains all the fields from all the versions, already used for multi-version generation) so that it becomes possible to validate a multi-version document without having to deal with any specific case.

@hjonin hjonin closed this May 7, 2024
@hjonin hjonin deleted the feat/gh-29-improve-validation branch May 7, 2024 08:15
@hjonin hjonin restored the feat/gh-29-improve-validation branch May 7, 2024 08:15
@hjonin hjonin reopened this May 7, 2024
@progval
Copy link
Member

progval commented May 7, 2024

What is the reason for moving functions from things.js to validation.js and utils.js? My reasoning for the classification is that things.js is for processing anything that derives from https://schema.org/Thing while validation.js is generic JSON-LD processing

Thus, validating the expanded version did not seem like the right way to do it contrary to what was stated in #29, as it would introduce a lot of complexity to handle the complex structure of that version.

We need to handle that complex structure (value in object in array) either way, because that structure is also valid in compacted documents. The reason for expanding is that when working with the expanded version then documents only use that structure instead of allowing multiple variants (value, value in array, value in object, value in object in array)

if (validator === undefined) {
// TODO: find if it's a field that belongs to another type,
// and suggest that to the user
setError(`Unknown field "${fieldName}" in "${parentFieldName}".`)
setError(`Unknown field "${compactedFieldName}".`)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd really like to keep the in <parent field> part, it's very useful to debug when you have a document like this:

{
    "description": "Foo",
    "author": [
        {
            "description": "Bar"
        }
    ]
}

so the user understands which instance of "description" is invalid.

Comment on lines +161 to +165
function switchCodemetaContext(codemetaJSON, contextUrl) {
const previousCodemetaContext = codemetaJSON["@context"];
codemetaJSON["@context"] = contextUrl;
return previousCodemetaContext;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of altering the argument and returning the previous context; could you return a new object with the new context, and not change the argument? I think it would be less confusing, and removes the need to switch it back.

@hjonin
Copy link
Contributor Author

hjonin commented May 7, 2024

What is the reason for moving functions from things.js to validation.js and utils.js? My reasoning for the classification is that things.js is for processing anything that derives from https://schema.org/Thing while validation.js is generic JSON-LD processing

The reason for moving code is for better readability. IMHO, the functions moved to utils.js are indeed utility functions, and the code moved to validation/index.js are constants that could be defined at the upper level, and that are different from the business code contained in things.js.

We need to handle that complex structure (value in object in array) either way, because that structure is also valid in compacted documents. The reason for expanding is that when working with the expanded version then documents only use that structure instead of allowing multiple variants (value, value in array, value in object, value in object in array)

This PR might not fit the initial use case then, but I feel like I'm not comfortable enough with jsonld to be able to validate the expanded version (I feel like I will handle objects and arrays and @value and @id and @list quite randomly). Feel free to decline this PR if you think it is not relevant (the major improvement here was to remove the hacky function https://github.com/codemeta/codemeta-generator/pull/43/files#diff-c5ee0172133a6a0e15eab88dddc7e0dc353cfb634cf3747726763df20343c492L31)!

@progval
Copy link
Member

progval commented May 7, 2024

the code moved to validation/index.js are constants that could be defined at the upper level, and that are different from the business code contained in things.js.

I don't see the reason to make a distinction, describing the type of each field is as much business code as the rest.

I feel like I will handle objects and arrays and @value and @id and @list quite randomly

We need to be able to handle them anyway, because they can also show up in the compacted version.

Which one to expect should be precisely described in https://github.com/codemeta/codemeta/blob/master/codemeta.jsonld :

  • if "@container": "@list" then it should be [{"@list": [...]}]
  • if "@type": "@id" then it should be [{"@id": ...}, ...] or a list of objects (eg. Person or Organization)
  • otherwise it should be [{"@value": ...}, ...]

@hjonin hjonin marked this pull request as draft May 8, 2024 15:02
@hjonin
Copy link
Contributor Author

hjonin commented Oct 3, 2024

Not applicable anymore.

@hjonin hjonin closed this Oct 3, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants