Refactor codemeta validation code #43

hjonin · 2024-05-07T08:14:02Z

Relates to #29.

This PR tries to improve the codemeta validation code, especially removing some hacks (e.g. coming from multi-version handling, like https://github.com/codemeta/codemeta-generator/pull/34/files/c2d17bd25f6924ea4a5fdab50a72250773b47d15#r1574302506).

After reviewing what's been done in other libraries (e.g. codemetar, eossr, caltech), I've noticed that:

There was no equivalent in terms of codemeta field by field validation.
These libraries validated the JSON version or the compacted version, or the framed version in the case of codemetar.

Thus, validating the expanded version did not seem like the right way to do it contrary to what was stated in #29, as it would introduce a lot of complexity to handle the complex structure of that version.

However, what I could think of to answer #29, was to compact the codemeta document into our internal context (which contains all the fields from all the versions, already used for multi-version generation) so that it becomes possible to validate a multi-version document without having to deal with any specific case.

progval · 2024-05-07T10:47:45Z

What is the reason for moving functions from things.js to validation.js and utils.js? My reasoning for the classification is that things.js is for processing anything that derives from https://schema.org/Thing while validation.js is generic JSON-LD processing

Thus, validating the expanded version did not seem like the right way to do it contrary to what was stated in #29, as it would introduce a lot of complexity to handle the complex structure of that version.

We need to handle that complex structure (value in object in array) either way, because that structure is also valid in compacted documents. The reason for expanding is that when working with the expanded version then documents only use that structure instead of allowing multiple variants (value, value in array, value in object, value in object in array)

progval · 2024-05-07T10:37:46Z

js/validation/things.js

                    if (validator === undefined) {
                        // TODO: find if it's a field that belongs to another type,
                        // and suggest that to the user
-                        setError(`Unknown field "${fieldName}" in "${parentFieldName}".`)
+                        setError(`Unknown field "${compactedFieldName}".`)


I'd really like to keep the in <parent field> part, it's very useful to debug when you have a document like this:

{ "description": "Foo", "author": [ { "description": "Bar" } ] }

so the user understands which instance of "description" is invalid.

progval · 2024-05-07T10:40:44Z

js/validation/index.js

+function switchCodemetaContext(codemetaJSON, contextUrl) {
+    const previousCodemetaContext = codemetaJSON["@context"];
+    codemetaJSON["@context"] = contextUrl;
+    return previousCodemetaContext;
+}


Instead of altering the argument and returning the previous context; could you return a new object with the new context, and not change the argument? I think it would be less confusing, and removes the need to switch it back.

hjonin · 2024-05-07T11:54:43Z

What is the reason for moving functions from things.js to validation.js and utils.js? My reasoning for the classification is that things.js is for processing anything that derives from https://schema.org/Thing while validation.js is generic JSON-LD processing

The reason for moving code is for better readability. IMHO, the functions moved to utils.js are indeed utility functions, and the code moved to validation/index.js are constants that could be defined at the upper level, and that are different from the business code contained in things.js.

We need to handle that complex structure (value in object in array) either way, because that structure is also valid in compacted documents. The reason for expanding is that when working with the expanded version then documents only use that structure instead of allowing multiple variants (value, value in array, value in object, value in object in array)

This PR might not fit the initial use case then, but I feel like I'm not comfortable enough with jsonld to be able to validate the expanded version (I feel like I will handle objects and arrays and @value and @id and @list quite randomly). Feel free to decline this PR if you think it is not relevant (the major improvement here was to remove the hacky function https://github.com/codemeta/codemeta-generator/pull/43/files#diff-c5ee0172133a6a0e15eab88dddc7e0dc353cfb634cf3747726763df20343c492L31)!

progval · 2024-05-07T12:03:01Z

the code moved to validation/index.js are constants that could be defined at the upper level, and that are different from the business code contained in things.js.

I don't see the reason to make a distinction, describing the type of each field is as much business code as the rest.

I feel like I will handle objects and arrays and @value and @id and @list quite randomly

We need to be able to handle them anyway, because they can also show up in the compacted version.

Which one to expect should be precisely described in https://github.com/codemeta/codemeta/blob/master/codemeta.jsonld :

if "@container": "@list" then it should be [{"@list": [...]}]
if "@type": "@id" then it should be [{"@id": ...}, ...] or a list of objects (eg. Person or Organization)
otherwise it should be [{"@value": ...}, ...]

hjonin · 2024-10-03T09:33:22Z

Not applicable anymore.

Refactor codemeta validation code

a4bd522

hjonin closed this May 7, 2024

hjonin deleted the feat/gh-29-improve-validation branch May 7, 2024 08:15

hjonin restored the feat/gh-29-improve-validation branch May 7, 2024 08:15

hjonin reopened this May 7, 2024

progval reviewed May 7, 2024

View reviewed changes

hjonin marked this pull request as draft May 8, 2024 15:02

hjonin closed this Oct 3, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Refactor codemeta validation code #43

Refactor codemeta validation code #43

hjonin commented May 7, 2024 •

edited

Loading

progval commented May 7, 2024

progval May 7, 2024

progval May 7, 2024

hjonin commented May 7, 2024 •

edited

Loading

progval commented May 7, 2024 •

edited

Loading

hjonin commented Oct 3, 2024

Refactor codemeta validation code #43

Refactor codemeta validation code #43

Conversation

hjonin commented May 7, 2024 • edited Loading

progval commented May 7, 2024

progval May 7, 2024

Choose a reason for hiding this comment

progval May 7, 2024

Choose a reason for hiding this comment

hjonin commented May 7, 2024 • edited Loading

progval commented May 7, 2024 • edited Loading

hjonin commented Oct 3, 2024

hjonin commented May 7, 2024 •

edited

Loading

hjonin commented May 7, 2024 •

edited

Loading

progval commented May 7, 2024 •

edited

Loading