Initial validate support, with --strict option #142
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -5,3 +5,5 @@ var | |
build | ||
dist/ | ||
target/ | ||
*.DS_Store | ||
*/.DS_Store |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -6,6 +6,7 @@ channels: | |
dependencies: | ||
- flake8 | ||
- ipython | ||
- jsonschema | ||
- mypy | ||
- omero-py | ||
- pip | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -8,9 +8,12 @@ | |
import dask.array as da | ||
import numpy as np | ||
from dask import delayed | ||
from jsonschema import validate as jsonschema_validate | ||
from jsonschema.validators import validator_for | ||
from ome_ngff.schemas import LocalRefResolver, get_schema | ||
|
||
from .axes import Axes | ||
from .format import format_from_version | ||
from .format import CurrentFormat, format_from_version | ||
from .io import ZarrLocation | ||
from .types import JSONDict | ||
|
||
|
@@ -106,6 +109,12 @@ def load(self, spec_type: Type["Spec"]) -> Optional["Spec"]: | |
return spec | ||
return None | ||
|
||
def validate(self, warnings: bool) -> None: | ||
# Validation for a node is delegated to each spec | ||
# e.g. Labels may have spec for multiscales and labels | ||
for spec in self.specs: | ||
spec.validate(warnings) | ||
|
||
def add( | ||
self, | ||
zarr: ZarrLocation, | ||
|
@@ -177,6 +186,10 @@ def __init__(self, node: Node) -> None: | |
def lookup(self, key: str, default: Any) -> Any: | ||
return self.zarr.root_attrs.get(key, default) | ||
|
||
def validate(self, warnings: bool = False) -> None: | ||
# If not implemented, ignore for now | ||
pass | ||
|
||
|
||
class Labels(Spec): | ||
"""Relatively small specification for the well-known "labels" group which only | ||
|
@@ -324,6 +337,30 @@ def array(self, resolution: str, version: str) -> da.core.Array: | |
# data.shape is (t, c, z, y, x) by convention | ||
return self.zarr.load(resolution) | ||
|
||
def validate(self, warnings: bool = False) -> None: | ||
multiscales = self.lookup("multiscales", []) | ||
version = multiscales[0].get("version", CurrentFormat().version) | ||
LOGGER.info("Validating Multiscales spec at: %s" % self.zarr) | ||
LOGGER.info("Using Multiscales schema version: %s" % version) | ||
image_schema = get_schema(version) | ||
|
||
# Always do a validation with the MUST rules | ||
# Will raise jsonschema's ValidationError if it fails | ||
json_data = self.zarr.root_attrs | ||
jsonschema_validate(instance=json_data, schema=image_schema) | ||
|
||
# If we're also checking for SHOULD rules, | ||
# we want to iterate all errors and show as "Warnings" | ||
if warnings: | ||
strict_schema = get_schema(version, strict=True) | ||
cls = validator_for(strict_schema) | ||
cls.check_schema(strict_schema) | ||
# Use our local resolver subclass to resolve local documents | ||
localResolver = LocalRefResolver.from_schema(strict_schema) | ||
validator = cls(strict_schema, resolver=localResolver) | ||
for error in validator.iter_errors(json_data): | ||
LOGGER.warning(error.message)
Sorry, I'm not familiar with the code/design pattern here, so I can't see whether there is some MVC-like pattern where the warnings and errors (the Model) are collected/returned by some Here dandi/dandi-cli#943 (comment) we are to "converge" on some data structure which we would use to cover validation reports across multiple validators possibly used for any specific file/dataset at hand. So it would be important for us to be able to collect errors/warnings/hints uniformly first and then report them via our own "viewer".
Thanks for your feedback. This work is at quite an early stage, so we haven't considered the usage of this tool outside of CLI validation. In fact, with changes to the schemas at https://github.com/ome/ngff/tree/main/0.4/schemas (which include regular and "strict" schemas), there should be less logic required to validate. |
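The request in this thread, collecting results instead of logging them, could look roughly like the sketch below. It is not part of this PR. `Finding`, `collect_findings`, `RequiredKeys` and the key `example_recommended` are hypothetical names introduced only for illustration. The sketch assumes nothing about the validators beyond an `iter_errors()` method that yields objects with a `message` attribute, which is how the diff above uses the strict validator, and it substitutes a small stand-in so that it runs without jsonschema.

```python
from dataclasses import dataclass
from typing import Any, Iterable, Iterator, List, Protocol


class SupportsIterErrors(Protocol):
    """Any object with a jsonschema-style iter_errors() method."""

    def iter_errors(self, instance: Any) -> Iterable[Any]: ...


@dataclass(frozen=True)
class Finding:
    """One validation result (hypothetical structure, not from the PR)."""

    severity: str  # "error" for MUST rules, "warning" for SHOULD rules
    message: str


def collect_findings(
    instance: Any,
    must: SupportsIterErrors,
    should: SupportsIterErrors,
) -> List[Finding]:
    """Run both validators and return every violation instead of logging it."""
    findings = [Finding("error", e.message) for e in must.iter_errors(instance)]
    findings += [Finding("warning", e.message) for e in should.iter_errors(instance)]
    return findings


# Stand-in validator (assumption) so the sketch runs without jsonschema.
@dataclass
class _Error:
    message: str


class RequiredKeys:
    """Reports each missing key, mimicking the iter_errors() interface."""

    def __init__(self, *keys: str) -> None:
        self.keys = keys

    def iter_errors(self, instance: dict) -> Iterator[_Error]:
        for key in self.keys:
            if key not in instance:
                yield _Error(f"'{key}' is a required property")


# "example_recommended" is a made-up key, not an NGFF field.
attrs = {"multiscales": []}
report = collect_findings(
    attrs,
    must=RequiredKeys("multiscales"),
    should=RequiredKeys("multiscales", "example_recommended"),
)
for finding in report:
    print(f"{finding.severity}: {finding.message}")
```

A caller could then decide how to present the list (log it, raise on any `error`, or hand it to another tool's report viewer) without the validation code knowing about the presentation.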
||
|
||
|
||
class OMERO(Spec): | ||
@staticmethod | ||
|
Is the role of this argument to trigger the validation of the recommended attributes?
Yes. We have various terminology: there are Spec rules that `SHOULD` be followed (so it's `recommended` that you do). These are covered by a `strict schema` and if you don't then this gives you `warnings` (if you use that flag)! It would be nice to settle on ONE term that could cover all these cases?
Agreed. Trying to look into the terminology used by others, I came back to SHACL which we have not decided to adopt at this stage but we might want to come back to in the near future.
A relevant concept is the idea of severity defined here and used by the validation report. In particular, looking at the pySHACL library, the `--allow-warnings` option effectively controls the severity level beyond which the result will be considered as invalid.
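As a rough illustration of the severity idea, the sketch below orders three severity levels and treats a set of results as valid only if none exceeds a chosen threshold. The level names follow SHACL's Info, Warning and Violation. Everything else (`Severity`, `is_valid`, `max_allowed`) is a hypothetical name for this sketch and is not taken from SHACL, pySHACL or ome-zarr-py.

```python
from enum import IntEnum
from typing import Iterable


class Severity(IntEnum):
    """Ordered severity levels; a higher value is more severe."""

    INFO = 1
    WARNING = 2
    VIOLATION = 3


def is_valid(severities: Iterable[Severity], max_allowed: Severity) -> bool:
    """Return True when no result is more severe than max_allowed."""
    return all(s <= max_allowed for s in severities)


results = [Severity.WARNING, Severity.INFO]

# Only infos tolerated: the warning makes the data invalid.
print(is_valid(results, max_allowed=Severity.INFO))

# Warnings tolerated: the same results now pass.
print(is_valid(results, max_allowed=Severity.WARNING))
```

Under this model a single threshold could replace the separate "strict", "recommended" and "warnings" terms: the rules each carry a severity, and the caller picks the level at which validation fails.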