Data shape based minimalistic modeling and validation #215
Comments
Hi @amiika ! This is good timing. @gugod and I have been working on a YS based schema language for the past few weeks. We call it SchemaYS. And it is indeed very compact and very powerful. https://github.com/ingydotnet/schemays-test/ is a demo repo. Schemas and types are just functions. https://github.com/ingydotnet/schemays-test/blob/main/classes.yaml#L5 The https://github.com/ingydotnet/schemays-test/blob/main/classes.yaml#L3 loads a library that defines the https://github.com/ingydotnet/schemays-test/blob/main/schema/class.ys is the library (schema) file where etc. I'll put this stuff into a proper schemays repo today or tomorrow. If you are interested I'd love your help bringing this together. Also I'll take a look at malli. TIL
Looks great. There are many approaches and use cases for data modeling and validation. My example focused on defining and validating the structure of the object, whereas the class.ys example focuses on validating the value domain. Ideally, a minimal schema language would include both:
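To make the structure vs. value-domain distinction concrete, here is a minimal Python sketch. All names (`check_structure`, `check_values`, `shape`, `rules`) are made up for illustration; this is not SchemaYS or malli API, just one way the two kinds of check could sit side by side.

```python
# Hypothetical illustration only; not SchemaYS or malli API.

def check_structure(data, shape):
    """Structural check: each expected key exists and has the expected type."""
    errors = []
    for key, expected_type in shape.items():
        if key not in data:
            errors.append(f"missing key: {key}")
        elif not isinstance(data[key], expected_type):
            errors.append(f"{key}: expected {expected_type.__name__}")
    return errors


def check_values(data, rules):
    """Value-domain check: each present value satisfies its predicate."""
    errors = []
    for key, (predicate, description) in rules.items():
        if key in data and not predicate(data[key]):
            errors.append(f"{key}: must be {description}")
    return errors


# Assumed example model: a person with a name and a bounded age.
shape = {"name": str, "age": int}
rules = {"age": (lambda n: 0 <= n <= 150, "between 0 and 150")}


def validate(data):
    """Run the structural check first; only check values if the shape is right."""
    structural_errors = check_structure(data, shape)
    if structural_errors:
        return structural_errors
    return check_values(data, rules)
```

Running the structural pass first means the value predicates can assume the right types, which keeps each predicate a one-liner.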
I'm new to YAMLScript and Clojure, but got inspired by how you "bent" the rules of YAML to define operators for YAMLScript. Squeezing more semantics into property names while still being valid YAML makes it relatively easy to parse the added semantics without writing a totally custom parser. I've been using YAML a lot, for example for defining OpenAPI specifications. One thing that could be borrowed from that realm is the idea of multi-file definitions, as used by redocly for example. Maintaining large models gets much easier with the ability to split the model into multiple files that can be tracked in a version control system. For example, a JSON Schema based model split into two YAML files:
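As a rough sketch of the multi-file idea (not the example from this thread), the snippet below resolves `$ref` entries across two hypothetical files, `person.yaml` and `identifier.yaml`, represented here as already-parsed Python dicts. The file names, the `FILES` mapping, and the `resolve` helper are all assumptions, and the `$ref` handling is deliberately simplified: it treats the reference as a plain file name rather than a full URI reference.

```python
# Hypothetical stand-ins for two YAML files, already parsed into dicts.
FILES = {
    "person.yaml": {
        "type": "object",
        "properties": {
            "name": {"type": "string"},
            "id": {"$ref": "identifier.yaml"},
        },
    },
    "identifier.yaml": {"type": "string", "minLength": 1},
}


def resolve(node, files, seen=()):
    """Return a copy of node with every {"$ref": name} replaced by that file's content."""
    if isinstance(node, dict):
        if "$ref" in node:
            target = node["$ref"]
            if target in seen:
                # Guard against files that reference each other in a loop.
                raise ValueError(f"circular $ref: {target}")
            return resolve(files[target], files, seen + (target,))
        return {key: resolve(value, files, seen) for key, value in node.items()}
    if isinstance(node, list):
        return [resolve(item, files, seen) for item in node]
    return node


# Bundle the split model back into one self-contained schema.
bundled = resolve(FILES["person.yaml"], FILES)
```

Each file stays small and independently versionable, and the bundled result is what a validator would actually consume.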
I'll post another example later showing how this same data could be represented in a more "data shape" or "data-driven" way.
Hi again @amiika ! I saw this issue and responded late last night. There's so much to say here and this runs deep. Around 2016 I spent many months working on a similar ambition which I dubbed SchemaType. I've been thinking about this for a long time. It was only recently that I realized (like you) that YS can make this easy and powerful. SchemaType was heavily influenced by OpenAPI, with the realization that validation was just the tip of the iceberg for well defined schemas and types. JSON schema imho offers so little for so much code. I converted your examples above here: https://github.com/ingydotnet/schemays-test/tree/main/215-a These are very weak schemas and I left them weak in my conversion. For example https://github.com/ingydotnet/schemays-test/blob/main/215-a/identifier.yaml#L9-L13 There's a lot to unpack here. I think the best course is to make a schemays repo and start writing tests of what it should do. We should also write a jsonschema-to-schemays converter. That would guide us pretty far. I'm quite sure that json schema can be a full subset of schemays. I took a look at malli and was excited to see it was written in clojure. That can only be helpful. I hope you have a lot of questions. Don't hold back!
Let's move continued discussion over to https://github.com/pkgys/schemays/discussions I'll leave this issue open for a while to possibly draw more attention to SchemaYS.
@amiika Let me know if you can comment on pkgys/schemays#2 I'm new to using github discussions. I unlocked this issue so you could comment here if you are having problems.
It would be cool to be able to define and validate data using YAML shapes and YAMLScript. Sure, one can define data using JSON Schema structures and use that to validate YAML, but something more minimalistic and data-driven, like malli schemas in YAML, would be more concise and easier to maintain.
Something like:
This type of syntax could be parsed to build, for example, a malli schema registry, reusing existing built-in schemas as much as possible: