Could QUDT vocab entries be generated dynamically? #123

dr-shorthair · 2020-05-18T00:26:27Z

I see a lot of careful work going on to clean up the current cache of QUDT individuals, particularly in the /vocab/unit/ tree. This is good and enhances the credibility of the QUDT service. However, it is ultimately a never-ending task, as the complete set of individual units of measure is essentially infinite when you consider all the potential combinations of the terminals and their variants in the different systems. Perhaps another approach is warranted - generate new individuals algorithmically.

For example, a query could specify the dimension and system of units, or the UCUM symbol*, and the QUDT service could then return the QUDT representation and a URI for it. This might come from a static cache (which is what you are currently constructing), but if not found there it could be built on the fly.

*I mention the UCUM symbol since for the main tree of derived units, the UCUM symbol is both unique and its structure actually defines the production of the uom.
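To make the cache-then-generate idea concrete, here is a rough Python sketch, not a proposal for the actual service. Everything in it is an assumption for illustration: the `STATIC_CACHE` contents, the `example.org` URIs, and the function names are hypothetical, and the parser only covers a small subset of UCUM (plain atoms joined by `.` and `/` with integer exponents; no parentheses, annotations, bracketed atoms, or prefix splitting).

```python
import re
from urllib.parse import quote

# Hypothetical stand-in for the curated static vocabulary (URIs are illustrative).
STATIC_CACHE = {
    "m/s": {"uri": "https://example.org/unit/curated/m-per-s", "source": "cache"},
}

# One term: an atom made of letters, optionally followed by an integer exponent.
TERM = re.compile(r"^([A-Za-z]+)(-?\d+)?$")


def parse_ucum(symbol):
    """Parse a simple UCUM expression into sorted (atom, exponent) pairs."""
    exponents = {}
    sign = 1
    # UCUM operators are applied left to right, so "/" negates only the next term.
    for token in re.split(r"([./])", symbol):
        if token == ".":
            sign = 1
        elif token == "/":
            sign = -1
        else:
            match = TERM.match(token)
            if match is None:
                raise ValueError(f"unsupported UCUM term: {token!r}")
            atom, exp = match.group(1), int(match.group(2) or 1)
            exponents[atom] = exponents.get(atom, 0) + sign * exp
    return sorted((a, e) for a, e in exponents.items() if e != 0)


def resolve(symbol):
    """Return the cached record if present, otherwise build one on the fly."""
    cached = STATIC_CACHE.get(symbol)
    if cached is not None:
        return cached
    return {
        "uri": "https://example.org/unit/generated/" + quote(symbol, safe=""),
        "terms": parse_ucum(symbol),
        "source": "generated",
    }
```

With this sketch, `resolve("m/s")` is served from the cache, while `resolve("kg.m2/s3")` falls through to the generator and carries the parsed terms that a real implementation would use to derive dimension vectors and multipliers.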

@stuchalk what do you think?

steveraysteveray · 2020-05-19T16:51:27Z

Simon,
Could you expand on your suggestion? Are you:
a) Suggesting using an algorithmic way to augment our statically coded vocabularies, or
b) Suggesting that we don't worry about maintaining a static vocabulary, and just create instances as needed?

Option a) would be nice, although we have found that there is a lot of judgment needed when creating new entries, not to mention properties like conversion multipliers, choices to be made when handling dimensionless units, and more.
Option b) sounds risky to me, for all the above reasons, plus the risk of things changing over time where one would not be as certain about what the URI was at some earlier time.
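One part of the URI-stability concern can be illustrated in code: if generated URIs are derived from a canonical form of the unit rather than from whatever spelling the caller supplied, the same unit always maps to the same URI. The sketch below is only an assumption about how that might look; `canonical_key`, `mint_uri`, and the base URI are hypothetical names, and it does nothing for the judgment calls mentioned above (conversion multipliers, dimensionless units).

```python
def canonical_key(terms):
    """Build an order-independent key from a mapping of atom -> exponent."""
    parts = []
    # Sorting by atom makes the key independent of the order terms were given in.
    for atom in sorted(terms):
        exp = terms[atom]
        if exp == 0:
            continue  # a zero exponent contributes nothing to the unit
        parts.append(atom if exp == 1 else f"{atom}{exp}")
    return ".".join(parts)


def mint_uri(terms, base="https://example.org/unit/generated/"):
    """Derive a URI deterministically from the canonical key (hypothetical scheme)."""
    return base + canonical_key(terms)
```

For example, `{"m": 1, "s": -1}` and `{"s": -1, "m": 1}` both yield the key `m.s-1`, so repeated requests at different times would resolve to the same identifier as long as the canonicalisation rule itself is frozen.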

We (with a collaborator) took a run at a) for the IEC units, which is why our unit count popped up from <1000 to >1500 in the past couple of months.

dr-shorthair · 2020-05-19T23:40:55Z

I suggest that the static vocabulary should be considered a cache, but that an API could generate new units algorithmically. The UCUM website has some Java code to do this - see https://unitsofmeasure.org/trac#ImplementationSupport
Now you have explained the IEC project it is clear that you are also considering this approach.
I'm certainly not proposing to discard the existing vocabulary. But routine new units, which are merely combinations of the existing terminals, can be done, as demonstrated in UCUM.

I've grabbed the main UCUM materials and stashed them away in the /community/ branch, just in case. See #124

steveraysteveray · 2020-05-20T00:12:28Z

The idea of generating routine combinations from existing terminals sounds intriguing. An obvious example is the generation of prefixed units (Giga, Mega, etc.). Is there a specific project you know of in the ImplementationSupport link you provided?
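As a rough illustration of the prefixed-unit case, the Python sketch below derives prefixed variants from a single base record by scaling its conversion multiplier. The record layout, the `generate_prefixed` name, and the label convention are assumptions made for this example, not the QUDT schema; only the prefix factors themselves are standard SI.

```python
# SI prefixes: name -> (symbol, power-of-ten factor). A small subset for brevity.
SI_PREFIXES = {
    "Giga": ("G", 1e9),
    "Mega": ("M", 1e6),
    "Kilo": ("k", 1e3),
    "Milli": ("m", 1e-3),
}


def generate_prefixed(base):
    """Return prefixed variants of a base unit record, keyed by symbol."""
    variants = {}
    for name, (prefix_symbol, factor) in SI_PREFIXES.items():
        symbol = prefix_symbol + base["symbol"]
        variants[symbol] = {
            # Hypothetical labelling convention: prefix name + lower-cased base label.
            "label": name + base["label"].lower(),
            "symbol": symbol,
            # The multiplier converts to the same reference unit as the base record.
            "multiplier": factor * base["multiplier"],
        }
    return variants


# Hypothetical base record; a real entry would carry many more properties.
watt = {"label": "Watt", "symbol": "W", "multiplier": 1.0}
prefixed_watts = generate_prefixed(watt)
```

Here `prefixed_watts["GW"]` is a record labelled `Gigawatt` with multiplier `1e9`. Units whose base already carries a prefix (the kilogram being the usual example) would need special handling that this sketch leaves out.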

stuchalk · 2020-05-20T00:13:58Z

I think there is definitely a need to have units developed on the fly. I am going to work on building some prototype stuff for the CIPM Digital-SI and will keep this in mind as a very general (but important) use case.

dr-shorthair · 2020-05-20T00:24:01Z

Looks like there is active code development going on here: https://github.com/lhncbc/ucum-lhc

And this service provides an insight into what the required functionality should be: https://ucum.nlm.nih.gov/ucum-service.html

There is an older project here: https://code.google.com/archive/p/unitsofmeasure/source/default/source

steveraysteveray · 2020-05-20T17:18:46Z

Thanks. The NIH work could be very useful for some automated validation of QUDT... I'm putting this issue on our project page.

VladimirAlexiev · 2021-03-31T12:15:15Z

@dr-shorthair +1, an excellent idea, though it initially sounded a bit AI-ish to me.
That will add to QUDT a major benefit of UCUM/LINDT: infinite on-demand extensibility (w3c/sparql-dev#129 (comment))

@steveraysteveray could you elaborate on your IEC project? E.g., https://cdd.iec.ch/cdd/iec61360/cdddev.nsf/ListsOfUnitsAllVersions/0112-2---62720%23UAD106?opendocument is the IEC quantity "mass density" and its units. It's a huge list but has no dimensionality, conversion factors, etc. Did you use some internal source that's not exposed on the web?

steveraysteveray · 2021-03-31T17:46:04Z

I believe we used publicly available sources for the IEC61360 codes. I will defer to @jhodgesatmb for the specifics of what was done, as he reviewed the work of our collaborator.

jhodgesatmb · 2021-03-31T18:21:57Z

IEC had a public UI and search facility that allowed us to download their units, I believe as comma-separated files. We imported them using TBCME and then edited them into the QUDT vocabularies.


steveraysteveray added the enhancement label May 21, 2020
