
How can we best make HTS codes referenceable? #178

Closed
nissimsan opened this issue Jul 1, 2021 · 18 comments
Assignees
Labels
externally-blocked Waiting for a standard not defined in this repository

Comments

@nissimsan
Collaborator

nissimsan commented Jul 1, 2021

"HTS" codes is an American extension of the World Customs Organization's Harmonized System codes. Expressing commodity type is used for determining tariffs and associated documentation.

Here is a source of these HTS lists:
https://catalog.data.gov/dataset/harmonized-tariff-schedule-of-the-united-states-2019

This includes a JSON-formatted file - I don't suppose that's sufficient for proper referencing? I am looking to indicate something like "htsno":"6204.11.00.00" on an import declaration.

My question applies to the "normal" HS codes too:
https://github.com/datasets/harmonized-system/blob/master/data/harmonized-system.csv
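Pending a proper URI scheme, a minimal sketch of what carrying the code as a plain string field might look like. The field name "htsno" is taken from the USITC JSON dump; the declaration shape and validator are hypothetical:

```python
import re

# Hypothetical import-declaration fragment carrying an HTS number as a plain
# string. Only the "htsno" field name comes from the USITC data; the rest of
# the document shape is invented for illustration.
declaration = {
    "type": "ImportDeclaration",  # hypothetical type name
    "htsno": "6204.11.00.00",
}

# HTS statistical reporting numbers are ten digits, conventionally written
# in dotted groups of 4.2.2.2.
HTS_PATTERN = re.compile(r"^\d{4}\.\d{2}\.\d{2}\.\d{2}$")

def is_valid_htsno(code: str) -> bool:
    """Check that a string is shaped like a 10-digit dotted HTS number."""
    return bool(HTS_PATTERN.match(code))

print(is_valid_htsno(declaration["htsno"]))  # True
```

A shape check like this only validates the format, of course; it says nothing about whether the code actually exists in the current schedule.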

@TallTed
Contributor

TallTed commented Jul 2, 2021

The best way would be to get the ITC, and more specifically, the Office of Tariff Affairs and Trade Agreements, to publish that data as Linked Data, with URIs properly minted for each relevant entity. This would not be very difficult, given the starting point of those documents you linked. (Most of the heavy lifting was already done, to produce those docs.)

They -- or even you, as the licensing is very permissive -- could do it in hours if not minutes with Virtuoso; other Linked Data servers could also be used, but I cannot speak to how easily or quickly the task could be done with such others.

If the ITC does it, they can mint permanent URIs under a domain/namespace they control. This would be optimal, all around.

@nissimsan
Collaborator Author

Thanks @TallTed!

I completely agree: ideally, the ITC would (should) expose the HTS codes, and the WCO the HS codes. I'll see what I can do to convince them to do so.

One of my peers at UN/CEFACT (thanks @colugo) has produced the attached JSON-LD representations.

As a practical way to fill the void, might it make sense to host these lists here, though? Similar to how we also have the periodic table - could that be a model? That would also work great as an example of what we are requesting from the ITC and WCO.

hs2017.jsonld.zip
hts2020.jsonld.zip

@TallTed
Contributor

TallTed commented Jul 6, 2021

I would not recommend storing monolithic dump files in a change-management system such as Git, nor on a site based on one, such as GitHub. I would strongly recommend against storing such monolithic files in some other organization's space within such a system -- including here.

I've not looked into the JSON-LD ZIPs you posted here; it's all-but-certain that they are not based on official URIs/domains, without which they are of limited if any use. (If they are based in an official namespace, some of what follows may be implementable now, without further official assistance or involvement.)

Significantly, the periodic table doesn't change (except when the occasional new element is added). These HS/HTS data sets appear to change annually, if not faster, so whatever exercise is performed now will need to be repeated at least every year going forward.

During the Obama administration, data.gov made a great many Federal data sets public as live, linked data, accessible through SPARQL among other methods, hosted by a variety of back-ends initially stood up as proofs-of-concept and for comparative testing. Those efforts were stopped early in 2017, for reasons which are probably obvious. I think the current administration may recognize the value in reviving and advancing what was in place circa December 2016. The more voices pressing this idea, the more likely it will prevail -- so make your voice heard there, as well!

I believe the UN and/or some sub-agency/ies thereof has or had been working on similar initiatives with their data sets.

As to a practical way to fill the void, it would be best if the authoritative organizations could at least make known the (previously or newly) intended permanent URIs, whether or not they actually publish the relevant datasets there. Then someone willing and able to set up a long-lived deployment, with URIs within their own domain, could easily add owl:sameAs relations to their data set that relate the external URIs to the data.gov or other official agency URIs.

(Similarly, though it would add some work to the wished-for agency publication, some third party could publish the same datasets as SPARQL-accessible Linked Data, at long-lived URIs; the agency could then add the owl:sameAs relations to their publication --- but there would likely need to be some official assurance, and associated penalty for violation of such, that the third-party URIs would never be redefined to have any other meaning, semantic or otherwise.)
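A minimal sketch of the owl:sameAs bridging described above. Both namespaces are invented for illustration, since no official URIs have been minted:

```python
# A third party mints long-lived URIs in its own domain and equates each
# with a (hypothetical) official agency URI via owl:sameAs. Both namespaces
# below are placeholders, not real published vocabularies.
OFFICIAL_NS = "https://example.gov/hts/"   # hypothetical official namespace
MIRROR_NS = "https://example.org/id/hts/"  # hypothetical third-party namespace

def same_as_triple(htsno: str) -> str:
    """Return one Turtle triple equating the mirror URI with the official URI."""
    return f"<{MIRROR_NS}{htsno}> owl:sameAs <{OFFICIAL_NS}{htsno}> ."

print(same_as_triple("6204.11.00.00"))
```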

These publications can be locally mirrored by entities who need to make queries against them, through creative DNS and other techniques, and such mirrors can persist and be replicated into the future, whether the original publications remain up or not (though, of course, it is preferable that they remain up, such that the URIs of these entities can be dereferenced by any querent, not only those who have local mirrors).

Once official URIs for the HS/HTS entities are minted, those URIs may be used in perpetuity to refer to — to denote — those entities, whether or not those same URIs are ever dereferenceable. This would limit the utility of the HTTP-based "superkeys" — the URIs — of Linked Data in these data sets, but it would still mean that every other entity who needed to refer to the HS/HTS entities could use the same identifiers to do so, and thereby be unambiguous in their references.

There's much more to this subject, but neither this issue nor this repo is the best space for such discussion. The W3C's Semantic Web mailing list is likely to reach a more relevant and appropriate audience.

@nissimsan
Collaborator Author

@TallTed, many thanks.

It's hard to disagree with you on this. Better to focus our efforts on the proper solution than an interim one.

I actually did file a request via https://www.data.gov/contact back when I raised this. I haven't heard back from them. But now noticed their github and raised this: GSA/data.gov#923
Let's see what/if they respond, and perhaps we can get an early indication out of them about what the URI will end up looking like, as you suggest.

I've been deeply involved in the UN work you are referring to: https://service.unece.org/trade/uncefact/vocabulary/uncefact/ This includes many code lists, but not HS nor HTS (which belong to the WCO and the US Government, respectively).

Thanks again!

@nissimsan
Collaborator Author

@TallTed et al.,

I've sent [email protected] three emails now (ref GSA/data.gov#923) but have not heard anything back.

I'll keep trying, but I don't see a reason to keep this ticket alive for something that is out of our control and may never happen. So I propose we close this issue and stick with the model we have:
https://w3c-ccg.github.io/traceability-vocab/#addProductCodeType

Any objections to closing?

@nissimsan
Collaborator Author

Below is an email from Jeremy Wise, a super proactive and helpful contact at USITC. I wasn't aware of the first option he is highlighting:
https://hts.usitc.gov/api/search?query=6204.11.00.00
... Any objections to this style of IRI?


From: Wise, Jeremy [email protected]
Date: Tue, Nov 2, 2021 at 2:57 PM
Subject: RE: Publishing HTS codes as Linked Data
To: Nis Jespersen [email protected]
Cc: [email protected] [email protected]
Good morning Nis,
Please forgive me but I may be missing something. How does the API ?Query parameter not meet the needs you described in the GitHub thread? We’ve deployed the HTS API in such a way that it qualifies as a Universal Resource Locator including ?query functionality. The results of such a link, wouldn’t they be the same as the results from a URI link?
E.g. https://hts.usitc.gov/api/search?query=6204.11.00.00 vs. https://hts.usitc.gov/api/search/6204.11.00.00
Thanks,
Jeremy
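For what it's worth, the two forms Jeremy compares differ only in where the HTS number sits in the URL: as a query-parameter value versus as part of the resource path. A small standard-library sketch (no network access; URLs as quoted above):

```python
from urllib.parse import parse_qs, urlsplit

# Jeremy's two forms of the same lookup. The ?query= form carries the HTS
# number as a parameter value; the path form makes it part of the resource
# identifier itself.
query_form = "https://hts.usitc.gov/api/search?query=6204.11.00.00"
path_form = "https://hts.usitc.gov/api/search/6204.11.00.00"

query_code = parse_qs(urlsplit(query_form).query)["query"][0]
path_code = urlsplit(path_form).path.rsplit("/", 1)[-1]

print(query_code == path_code)  # True: both carry the same code
```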

@TallTed
Contributor

TallTed commented Nov 2, 2021

That could be the best we're going to get. I am somewhat disappointed in the result of dereferencing that new URI --

$ curl -Lki "https://hts.usitc.gov/api/search?query=6204.11.00.00"
HTTP/1.1 200 200
Date: Tue, 02 Nov 2021 20:50:29 GMT
Server: Apache
Content-Type: application/json;charset=UTF-8
Transfer-Encoding: chunked
Strict-Transport-Security: max-age=31536000

{"results":[{"other":"58.5%","superior":null,"indent":"2","description":"Of wool or fine animal hair (444)","statisticalSuffix":"00","score":"237.50","special":"Free (AU,BH,CL,CO,IL,JO,KR,MA,OM,P,PA,PE,S,SG)","htsno":"6204.11.00.00","footnotes":[{"value":"See 9903.88.15. ","columns":["general"],"type":"endnote"}],"general":"14%","units":["No.","kg"]}]}

Note that it is plain JSON, not JSON-LD. Conneg for text/turtle or application/ld+json gets 406 Not Acceptable.

So it looks like the next step is to introduce Mr. Wise to some of the wonders of Linked Data, if not RDF, such that they at least add an @context to their JSON, and start their journey toward the 21st Century!
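A minimal sketch of what adding an @context to that response could look like. The vocabulary IRI is hypothetical, since USITC publishes no such namespace:

```python
import json

# The API returns plain JSON (as in the curl output above). Wrapping it with
# an @context would give the "htsno" key a globally unambiguous meaning.
# The vocabulary IRI below is a placeholder, not a real published namespace.
raw = json.loads('{"results": [{"htsno": "6204.11.00.00", "general": "14%"}]}')

contextualized = {
    "@context": {"htsno": "https://example.gov/hts/vocab#htsno"},  # hypothetical
    **raw,
}

print(json.dumps(contextualized, indent=2))
```

The original payload is untouched; only the @context mapping is layered on top, which is the smallest possible step toward JSON-LD.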

@nissimsan
Collaborator Author

nissimsan commented Nov 3, 2021

Thanks @TallTed,
I agree, an @context would make this much easier to use. A raw https://hts.usitc.gov/api/search?query=6204.11.00.00 in the middle of a clean JSON-LD file is suboptimal. I'll solicit this feedback on the email chain I've got going.
Still, this is considerably better than nothing, and I'm keen to close this issue.

@OR13
Collaborator

OR13 commented Feb 15, 2022

@nissimsan can you give us an update on this issue?

@nissimsan
Collaborator Author

I have nothing to report on this except passing of time. I did pass along the idea of adding @context (Nov 3rd 2021) but haven't heard anything more back from the process.
Jeremy Wise did mention that a program is under way to "update the HTS DMS over the next year", and our feedback has been injected into this process. I'll ask for a status update. Meanwhile let's keep this ticket open to track the HTS update progress.

@nissimsan
Collaborator Author

Update from USITC: updating the HTS DMS is underway. Scheduled release of the next version: Fall 2022.

@nissimsan
Collaborator Author

I will reach out when it's fall.

@BenjaminMoe added the externally-blocked Waiting for a standard not defined in this repository label Aug 2, 2022
@nissimsan
Collaborator Author

It is now fall. I will reach out.

@nissimsan self-assigned this Oct 18, 2022
@nissimsan
Collaborator Author

Just followed up with Jeremy at USITC.

I expect he will follow the link to here too (though he may be precluded from responding) - hi Jeremy! 👋

@BenjaminMoe
Contributor

@nissimsan any updates on this issue?

@OR13
Collaborator

OR13 commented Aug 15, 2023

Suggest closing this issue; it's not actionable.

@TallTed
Contributor

TallTed commented Aug 15, 2023

[email protected] is the contact... He doesn't appear to be a GitHub user (or at least, I can't find his handle)

@mkhraisha
Collaborator

Closing as non-actionable
