feat: add hci converter #23

sabinem · 2025-02-14T05:21:39Z

This PR adds the HCI converter together with tests:

An example for the expected HCI input data is provided at examples/0-HCI.json

The commands for running the converters were changed:

* run synth-converter with `just run-synth examples/1-Synth.json examples/1-Synth.ttl`
* run hci-converter with `just run-hci examples/0-HCI.json examples/0-HCI.ttl`
* run tests with `just test`

Note: We are not checking enums yet and leave that to the SHACL validation, for example:

   sh:property [sh:path allo-qual:AFQ_0000111 ;
                sh:xone ([ sh:hasValue "Solid"]
                        [sh:hasValue "Liquid"]) ; ] ; #state of matter / dispense state

We only check enums where they matter for the transformation, for example to identify the action type in a batch. Validating the values is currently left to the SHACL validation, which keeps the parsing a bit lighter and is okay from my perspective.

TODOs

Fix inconsistency in the ontology: the type for Campaign was missing, but will be fixed soon.
In the ontology repo, the Chemical had strings for numerical values in the HCI example. In the Synth example this was already fixed. I assumed in this PR that we can fix that for the HCI output as well, but this needs to be confirmed (https://github.com/sdsc-ordes/cat-plus-ontology/blob/2e8cc1a2706108e089c763ec2ed736ea79df90c8/json-file/0-HCI/0-HCI-Batch_definition.json#L29)

cmdoret

Nice! Got a few questions, but no major issue 👍

justfile

src/catplus-common/src/graph/prefix_map.rs

src/catplus-common/src/models/types.rs

src/hci-converter/src/convert.rs

vancauwe

Very nice work!
It's also nice to see that catplus-common fits in nicely.
I have one suggestion that would be nice to have for subsequent edits, but I can also take care of the small change while working on the Agilent parser.

src/catplus-common/src/graph/namespaces/cat.rs

sabinem · 2025-02-17T18:39:44Z

@cmdoret Thanks a lot for your review; most of your points make a lot of sense. As a general rule, I think it would be good not to add commits to a PR that you review, since that makes it harder to continue working on it. If you do add a commit, please also point that out in a comment. I only noticed this when I tried to push my changes and first had to put them on another branch in order to pull the upstream changes. It is okay for now, just maybe for next time.

sabinem · 2025-02-18T06:31:12Z

@vancauwe, @cmdoret With these changes it no longer makes sense to have separate synth and hci converters. I will make a single metadata-converter that takes an extra argument, synth or hci, since the difference between the two is now just one or two lines of code: the name of the struct it parses.
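
To illustrate the idea of a single converter that differs only in the struct it parses, here is a minimal sketch. All names (`InputType`, `Convertible`, `SynthBatch`, `CampaignWrapper`, `convert`) and the toy parsing logic are assumptions made for illustration, not the actual code in this PR.

```rust
use std::str::FromStr;

/// Hypothetical input type selected by the extra CLI argument.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum InputType {
    Synth,
    Hci,
}

impl FromStr for InputType {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s.to_ascii_lowercase().as_str() {
            "synth" => Ok(InputType::Synth),
            "hci" => Ok(InputType::Hci),
            other => Err(format!("unknown input type: {other}")),
        }
    }
}

/// Hypothetical trait shared by both top-level structs.
pub trait Convertible: Sized {
    fn parse(input: &str) -> Result<Self, String>;
    fn into_graph(self) -> Vec<String>;
}

// Toy stand-ins for the real structs; they only wrap the raw input.
pub struct SynthBatch(String);
pub struct CampaignWrapper(String);

impl Convertible for SynthBatch {
    fn parse(input: &str) -> Result<Self, String> {
        Ok(SynthBatch(input.trim().to_string()))
    }
    fn into_graph(self) -> Vec<String> {
        vec![format!("batch: {}", self.0)]
    }
}

impl Convertible for CampaignWrapper {
    fn parse(input: &str) -> Result<Self, String> {
        Ok(CampaignWrapper(input.trim().to_string()))
    }
    fn into_graph(self) -> Vec<String> {
        vec![format!("campaign: {}", self.0)]
    }
}

/// Shared pipeline: parse into `T`, then turn it into graph lines.
fn convert_as<T: Convertible>(input: &str) -> Result<Vec<String>, String> {
    Ok(T::parse(input)?.into_graph())
}

/// The only per-type difference is which struct is named here.
pub fn convert(input_type: InputType, input: &str) -> Result<Vec<String>, String> {
    match input_type {
        InputType::Synth => convert_as::<SynthBatch>(input),
        InputType::Hci => convert_as::<CampaignWrapper>(input),
    }
}
```

With this shape, adding another input type would mean one new enum variant, one trait implementation and one match arm.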

cmdoret · 2025-02-18T12:09:02Z

@sabinem indeed sorry about that, I will set up formatting and CI to at least notify and check that the PR is formatted before accepting merges (without creating commits).

cmdoret

Looks very good to me! I only have a minor suggestion and a nitpick which we can keep for a later PR.
Feel free to merge :)

cmdoret · 2025-02-18T13:23:29Z

justfile

-    cargo run --bin hci-converter "{{root_dir}}/{{input_file}}" "{{root_dir}}/{{output_file}}" {{args}}
+run input_type input_file output_file *args:
+    cd "{{root_dir}}/src/converter" && \
+    cargo run --bin converter "{{input_type}}" "{{root_dir}}/{{input_file}}" "{{root_dir}}/{{output_file}}" {{args}}


Suggested change

cargo run --bin converter "{{input_type}}" "{{root_dir}}/{{input_file}}" "{{root_dir}}/{{output_file}}" {{args}}

cargo run --bin converter "{{input_type}}" "{{input_file}}" "{{output_file}}" {{args}}

suggestion: the command is already executed from the root dir, so I don't think we need to prefix individual arguments. Removing those prefixes also allows specifying external files via absolute paths.

@cmdoret I tried this but it did not work for me.

The current command is like this:

run input_type input_file output_file *args:
    cd "{{root_dir}}/src/converter" && \
    cargo run --bin converter "{{input_type}}" "{{root_dir}}/{{input_file}}" "{{root_dir}}/{{output_file}}" {{args}}

It works from anywhere in the project's directory tree. The recipe first changes into {{root_dir}}/src/converter, so relative paths are resolved from there unless {{root_dir}} is prefixed again. So I will leave it as it is for now.

cmdoret · 2025-02-18T13:37:31Z

src/converter/src/main.rs

praise: the CLI is much better like this, I think this is a good design that we can keep building on.

cmdoret · 2025-02-18T13:39:40Z

src/converter/src/convert.rs

+        "jsonld" => {
+            graph_builder.serialize_to_jsonld().context("Failed to serialize to JSON-LD")?
+        }
+        _ => graph_builder.serialize_to_turtle().context("Failed to serialize to Turtle")?,


comment: I would expect "nt" to return ntriples, but it would silently return ttl.
nitpick: match "turtle" to the proper serializer and panic or return an error in the default case.
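
One way to address this nitpick is to parse the format string into an enum up front, so an unrecognized value such as "nt" becomes an error instead of silently falling back to Turtle. The sketch below is an assumption about how that could look; `RdfFormat` and `serializer_name` are hypothetical names, and the serializer calls from the diff are replaced by plain strings to keep the example self-contained.

```rust
use std::str::FromStr;

/// Hypothetical enum of the supported serialization formats.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RdfFormat {
    Turtle,
    JsonLd,
}

impl FromStr for RdfFormat {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s.to_ascii_lowercase().as_str() {
            "turtle" => Ok(RdfFormat::Turtle),
            "jsonld" => Ok(RdfFormat::JsonLd),
            // No silent default: unknown formats are reported to the caller.
            other => Err(format!("unsupported serialization format: {other}")),
        }
    }
}

/// Stand-in for the real dispatch; the match is exhaustive, so adding a
/// variant forces every call site to handle it.
pub fn serializer_name(format: RdfFormat) -> &'static str {
    match format {
        RdfFormat::Turtle => "turtle",
        RdfFormat::JsonLd => "jsonld",
    }
}
```

Because the match on the enum has no wildcard arm, the compiler flags any place that forgets a newly added format.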

cmdoret · 2025-02-18T13:47:17Z

src/catplus-common/src/graph/prefix_map.rs

+macro_rules! ns_entries_direct {  // For rdf and xsd
+    ($msg:expr, $($ns:ident),*) => {
+        vec![
+            $(
+                (stringify!($ns), $ns.get("").expect(&$msg)),
+            )*
+        ]
+    };
+}
+


praise: smart solution! looks like a good use of macros 👍
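
For readers unfamiliar with the pattern, here is a small self-contained sketch of what the macro produces. The `Ns` type is a hypothetical stand-in for the real namespace type (assumed only to have a `get` method returning a `Result`), and `direct_entries` is an illustrative helper that is not part of the PR.

```rust
/// Hypothetical stand-in for a namespace whose `get` returns a Result.
pub struct Ns(&'static str);

impl Ns {
    pub fn get(&self, suffix: &str) -> Result<String, String> {
        if self.0.is_empty() {
            Err("empty namespace IRI".to_string())
        } else {
            Ok(format!("{}{}", self.0, suffix))
        }
    }
}

// Same macro body as in the diff: each identifier becomes a
// (prefix name, base IRI) pair, with the name taken from the identifier.
macro_rules! ns_entries_direct {
    ($msg:expr, $($ns:ident),*) => {
        vec![
            $(
                (stringify!($ns), $ns.get("").expect(&$msg)),
            )*
        ]
    };
}

/// Builds the prefix entries for two namespaces bound to local variables.
pub fn direct_entries() -> Vec<(&'static str, String)> {
    let rdf = Ns("http://www.w3.org/1999/02/22-rdf-syntax-ns#");
    let xsd = Ns("http://www.w3.org/2001/XMLSchema#");
    ns_entries_direct!("namespace IRI should be valid", rdf, xsd)
}
```

The benefit is that the prefix string and the namespace value cannot drift apart, since both come from the same identifier.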

sabinem · 2025-02-19T08:18:54Z

@cmdoret I followed up on your suggestions. Please review again.

sabinem mentioned this pull request Feb 14, 2025

Feat/hci converter #21

Closed

sabinem requested review from cmdoret and vancauwe February 14, 2025 05:33

cmdoret assigned sabinem Feb 14, 2025

cmdoret changed the title ~~Feat/add hci converter~~ feat: add hci converter Feb 14, 2025

cmdoret reviewed Feb 14, 2025

vancauwe approved these changes Feb 17, 2025

cmdoret approved these changes Feb 18, 2025

sabinem and others added 18 commits February 19, 2025 10:21

refactor: rename synth converter command in just file

4d968ba

feat: add hci-converter to package

a5c9511

feat: add hci-converter on just file

834a7cd

feat: add Cargo.toml for hci converter

8aa31e5

feat: add campaign and campaign wrapper to catplus common types

2087e65

feat: add code files to hci converter

2dc8c11

The hci converter parses the hci file and ignores the campaign wrapper, but stores the `hasCampaign` struct as `cat:Campaign`

test: add test to hci wrapper

3a5650c

feat: update namespaces for hci-converter

edb398f

* add terms to existing namespace for Campaign, Batch and Objective
* add namespaces allohdf and allocom

feat: update types

15116ff

* update batch: map all properties to graph
* add Objective
* complete properties for Campaign

test: add hci-converter tests

69f83d7

fix: avoid infinite loop when adding actions

6950c30

Actions are optional on batches, since for the hci parser there are no actions for batches

fix: update .gitignore to allow adding json examples

ed78787

* provide example input data for HCI Parser

format: cargo fmt

5f39e80

chore: sort namespace alphabetically

3d05018

feat: implement into_graph for campaign_wrapper

da96d18

refactor: improve prefix map by using macros

434f269

refactor: unify hci and synth parser to a shared parser

7136c91

tests: adapt tests for unified synth and hci converter

7807bdc

sabinem added 7 commits February 19, 2025 10:21

refactor: change name from synth-converter to converter

6d9c260

as the converter now converts both synth and hci metadata

refactor: adapt cargo.toml to name change of converters

1665e22

refactor: delete unused hci-converter

1394fca

refactor: name change from synth-converter to converter

0c42273

style: applying just fmt

23a1ba9

chore: change comment in justfile

1fbe4a8

fix: add enum for serialization format

5f69d07

That way it is detected in case the format is not recognized.

sabinem force-pushed the feat/add-hci-converter branch from b110af1 to 5f69d07 Compare February 19, 2025 09:24

sabinem added 3 commits February 19, 2025 10:39

tests: fix tests after change of format to enum

91f5193

chore: update README

336c823

style: apply cargo fmt

0a3c81d

sabinem merged commit 7481178 into main Feb 19, 2025
1 check passed
