Reduce overuse of ROBOT commands for efficiency #131

Open
anitacaron opened this issue Aug 21, 2024 · 4 comments · May be fixed by #135

@anitacaron
Collaborator

The function that generates the ROBOT command chains many commands together, some of which may be unnecessary. This can be slow and memory-intensive for large ontologies, e.g., NCBITaxon, PCL, or NCIT.

The chained commands are:

  1. merge
  2. measure
  3. remove (to create a base file, although in some cases the input is already a base file)
  4. measure (on the base file; the result is not used)
  5. merge (outputs a base file, which can be identical to the input)

I would suggest removing at least the second measure.

To illustrate, this is the current ROBOT command for NCBITaxon:

robot merge -i build/ontologies/ncbitaxon-raw.owl \
measure --prefix 'NCBITAXONALT: http://purl.obolibrary.org/obo/ncbitaxon#' \
--prefix 'COVOC: http://purl.obolibrary.org/obo/COVOC_' \
--prefix 'CIDO: http://purl.obolibrary.org/obo/CIDO_' \
--prefix 'dbpedia: http://dbpedia.org/resource/' \
--prefix 'EFO: http://www.ebi.ac.uk/efo/EFO_' \
--prefix 'ONTONEO: http://purl.bioontology.org/OntONeo/ONTONEO_' \
--metrics extended-reasoner -f yaml -o build/ontologies/ncbitaxon-metrics.yml \
remove --base-iri http://purl.obolibrary.org/obo/NCBITAXON_ \
--base-iri http://purl.obolibrary.org/obo/ncbitaxon# \
--base-iri http://purl.obolibrary.org/obo/NCBITaxon_ \
--axioms external --trim false -p false \
measure --prefix 'NCBITAXONALT: http://purl.obolibrary.org/obo/ncbitaxon#' \
--prefix 'COVOC: http://purl.obolibrary.org/obo/COVOC_' \
--prefix 'CIDO: http://purl.obolibrary.org/obo/CIDO_' \
--prefix 'dbpedia: http://dbpedia.org/resource/' \
--prefix 'EFO: http://www.ebi.ac.uk/efo/EFO_' \
--prefix 'ONTONEO: http://purl.bioontology.org/OntONeo/ONTONEO_' \
--metrics extended-reasoner -f yaml -o build/ontologies/ncbitaxon-metrics.yml.base.yml \
merge --output build/ontologies/ncbitaxon.owl

OBO-Dashboard/util/lib.py

Lines 343 to 377 in e39cde9

def robot_prepare_ontology(o_path, o_out_path, o_metrics_path, base_iris, make_base, robot_prefixes={}, robot_opts="-v"):
    logging.info(f"Preparing {o_path} for dashboard.")
    callstring = ['robot', 'merge', '-i', o_path]
    if robot_opts:
        callstring.append(f"{robot_opts}")
    ### Measure stuff
    callstring.extend(['measure'])
    for prefix in robot_prefixes:
        callstring.extend(['--prefix', f"{prefix}: {robot_prefixes[prefix]}"])
    callstring.extend(['--metrics', 'extended-reasoner','-f','yaml','-o',o_metrics_path])
    ## Extract base
    if make_base:
        callstring.extend(['remove'])
        for s in base_iris:
            callstring.extend(['--base-iri',s])
        callstring.extend(["--axioms", "external", "--trim", "false", "-p", "false"])
    ### Measure stuff on base
    callstring.extend(['measure'])
    for prefix in robot_prefixes:
        callstring.extend(['--prefix', f"{prefix}: {robot_prefixes[prefix]}"])
    callstring.extend(['--metrics', 'extended-reasoner','-f','yaml','-o',f"{o_metrics_path}.base.yml"])
    ## Output
    callstring.extend(['merge', '--output', o_out_path])
    logging.info(callstring)
    try:
        check_call(callstring)
    except Exception as e:
        raise Exception(f"Preparing {o_path} for dashboard failed...", e)
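
To make the duplication in the chain easier to see, the argument list built by the function above can be grouped by sub-command. This is only a rough sketch: `split_chain` and `ROBOT_COMMANDS` are hypothetical names that are not part of OBO-Dashboard, the file names in the example are placeholders, and the set of command names covers only the three commands that appear in this chain.

```python
# Hypothetical helper (not part of OBO-Dashboard): group a ROBOT argument
# list into (sub-command, arguments) stages so the chain can be inspected.
ROBOT_COMMANDS = {"merge", "measure", "remove"}  # only the commands used in this chain


def split_chain(callstring):
    stages = []
    for token in callstring[1:]:  # skip the leading 'robot'
        if token in ROBOT_COMMANDS:
            stages.append((token, []))  # a new stage starts at each command name
        elif stages:
            stages[-1][1].append(token)  # everything else is an argument of the current stage
    return stages


# Shortened version of the NCBITaxon call, with placeholder file names.
example = ['robot', 'merge', '-i', 'in.owl',
           'measure', '--metrics', 'extended-reasoner', '-f', 'yaml', '-o', 'm.yml',
           'remove', '--base-iri', 'http://purl.obolibrary.org/obo/NCBITaxon_',
           '--axioms', 'external', '--trim', 'false', '-p', 'false',
           'measure', '--metrics', 'extended-reasoner', '-f', 'yaml', '-o', 'm.yml.base.yml',
           'merge', '--output', 'out.owl']

print([name for name, _ in split_chain(example)])
# ['merge', 'measure', 'remove', 'measure', 'merge']
```

The printed list matches the five steps described above, with `measure` appearing twice.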

@anitacaron self-assigned this Aug 21, 2024
@matentzn
Contributor

measure (on the base file; the result is not used)

I am wondering if this is a good thing - that it is not being used. Does it mean the file metrics we report in the dashboard all use the whole ontology?

@anitacaron
Collaborator Author

Does it mean the file metrics we report in the dashboard all use the whole ontology?

Yes, only in cases where the base was generated.

@anitacaron
Collaborator Author

The simple solution would be to change the order of the chain.

  1. merge
  2. remove (create the base file if make_base is true)
  3. measure
  4. merge
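
A minimal sketch of what the reordered chain could look like, assuming the argument handling of `robot_prepare_ontology` stays as it is today. The name `build_prepare_callstring` is hypothetical, and writing the single set of metrics to `o_metrics_path` (rather than to the `.base.yml` file) is an assumption, not something decided in this thread. The function only builds the argument list; it would still be passed to `check_call` as before.

```python
def build_prepare_callstring(o_path, o_out_path, o_metrics_path, base_iris,
                             make_base, robot_prefixes=None, robot_opts="-v"):
    """Hypothetical variant: merge, optional remove, a single measure, merge."""
    callstring = ['robot', 'merge', '-i', o_path]
    if robot_opts:
        callstring.append(robot_opts)

    # Extract the base first, and only when the input is not already a base file.
    if make_base:
        callstring.append('remove')
        for iri in base_iris:
            callstring.extend(['--base-iri', iri])
        callstring.extend(['--axioms', 'external', '--trim', 'false', '-p', 'false'])

    # Measure once, on whatever the previous step produced.
    callstring.append('measure')
    for prefix, iri in (robot_prefixes or {}).items():
        callstring.extend(['--prefix', f"{prefix}: {iri}"])
    # Assumption: the single metrics file goes to o_metrics_path.
    callstring.extend(['--metrics', 'extended-reasoner', '-f', 'yaml', '-o', o_metrics_path])

    # Output
    callstring.extend(['merge', '--output', o_out_path])
    return callstring


# Placeholder paths, for illustration only.
args = build_prepare_callstring(
    'in.owl', 'out.owl', 'metrics.yml',
    base_iris=['http://purl.obolibrary.org/obo/NCBITaxon_'],
    make_base=True,
    robot_prefixes={'EFO': 'http://www.ebi.ac.uk/efo/EFO_'},
)
```

With `make_base=False` the `remove` step is skipped entirely, so an input that is already a base file is only merged, measured, and written out.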

@matentzn
Contributor

If the remove command is not run when a base file is already available, then OK, I guess we can do that. It's a bit weird for ontologies that don't have a base, like application ontologies, to show metrics of the base (think OMO), but I am not opposed to trying this and seeing how it looks!

@anitacaron added the enhancement (New feature or request) label Sep 17, 2024
@anitacaron linked a pull request Oct 11, 2024 that will close this issue