Reduce overuse of ROBOT commands for efficiency #131

anitacaron · 2024-08-21T14:08:48Z

The function that generates the ROBOT command chains many commands together, and some of them may be unnecessary. The chain can be time-consuming and memory-consuming for large ontologies, e.g., NCBITaxon, PCL, or NCIT.

The chained commands are:

merge
measure
remove (to create a base file, but in some cases, the input is already a base file)
measure (on the base file; the result is not being used)
merge (output a base file, which can be the same as the input)

I would suggest removing at least the second measure.
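Both measure steps take the same prefixes and metrics and differ only in the output path, so pulling them into one helper would make dropping either call a one-line change. A minimal sketch, assuming a hypothetical helper name `measure_args` (it is not a function in lib.py); the flags are the ones already used in the current chain:

```python
def measure_args(robot_prefixes, metrics_path):
    """Build the argument list for one `robot measure` step.

    Hypothetical helper for illustration; it mirrors the flags the
    existing chain passes to measure.
    """
    args = ['measure']
    for prefix, iri in robot_prefixes.items():
        args.extend(['--prefix', f"{prefix}: {iri}"])
    args.extend(['--metrics', 'extended-reasoner', '-f', 'yaml', '-o', metrics_path])
    return args
```

With this in place, the second measure would be a single `callstring.extend(measure_args(...))` line that can be removed without touching the rest of the chain.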

To illustrate, this is the current ROBOT command for NCBITaxon:

robot merge -i build/ontologies/ncbitaxon-raw.owl \
measure --prefix 'NCBITAXONALT: http://purl.obolibrary.org/obo/ncbitaxon#' \
--prefix 'COVOC: http://purl.obolibrary.org/obo/COVOC_' \
--prefix 'CIDO: http://purl.obolibrary.org/obo/CIDO_' \
--prefix 'dbpedia: http://dbpedia.org/resource/' \
--prefix 'EFO: http://www.ebi.ac.uk/efo/EFO_' \
--prefix 'ONTONEO: http://purl.bioontology.org/OntONeo/ONTONEO_' \
--metrics extended-reasoner -f yaml -o build/ontologies/ncbitaxon-metrics.yml \
remove --base-iri http://purl.obolibrary.org/obo/NCBITAXON_ \
--base-iri http://purl.obolibrary.org/obo/ncbitaxon# \
--base-iri http://purl.obolibrary.org/obo/NCBITaxon_ \
--axioms external --trim false -p false \
measure --prefix 'NCBITAXONALT: http://purl.obolibrary.org/obo/ncbitaxon#' \
--prefix 'COVOC: http://purl.obolibrary.org/obo/COVOC_' \
--prefix 'CIDO: http://purl.obolibrary.org/obo/CIDO_' \
--prefix 'dbpedia: http://dbpedia.org/resource/' \
--prefix 'EFO: http://www.ebi.ac.uk/efo/EFO_' \
--prefix 'ONTONEO: http://purl.bioontology.org/OntONeo/ONTONEO_' \
--metrics extended-reasoner -f yaml -o build/ontologies/ncbitaxon-metrics.yml.base.yml \
merge --output build/ontologies/ncbitaxon.owl

OBO-Dashboard/util/lib.py

Lines 343 to 377 in e39cde9

    
    def robot_prepare_ontology(o_path, o_out_path, o_metrics_path, base_iris, make_base, robot_prefixes={}, robot_opts="-v"):
        logging.info(f"Preparing {o_path} for dashboard.")

        callstring = ['robot', 'merge', '-i', o_path]

        if robot_opts:
            callstring.append(f"{robot_opts}")

        ### Measure stuff
        callstring.extend(['measure'])
        for prefix in robot_prefixes:
            callstring.extend(['--prefix', f"{prefix}: {robot_prefixes[prefix]}"])
        callstring.extend(['--metrics', 'extended-reasoner','-f','yaml','-o',o_metrics_path])

        ## Extract base
        if make_base:
            callstring.extend(['remove'])
            for s in base_iris:
                callstring.extend(['--base-iri',s])
            callstring.extend(["--axioms", "external", "--trim", "false", "-p", "false"])

        ### Measure stuff on base
        callstring.extend(['measure'])
        for prefix in robot_prefixes:
            callstring.extend(['--prefix', f"{prefix}: {robot_prefixes[prefix]}"])
        callstring.extend(['--metrics', 'extended-reasoner','-f','yaml','-o',f"{o_metrics_path}.base.yml"])

        ## Output
        callstring.extend(['merge', '--output', o_out_path])
        logging.info(callstring)

        try:
            check_call(callstring)
        except Exception as e:
            raise Exception(f"Preparing {o_path} for dashboard failed...", e)

matentzn · 2024-08-21T14:16:32Z

> measure (on the base file; the result is not being used)

I am wondering if this is a good thing - that it is not being used. Does it mean the file metrics we report in the dashboard all use the whole ontology?

anitacaron · 2024-08-21T14:48:31Z

> Does it mean the file metrics we report in the dashboard all use the whole ontology?

Yes, only in cases where the base was generated.

anitacaron · 2024-08-21T15:15:03Z

The simple solution would be to change the order of the chain.

merge
remove (create the base file if make_base is true)
measure
merge
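The reordered chain could be sketched roughly as below. This is an illustration under assumptions, not the change that was merged: `build_prepare_command` is a hypothetical name, it only builds the argument list (it does not call ROBOT), and it writes the single metrics file to `o_metrics_path`.

```python
def build_prepare_command(o_path, o_out_path, o_metrics_path, base_iris,
                          make_base, robot_prefixes=None, robot_opts="-v"):
    """Return the ROBOT argument list in the order merge, remove, measure, merge.

    Hypothetical sketch of the proposed ordering; flags are copied from the
    existing chain.
    """
    robot_prefixes = robot_prefixes or {}

    cmd = ['robot', 'merge', '-i', o_path]
    if robot_opts:
        cmd.append(robot_opts)

    # Create the base first, and only when the input is not already a base file.
    if make_base:
        cmd.append('remove')
        for iri in base_iris:
            cmd.extend(['--base-iri', iri])
        cmd.extend(['--axioms', 'external', '--trim', 'false', '-p', 'false'])

    # Measure once, on whatever the previous step produced.
    cmd.append('measure')
    for prefix, iri in robot_prefixes.items():
        cmd.extend(['--prefix', f"{prefix}: {iri}"])
    cmd.extend(['--metrics', 'extended-reasoner', '-f', 'yaml', '-o', o_metrics_path])

    cmd.extend(['merge', '--output', o_out_path])
    return cmd
```

With `make_base=False` the remove step is skipped entirely, so an input that is already a base file is only measured and written out.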

matentzn · 2024-08-22T11:31:24Z

If the remove command is not run when the base file is available, then OK, I guess we can do that. It's a bit weird for some ontologies that don't have a base, like application ontologies, to show metrics of the base (think OMO), but I am not opposed to trying this and seeing how it looks!

anitacaron self-assigned this Aug 21, 2024

anitacaron added the enhancement New feature or request label Sep 17, 2024

anitacaron linked a pull request Oct 11, 2024 that will close this issue

Remove run ROBOT mesure on full ontology #135

Open
