Store SBML meta information (level, version, packages, provenance) #810

Closed
Midnighter opened this issue Feb 28, 2019 · 11 comments

@Midnighter
Member

Hey @opencobra/cobrapy-core,

I'd like to store some meta information about the SBML that was parsed:

  1. level
  2. version
  3. packages used (fbc, annotation, groups)

The big question is where to store that information, and I'd like your opinions. The current ideas are to either:

  1. Create a new attribute on the model, cobra.Model.sbml_info, that could be a tuple (level: int, version: int, packages: Tuple[str, ...]).
  2. Create a new cobra.Model.meta attribute, which would allow for more general information later. It could be a dictionary, and model.meta["SBML"] could contain the above information.
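
A minimal sketch of how the two options might look side by side. The `Model` class here is a stand-in, not the real cobra.Model, and `SBMLInfo` is a hypothetical name chosen for illustration.

```python
from typing import Any, Dict, NamedTuple, Optional, Tuple


class SBMLInfo(NamedTuple):
    """Option 1: a fixed-shape record (hypothetical name)."""

    level: int
    version: int
    packages: Tuple[str, ...]


class Model:
    """Stand-in for cobra.Model, used only to illustrate the two options."""

    def __init__(self) -> None:
        # Option 1: a dedicated, SBML-specific attribute.
        self.sbml_info: Optional[SBMLInfo] = None
        # Option 2: a general-purpose dictionary with an "SBML" entry.
        self.meta: Dict[str, Any] = {}


model = Model()
model.sbml_info = SBMLInfo(level=3, version=1, packages=("fbc", "groups"))
model.meta["SBML"] = model.sbml_info._asdict()
```

Option 1 gives a fixed, typed shape, while option 2 leaves room for other formats under additional keys.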

Curious what you think and if you have any other ideas about this 😃

@Midnighter Midnighter added the SBML Related to reading and writing SBML models. label Feb 28, 2019
@Midnighter Midnighter added this to the COBRApy 1.0 milestone Feb 28, 2019
@Midnighter Midnighter self-assigned this Feb 28, 2019
@ChristianLieven
Contributor

Since it doesn't really belong to the cobra.Model object, I was thinking something along the lines of returning a tuple if a specific flag was set on the parsing function:

model = read_sbml("path/to/model.xml")
model, sbml_info = read_sbml("path/to/model.xml", sbml_info=True)

Would that work too?
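
A rough sketch of what such a flag implies for the function signature. Everything here is a placeholder: `read_sbml` does no real parsing, `Model` is a plain dictionary standing in for cobra.Model, and the returned values are dummy data.

```python
from typing import Dict, Tuple, Union

Model = Dict[str, str]  # stand-in for cobra.Model
SBMLInfo = Tuple[int, int, Tuple[str, ...]]  # (level, version, packages)


def read_sbml(
    path: str, sbml_info: bool = False
) -> Union[Model, Tuple[Model, SBMLInfo]]:
    """Placeholder reader: the return type depends on the flag."""
    model = {"source": path}  # dummy model, no parsing happens here
    info = (3, 1, ("fbc",))  # dummy values for illustration only
    if sbml_info:
        return model, info
    return model


model = read_sbml("path/to/model.xml")
model_with_info, info = read_sbml("path/to/model.xml", sbml_info=True)
```

The `Union` in the return annotation makes the cost visible: every caller and type checker has to know which flag value was passed.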

@Midnighter
Member Author

Yes, that's another consideration. In general, I think functions with varying return types are a pain in the neck and bad design, and I'd like to avoid them in the future, but this may be an acceptable exception to the rule 😉

@kvikshaug
Member

I agree it doesn't belong to cobra.Model since the model might as well have been loaded from JSON. It should also be possible to extract this info from libsbml in cases where cobrapy is not able to build a model instance.

@gregmedlock
Member

What about a helper function in io, e.g. cobra.io.read_sbml_info(), that does nothing but return the meta information?
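
A sketch of what such a helper could look like, assuming the information is read straight from the root `<sbml>` element. This version uses only the standard library's XML parser rather than libsbml, which is an assumption made to keep the example self-contained. `read_sbml_info` and `SBMLInfo` are hypothetical names, not existing cobra.io functions.

```python
import io
import re
import xml.etree.ElementTree as ET
from typing import Dict, NamedTuple


class SBMLInfo(NamedTuple):
    level: int
    version: int
    packages: Dict[str, int]  # package name -> package version


# SBML Level 3 package namespaces end in "/<package>/version<N>".
_PACKAGE_NS = re.compile(r"/sbml/level\d+/version\d+/(\w+)/version(\d+)$")


def read_sbml_info(source) -> SBMLInfo:
    """Read level, version and packages from the root <sbml> element only."""
    packages: Dict[str, int] = {}
    for event, payload in ET.iterparse(source, events=("start-ns", "start")):
        if event == "start-ns":
            _prefix, uri = payload
            match = _PACKAGE_NS.search(uri)
            if match:
                packages[match.group(1)] = int(match.group(2))
        else:
            # The first "start" event is the root element; stop here.
            return SBMLInfo(
                int(payload.attrib["level"]),
                int(payload.attrib["version"]),
                packages,
            )
    raise ValueError("no root element found")


example = io.BytesIO(b"""<?xml version="1.0" encoding="UTF-8"?>
<sbml xmlns="http://www.sbml.org/sbml/level3/version1/core"
      xmlns:fbc="http://www.sbml.org/sbml/level3/version1/fbc/version2"
      xmlns:groups="http://www.sbml.org/sbml/level3/version1/groups/version1"
      level="3" version="1" fbc:required="false" groups:required="false">
  <model id="toy"/>
</sbml>""")
info = read_sbml_info(example)
```

Because the function returns as soon as it sees the root element, it never looks at the model body, so it could still report the header of a file whose model cannot be built.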

@Midnighter
Member Author

Some further requirements to take into account:

  1. Some of the meta information might be desirable when writing a model back to SBML. In that case the model is the only logical place where this information can be stored. @matthiaskoenig can give a better picture on this.
  2. Re-parsing is cheap for the version information, which sits right in the header, but in general I think it's not very desirable to go back and parse information again.

@ChristianLieven
Contributor

ChristianLieven commented Mar 1, 2019

  1. Some of the meta information might be desirable when writing a model back to SBML. In that case the model is the only logical place where this information can be stored. @matthiaskoenig can give a better picture on this.

In that case, I'd prefer the dictionary approach (2.) of your original post.

Would parsing it and then storing it in some sort of global variable that is detached from the model object itself be a solution that is less of a pain in the neck and bad design? Something comparable to a Click context?

Such that:

model = read_sbml("path/to/model.xml")

creates the model and also adds an entry to some sort of MODEL_REGISTRY dictionary that exists for this session:

MODEL_REGISTRY[model] = {"meta": information}  # keyed by the model object itself

When any of the cobra.io functions then encode the model as SBML or JSON, they could default to the information that makes sense for that file type, i.e. writing to SBML would retrieve

{'info': 'SBML L3V1, fbc-v2, groups-v1',
 'level': 3,
 'packages': {'fbc': 2, 'groups': 1},
 'version': 1}

but writing to JSON wouldn't use that information.
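
The registry idea could be sketched roughly as follows. `Model` is a stand-in for cobra.Model, `read_sbml` does no real parsing, and the helper names are hypothetical. A `weakref.WeakKeyDictionary` is one possible choice here (an assumption, not part of the proposal) so that entries disappear once a model is garbage collected.

```python
import weakref
from typing import Any, Dict


class Model:
    """Stand-in for cobra.Model."""


# Session-wide registry, detached from the model objects themselves.
MODEL_REGISTRY = weakref.WeakKeyDictionary()


def read_sbml(path: str) -> Model:
    """Placeholder reader that records meta information on the side."""
    model = Model()
    MODEL_REGISTRY[model] = {
        "SBML": {"level": 3, "version": 1, "packages": {"fbc": 2, "groups": 1}}
    }
    return model


def sbml_write_options(model: Model) -> Dict[str, Any]:
    """The SBML writer looks up whatever was recorded for this model."""
    return MODEL_REGISTRY.get(model, {}).get("SBML", {})


def json_write_options(model: Model) -> Dict[str, Any]:
    """The JSON writer ignores the SBML entry entirely."""
    return {}


model = read_sbml("path/to/model.xml")
options = sbml_write_options(model)
```

A model that was never read from SBML simply has no registry entry, so the writer would fall back to its own defaults.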

@matthiaskoenig
Contributor

matthiaskoenig commented Mar 1, 2019 via email

@cdiener
Member

cdiener commented Mar 1, 2019

I would argue that information of this kind belongs to a cobra model since it specifies provenance. I agree that it should not be an SBML-specific attribute though. First, cobra.Model already has an annotations dictionary which is not used for much right now and could get a provenance entry. Alternatively, we could add cobra.Model.provenance, which annotates how that model was obtained. For instance, it could indicate the JSON schema version, the reconstruction method, etc. The SBML writer can then pick which of that info it wants to use to write SBML. This is also in line with what many workflow managers and other large projects (for instance QIIME 2) are doing.
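
As a sketch of the provenance idea: the model carries a format-neutral `provenance` dictionary and each writer picks the entries it understands. The `Model` class, the key names, and `sbml_writer_settings` with its defaults are all assumptions for illustration, not existing cobrapy API.

```python
from typing import Any, Dict


class Model:
    """Stand-in for cobra.Model with a hypothetical provenance attribute."""

    def __init__(self) -> None:
        self.provenance: Dict[str, Any] = {}


def sbml_writer_settings(
    model: Model, default_level: int = 3, default_version: int = 1
) -> Dict[str, Any]:
    """Pick the SBML-related provenance, falling back to defaults."""
    sbml = model.provenance.get("SBML", {})
    return {
        "level": sbml.get("level", default_level),
        "version": sbml.get("version", default_version),
        "packages": dict(sbml.get("packages", {})),
    }


from_sbml = Model()
from_sbml.provenance["SBML"] = {"level": 2, "version": 4, "packages": {}}

from_json = Model()
from_json.provenance["JSON"] = {"schema_version": "1"}

round_trip = sbml_writer_settings(from_sbml)
fallback = sbml_writer_settings(from_json)
```

A model loaded from JSON has no SBML entry, so the writer uses its defaults, while a model read from SBML can be written back at the level and version it came from.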

@Midnighter
Member Author

Could you link to an example or documentation that shows this for Qiime 2? I don't know it but your reasoning sounds convincing to me.

@cdiener
Member

cdiener commented Mar 2, 2019

There is some argumentation in https://docs.qiime2.org/2019.1/concepts/?highlight=provenance#data-files-qiime-2-artifacts, namely

Artifacts enable QIIME 2 to track, in addition to the data itself, the provenance of how the data came to be. With an artifact’s provenance, you can trace back to all previous analyses that were run to produce the artifact, including the input data used at each step. This automatic, integrated, and decentralized provenance tracking of data enables a researcher to archive artifacts, or for example, send an artifact to a collaborator, with the ability to understand exactly how the artifact was created. This enables replicability and reproducibility of analyses, as well as generation of diagrams and text that can be used in the methods section of a paper. Provenance also supports and encourages the proper attribution to underlying tools (e.g. FastTree to build a phylogenetic tree) used to generate the artifact.

Most of QIIME 2 still works via the command line, but you can look at an example of provenance in the web visualization (click on the provenance tab at the top).

@matthiaskoenig matthiaskoenig changed the title Store SBML information Store SBML meta information (level, version, packages, provenance) Mar 5, 2019
@matthiaskoenig matthiaskoenig self-assigned this Mar 18, 2019
@akaviaLab akaviaLab mentioned this issue Jun 19, 2022
@cdiener
Member

cdiener commented Nov 4, 2022

Now tracked in #1237 and available as part of the history.

@cdiener cdiener closed this as completed Nov 4, 2022