Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement a new class ParameterStorage to let classes like BaseStation inherit from #801

Merged
merged 11 commits into from
Jan 27, 2025

Conversation

fschlueter
Copy link
Collaborator

@fschlueter fschlueter commented Jan 22, 2025

Okay, this PR does more than I initially intended. It implements a new base class ParameterStorage to reduce a lot of Boilerplate code. However, this new class brings also an additional big benefit: If we want to add additional parameters to a class (for example ARIANNAParameters) we can do so now for many experiments (SKA, RNO-G, ...) without exploding code!

This PR is not yet finished, I first want to see if you like the idea (@sjoerd-bouma, @cg-laser) before investing more time. Things to do:

  • Apply it to all classes which use parameter storages
  • Adjust the serialize and deserialize functions
  • Create deprecation warnings for the get_ARIANNA_parameter() interface

…example of the base_station. Move decorator to new class
@fschlueter fschlueter changed the title Add a decorator which checks the key enum typ for the parameter stora… Implement a new class ParameterStorage to let classes like BaseStation inherit from Jan 22, 2025
@fschlueter
Copy link
Collaborator Author

Or, with the new class it would also be easy to allow keys from two (or more) different parameter types to be added to one parameter storage. I made a quick test that this might work (although we have to check if that still works serializing/deserializing

from NuRadioReco.framework.parameters import stationParameters as stp, channelParameters as chp

test_dict = {stp.nu_zenith: 1, chp.zenith: 2}. # both enums have the numerical value 1
print(test_dict)

@fschlueter
Copy link
Collaborator Author

Sjoerd made the name of this branch really meaningless :D

@sjoerd-bouma
Copy link
Collaborator

Other comment - I'm wondering if the module parameter_storage should be private as we probably don't want users to initialize this class. (I'm not sure if that would mess up the documentation as classes generally also link to their parent class). The thinking being that with the large amount of modules/classes already in NuRadioMC we should try to avoid exposing potentially confusing 'internal' classes.

self._parameters.pop(key, None)

def get_parameters(self):
return self._parameters
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like this is indeed how it was implemented previously, but this looks dangerous to me - shouldn't we return a deepcopy? I think it makes more sense for this to be read-only, but let me know if you disagree.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the conclusion is to leave it as is for now?

@fschlueter
Copy link
Collaborator Author

Okay, lets see if the CI is happy with me. I have implemented the new parameter storage into all necessary classes. And I am now allowing parameters from several parameter types to be stored in one parameter storage. This will allow to add more experiment specific data into these parameter storages without duplicating more and more code. This means also that the parameter type is now also stored in the files. An example for this is the Event class which stores now eventParameters and generatorAttributes in the same parameter storage.

@fschlueter
Copy link
Collaborator Author

@cg-laser can you approve this? This would in come handy to continue with the RNO-G data processing branch

Copy link
Collaborator

@sjoerd-bouma sjoerd-bouma left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just left some minor comments/questions, but I won't be able to approve before Monday, and anyway I don't think I'm impartial on this (I'm a big fan!), so maybe it's worth someone else (@cg-laser or @anelles) having a brief look at this as well.

data = {
"_parameters": parameters,
"_parameter_covariances": parameter_covariances,
"_parameter_types": [parameter_type.__name__ for parameter_type in self._parameter_types]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are you storing which types of parameters are accepted by a class? This is already defined in the init (I actually think it should be defined before the init, but this is nitpicky) of each class, and I think this should have precedence. But maybe I'm just missing something.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well with the PR, we are adding the possibility to store parameters of different types in the same storage (dict). For one class this is currently done in the init function (event with eventParameters and generatorAttributes). However, we allow to add more possible types via a function which should be set in modules/user code. The idea is that for example in the mattak/RNO-G data reader we would do something like station.add_paramter_type(stationParameterRNOG). For those additional parameter types defined at run time we need to store them.

Comment on lines +79 to +83
if 'parameters' in data:
data['_parameters'] = data['parameters']

if 'parameter_covariances' in data:
data['_parameter_covariances'] = data['parameter_covariances']
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason why you are now using _parameters to store the parameters, instead of parameters as previously? Would just changing it to parameters mean you could get rid of this backwards-compatibility check?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The first class I touched used _paramters I thought all did but realised only later that several (maybe more) used parameter. So we need this anyway...

@@ -1,26 +1,20 @@
from __future__ import absolute_import, division, print_function
import NuRadioReco.framework.parameters as parameters
import NuRadioReco.framework.parameter_serialization
from NuRadioReco.framework.parameters import particleParameters as paP
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not a huge fan of obscure abbreviations of classes anyway, and paP looks pretty ugly to me... but I'm aware this is not the only place in the code where this occurs, and particleParameters.vertex_times is indeed very long. Won't reject the PR over this but I don't like it ;)

logger = logging.getLogger('NuRadioReco.framework.parameter_storage')


class ParameterStorage:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still curious if it's worth making this module private, because this module should only be used internally, so this somewhat 'clutters' the documentation. I can test this on Monday.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is a good idea but maybe for another PR?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You cannot change it here?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have already merge this branch into other branches so I would really like doing this in another PR.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not a fan of merging public API and then making it private after, but I guess we can live with a lower standard of backwards-compatibility on develop.

self._parameters.pop(key, None)

def get_parameters(self):
return self._parameters
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the conclusion is to leave it as is for now?

Copy link
Collaborator

@cg-laser cg-laser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great Felix.

The only downside I can see is that you will break usercode that used get_ARIANNA_parameter or get_generator_attributes. However this will be an easy fix.
We could think about just adding a deprecation warning instead of fully removing the functions, but as the userbase is still smallish, I'm ok with just deleting the functions.
But we should make this clear in the changelog.

Copy link
Collaborator

@anelles anelles left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, very nice change and it used to be cumbersome indeed.

logger = logging.getLogger('NuRadioReco.framework.parameter_storage')


class ParameterStorage:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You cannot change it here?

@fschlueter
Copy link
Collaborator Author

This is great Felix.

The only downside I can see is that you will break usercode that used get_ARIANNA_parameter or get_generator_attributes. However this will be an easy fix. We could think about just adding a deprecation warning instead of fully removing the functions, but as the userbase is still smallish, I'm ok with just deleting the functions. But we should make this clear in the changelog.

@cg-laser, I have added deprecation warnings in both cases. In case of get_generator_attributes it will issue a warning an than return the return self.get_parameter(key). In case of get_ARIANNA_parameter just an error is raised saying one should use get_parameter

@fschlueter fschlueter merged commit dc94118 into develop Jan 27, 2025
5 checks passed
Comment on lines +60 to +61
def get_parameters(self):
return copy.copy(self._parameters)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would have thought you need a deepcopy (e.g. if the parameter value of one of the parameters is a numpy array, or a dict, or...)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes I agree a deepcopy is probably what you want. I will open a new PR

@fschlueter
Copy link
Collaborator Author

In PR #807 I am making the class private and fix the copy statement

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants