Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Explicitly populate feature dictionaries with all available features. #1019

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

eleftherioszisis
Copy link
Contributor

This is a PoC of registering all features that will be available by features.get beforehand, instead of implicitly delegating a neurite, morphology or population or collections thereof, to the respective feature when get is called.

The _NEURITE_FEATURES, _MORPHOLOGY_FEATURES, _POPULATION_FEATURES dictionaries are updated following registration so that features from downstream categories (e.g. neurite features) are transformed and registered to upstream categories (e.g. morphology & population feature dictionaries).

If a feature is already registered in upstream categories, because it is defined in the respective module, it is not overwritten and the module definition is used instead.

This PoC allows to define reducible features, i.e. features that can be used by higher order objects by applying them to their components (e.g. a neurite feature can be applied to the neurites of a morphology or population). If a feature is not reducible it should be defined in each module (neurite.py, morphology.py, population.py) so the special logic is used instead.

The advantage of this change is that the available features become explicit and populate the respective feature dictionaries. In this context, features.get only needs to check what type of object is passed and get the respective function from the dictionaries without any special logic.

@eleftherioszisis
Copy link
Contributor Author

See all available features as shown in the features.get help using this change: features.txt

@eleftherioszisis
Copy link
Contributor Author

@adrien-berchet , @lidakanari , @arnaudon , please feel free to take a look if you have the time. I am not adding you as reviewers yet because it's still a PoC that needs to be discussed.

@adrien-berchet
Copy link
Member

It looks nice!
I had a quick look and I am just wondering what happens in the current version for generators. Because with this PR, as far as I can see generators would not be caught by collections.abc.Sequence so it would fail, and I am wondering if it would be a breaking change.

@eleftherioszisis
Copy link
Contributor Author

It looks nice! I had a quick look and I am just wondering what happens in the current version for generators. Because with this PR, as far as I can see generators would not be caught by collections.abc.Sequence so it would fail, and I am wondering if it would be a breaking change.

Thanks for checking it out! In the master's implementation, generators are not supported either:

is_obj_list = isinstance(obj, (list, tuple))
if not isinstance(obj, (Neurite, Morphology, Population)) and not is_obj_list:
raise NeuroMError('Only Neurite, Morphology, Population or list, tuple of Neurite,'
' Morphology can be used for feature calculation')

Thus, I have not introduced a new behavior in this implementation.

Supporting generators is an interesting topic. Not sure if practical or if there are use cases that require them, but I believe with a bit of extra logic you could check the first element and return a chain of the first element with the rest of the generator to pass into the feature functions.

@adrien-berchet
Copy link
Member

Ok!
I wondered about generators because of the Population objects since their __iter__ method returns a generator. So I think we could have the following use case: we iterate over a population using the generator, we change something in the morphology and yield this new morphology on which we want to compute the feature. In this case we would need to support generators. But it's quite specific.

@eleftherioszisis
Copy link
Contributor Author

Ok! I wondered about generators because of the Population objects since their __iter__ method returns a generator. So I think we could have the following use case: we iterate over a population using the generator, we change something in the morphology and yield this new morphology on which we want to compute the feature. In this case we would need to support generators. But it's quite specific.

Certainly, we could address this in a separate issue.

Copy link
Contributor

@GianlucaFicarelli GianlucaFicarelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With this PR the complexity is moved from _get_feature_value_and_func to _transform_downstream_features_to_upstream_feature_categories but overall it seems cleaner to me.
I'd like to completely remove the if isinstance chain, but it seems not possible if we need to keep the same API accepting a generic object.


from neurom.core import Population, Morphology, Neurite
from neurom.core.morphology import iter_neurites
from neurom.core.types import NeuriteType, tree_type_checker as is_type
from neurom.utils import flatten
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

flatten and wraps are unused imports


def _flatten_feature(feature_value, feature_shape):
"""Flattens feature values. Applies for population features for backward compatibility."""
return feature_value if feature_shape == () else reduce(operator.concat, feature_value, [])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comes from the previous code, but wouldn't flatten/itertools.chain work?


def _get_neurites_feature_value(feature_, obj, kwargs):
"""Collects neurite feature values appropriately to feature's shape."""
kwargs = deepcopy(kwargs)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the code can be simplified if the signature is changed to:

def _get_neurites_feature_value(feature_, obj, /, neurite_type=NeuriteType.all, **kwargs):

# Update the feature dictionaries so that features from lower categories are transformed and usable
# by upstream categories. For example, a neurite feature will be added to morphology and population
# feature dictionaries, transformed so that it works with the respective objects.
_transform_downstream_features_to_upstream_feature_categories(_FEATURE_CATEGORIES)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe you can try to move most of the content of this file to a different file, keeping here only the needed imports?

feature_(n, **kwargs) for n in iter_neurites(obj, filt=is_type(neurite_type))
)

return reduce(operator.add, per_neurite_values, 0 if feature_.shape == () else [])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems clearer to me something like that (and in general more performant):

Suggested change
return reduce(operator.add, per_neurite_values, 0 if feature_.shape == () else [])
func = sum if feature_.shape == () else flatten_to_list
return func(per_neurite_values)

where flatten_to_list(x) is list(flatten(x))

@adrien-berchet adrien-berchet added this to the v4 milestone Feb 22, 2023
@eleftherioszisis eleftherioszisis removed this from the v4 milestone May 2, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants