Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature/rule optimiser #6

Open
wants to merge 20 commits into
base: main
Choose a base branch
from

Conversation

MartiPuigV
Copy link
Contributor

Optimises rules for miniscanner / datascanner entered by the user. Removes redundancies such and cpr AND cpr.

Comment on lines +224 to +238
def list_equality_no_order(arr1, arr2):
"""In python [1, 2, 3] == [3, 2, 1] equals False.
In our case, we just want to check if two containers
have the same components. This function detects any
elements that do not occur in both. If none are found,
they are identical."""
base = arr1
for k in arr2:
try:
base.remove(k)
except ValueError:
# Can't remove because it isn't there
return False
# Return not (is there anything left in base). True if empty else False
return not base
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You work around this by doing .copy() on the input argument in the one place you call this function, but an equality comparison that destroys one of its operands is a Bad Surprise.

(The logic here is also unnecessarily complicated -- you could just do sorted(arr1) == sorted(arr2).)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, sorting the list doesn't work, since it contains (at least can contain) CustomContainer classes, that can't be sorted. Hence my code. .copy() works though.

self.reformat_rule(OUT_PATH)


class CustomContainer():
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are you doing your own JSON parsing when the os2datascanner.engine2.rules package exists? 🤔

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didnt know about the rules package when doing this. Also, did you mention the wrong piece of code or am i missing something in the 4 quoted lines ?

src/os2datascanner/utils/optimiser/optimiser.py Outdated Show resolved Hide resolved
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file is an output file from your tool and doesn't belong in the source hierarchy. (This is also an example of the risks of having a hard-coded output path in the source hierarchy!)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, i wrote a comment about having this in a tmp directory and removing it once done. Is there a place for temporaryu resources, or do i just go to the resources folder and remove it after ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another file that shouldn't be here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same question as prior

return True
return False

def optimisation_cycle(self):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This optimisation pass looks vaguely plausible, but you haven't implemented any unit tests for it, so it's hard to see what it actually does...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe i don't see the same quoted code as you do or you made a mistake, or do you mean in general ?

Comment on lines +83 to +93
def check_useless(self, container:'CustomContainer'):
# Useless i.e. AND(1), OR(1), etc ...
if len(container.components) == 1:
try:
container.parent.components = arr_switch(container.parent.components, container, container.components[0])
self.containers.remove(container)
self.found_redundancy = True
except ValueError:
pass
return True
return False
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The os2datascanner.engine2.rules package already handles this case. If you try to make a CompoundRule with a single component, it automatically decays to that component:

    @classmethod
    @abstractmethod
    def make(cls, *components):
        """Creates a new Rule that represents the combination of all of the
        given components. The result does not need to be an instance of @cls:
        it could be a completely different kind of Rule, or a simple boolean
        value, if this rule can already be reduced to one.

        Subclasses must override this method, but should call up to this
        implementation."""
        if len(components) == 0:
            raise ValueError("CompoundRule with zero components")
        elif len(components) == 1:
            return components[0]
        else:
            return cls(*components)

That is, we already implicitly do this optimisation when we execute the rule.

(That's not to say necessarily that you shouldn't do it elsewhere too, but what if this was deliberate? What if you define a rule with an AndRule to leave yourself a hook to add other things later?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your point here is that this test is useless, because the rule_builder does it before compiling it and sending to be checked at the scanner ? If so i can just delete my check ?

@MartiPuigV
Copy link
Contributor Author

It seems that the optimiser only works with JSON one-liners ...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants