Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New OpenMensa model classes and use of alternative website for Aachen #70

Closed
wants to merge 60 commits into from

Conversation

j-maas
Copy link
Collaborator

@j-maas j-maas commented Feb 6, 2018

This PR unfortunately does two things. If you want to, I can spend the time to separate it into two distinct PRs.

OpenMensa Model

The major change concerns the addition of new model classes. The are found in the root in the openmensa_model module. They contain classes for a Canteen, Day (and DayClosed), Category, Meal, PriceWithRoles and Role, which each handle their own XML generation.

This allowed me to break down the Aachener parser into more concise functions, that each return a model object for e. g. a Meal and then aggregate them into a Category, then a Day etc.

I additionally created a model just for the Aachener parser, because the Aachener menu differs slightly form the OpenMensa model. E. g., the price is fixed for a category, not per meal. But it is straight forward to convert this Aachener model into the OpenMensa variant in the end.

I believe that this approach to generating the XML is easier to maintain than the centralized XML generator in PyOpenMensa's BaseBuilder, since it breaks down the XML into the data classes. It also helps structure the parsers' code, since it doesn't require them to input all the meals information in one single line, but encourages splitting up the parsing to the individual components.

Aachener Alternative Website

Instead of the list menu, I modified Aachen to parse from the table menu. The table is more stable, has less glitches like duplicate entries and weird formatting.

However, currently the table for the Academica does not include certain dishes, namely "Express", "Pizza Classics" and "Ofenkartoffel". This might break for some users, and thus warrants a discussion.
I've already asked the canteen's administration per mail for help with the online menu's inconsistencies, but have not received a reply, yet.

Y0hy0h added 30 commits January 14, 2018 11:23
The regression tests that were in the package are now factored out of it.
Copy link
Collaborator

@klemens klemens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not a complete review, just some thoughts on the model. I haven't looked at the parser.

I am not yet sure if I like the proposed model. But while I think it is good that the new model requires the explicit use of the various helpers from pyopenmensa.feed, it will currently happily generate an invalid feed.

Also this code should probably live in pyopenmensa instead.

This PR unfortunately does two things. If you want to, I can spend the time to separate it into two distinct PRs.

While that would be nice, the bigger problem is that you based your changes on your other pull request. You should rebase the changes onto master, using something like: git rebase --onto mswart/master y0hy0h/regression-test (untested)

.gitignore Outdated
@@ -1,3 +1,9 @@
__pycache__
build

# Python Virtual Environment
venv/
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can also add your ignores to .git/info/exclude.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, learned something very interesting!

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just tried it out, and of course it works perfectly. But I'm thinking about whether it is generally a good idea to just put them in the public .gitignore. Because there is a (slight) chance that others have similar setups and then they don't forget to add it to their .git/info/exclude and it won't get into the repo.

But if you'd prefer me to it in .git/info/exclude in this case, just tell me. ;) What about the .idea/ in there as well?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't use a virtual environment for this project, but if I would I wouldn't name it venv. 😉

I think only build artifacts and maybe config files that everyone has to use should be in .gitignore. Everything else should be added locally.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that sounds fair. I will do that!

return '<{}: {}>'.format(self.__class__.__name__, self.__dict__)

def __eq__(self, other):
return isinstance(other, self.__class__) and self.__dict__ == other.__dict__
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are these function actually needed (in all classes)? You don't seem to store these objects in maps or sets.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do store the Categorys in a Counter, which needs to know when two Categorys and consquently Meals are equal. The __repr__ was for debugging purposes.

I would have liked to use data classes, but I haven't found a way to use them without Pyhotn 3.7.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But this is specific for your model. You even use your own Aachen.Category, so this could be added there.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While I agree that it is possible to leave it out, do those functions do harm in any way? I'd even argue it's better to implement rich comparisons and representations for custom classes.

I'm honestly just not really understanding why you want to remove them. Please tell me what you think about this!

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think they cause any harm. They just felt redundant and distracting while I read through the model. If you want to keep them, I am also fine with that.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I totally understand, now. :) But except data classes in Python 3.7 there is no elegant solution, I fear...

I'd like the model to be easily usable by anyone, and those methods help with that, in my opinion.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They are not class specific and similar in most classes. So it might be better to implement it in a shared meta class, parent class or a decorator.

return isinstance(other, self.__class__) and self.__dict__ == other.__dict__


class DayClosed:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it really make sense to differentiate between a day without any meals and a closed day?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The XML has this distinction, so I did the same. I was admittedly struggling with how to model a closed day, but I prefer this explicit object over a flag. But I'm open to suggestions!

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, but could DayClosed (ClosedDay might be more readable) be a subtype of Day that eg raises on append (or does nothing)?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm finding it very difficult in Python to construct a clear hierarchy. If we let DayClosed inherit from Day, we will have to override the methods as you said. Both having it inherit and not feel not completely right to me.

My argument for separating them and not using inheritance is that I want to make their difference very explicit. (And implementation inheritance is bad.)

The more I think about your name suggestion, the more I agree. :D I will change the name to ClosedDay.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I thought about it again, and you are right, there isn't really any advantage.

I usually use statically types languages, so I automatically "think in types", but the python list doesn't really care which type it's elements have, so I guess this is fine.

price.text = price_format.format(role_price / 100)
else:
price = ET.SubElement(meal_element, 'price')
price.text = price_format.format(self.price / 100)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generating prices should be part of the Price class and in it's own method, so it can be easily overwritten.

if isinstance(self.price, PriceWithRoles):
for role in sorted(self.price.roles):
price = ET.SubElement(meal_element, 'price', {'role': role.name})
role_price = self.price.default + role.priceSupplement
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't it make more sense to just specify a fixed price for every role instead of splitting into base and surcharge (which are actually the correct termini). Splitting could then be implemented by extending the Price class (see next comment).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. Letting the price be either an int or a dict of Prices that support formatting their own XML is a good way of handling this.

@j-maas
Copy link
Collaborator Author

j-maas commented Feb 7, 2018

@klemens What is an example of an invalid feed that the model will allows? I don't understand exactly what you mean.

I based this on the regression test branch because I really need such a test to make sure I don't break anything. The test has saved me from a lot of bugs already. Before merging this PR, it does make sense to merge #63. In case we decide not to merge the other one, I will rebase this so that it doesn't include or reverts those commits.

@klemens
Copy link
Collaborator

klemens commented Feb 7, 2018

What is an example of an invalid feed that the model will allows?

Many of the elements and attributes have various constraints, eg no empty strings allowed, role must be one of pupil, student, employee, or other, etc.

See #40 where mswart said My goal would be to adjust PyOpenMensa to prevent all invalid XML generation.

@j-maas
Copy link
Collaborator Author

j-maas commented Feb 7, 2018

Yes, I totally forgot to do such checks! I think this kind of model lends itself quite easily to do such checks. I might implement them, depending on how the model is received.

About PyOpenMensa: I looked and I could only find an incomplete API2 model. That's why I bothered creating this at all. It is meant to be used instead of the PyOpenMensa BaseBuilder and (at least conceptually) to be part of PyOpenMensa.
But PyOpenMensa is so tightly coupled to this repo that I feel it makes sense to just put the model here, since PyOpenMensa sits inside here as a submodule anyway. But I'm totally fine with moving the model to PyOpenMensa.

Maybe we can take this example as a starting point to discuss whether it makes sense to integrate PyOpenMensa into this repo.

@klemens
Copy link
Collaborator

klemens commented Feb 7, 2018

I might implement them, depending on how the model is received.

Sounds reasonable!

I could only find an incomplete API2 model

It seems this is intended for querying the openmensa.org api. I haven't actually looked at the api, so I am not sure if we could use the same model both for creating and querying menus, but that certainly would be cool!

…whether it makes sense to integrate PyOpenMensa into this repo.

I think keeping them separate is a good idea, as one might use pyopenmensa for querying or writing a parser that is not part of this repo (e.g. because it is closed source because of an internal api).
We could also keep the api here for fast iteration and later move it to pyopenmensa.

I hope to find some time this weekend to take a closer look at the new parser.

@mswart
Copy link
Owner

mswart commented Feb 7, 2018

I am open to represent the feed with explicit domain models. In some cases this might simplify feed generation and allows simpler adjustments. The current Builder interface has been extended for the years and it is probably reasonable to split its feature and work more with composition instead of inheritance.
But for many parsers a simple interface should also be available. Only calling one central addMeal methods is good enough! LazyBuilder takes care of quite a few integration and validation. No need to require to deal with a whole bunch of objects.
I propose rewriting the Builder-classes to use the model classes under the hood (or the other way around) - no need to implement two ways of generate and validate feed data.

I still thinking about the class structure itself. The proposed change reimplements most of the models in aachen.model. Especially the convert_to_openmensa_model methods looks straight to my.
Maybe a framework should support an processor interface: define classes than convert the user input like prices or notes to the wanted base format. Even the helper functions from pyopenmensa.feed might have a better implementation as class.
The LegendProcessor be constructed with the legend data and its instance is later one used to process meal names and produce notes.
At the moment it is only a vague idea but I will think about it in the next days.

Also this code should probably live in pyopenmensa instead.

Yes, the model classes should be moved to pyopenmensa. Even some classes of utils like Parser and Source are generic and should be moved.

I could only find an incomplete API2 model

It seems this is intended for querying the openmensa.org api. I haven't actually looked at the api, so I am not sure if we could use the same model both for creating and querying menus, but that certainly would be cool!

Yes, some years ago I started working on an API library integration. I think it is still a good idea! But for now I will not continue this project.
If an integration is reasonable we could make it. But without many thoughts I am against it. Generating parser feeds and receiving canteen data are very different use cases. A combination might be possible now, but on the long hand it would probably produce conflicts.
They could share a common base system but should at least provide different API entry points: like feed.Canteen and api.Canteen.
The API integration is unfinished and to my knowledge unused, so lets not invest too much work.

…whether it makes sense to integrate PyOpenMensa into this repo.

I think keeping them separate is a good idea, as one might use pyopenmensa for querying or writing a parser that is not part of this repo (e.g. because it is closed source because of an internal api).
We could also keep the api here for fast iteration and later move it to pyopenmensa.

PyOpenMensa and openmensa-parsers are very different projects and should stay that way.
PyOpenMensa is a mostly stable library with sufficient test integration and official releases. It is used by openmensa-parsers but also otherwise developed and hosted parsers.
openmensa-parsers is more adapting bunch of parsers covered with basic integration testing to be continues deployed to one specific installation.

@j-maas
Copy link
Collaborator Author

j-maas commented Feb 7, 2018

I forgot that there are other parsers than the one in this repo. 😅 You're both right that separating this out makes sense.

It seems this is intended for querying the openmensa.org api.

I didn't really understand what it did, and apperently jumped to conclusions about it. But this PR's kind of model is still useful, I guess.

I still thinking about the class structure itself.

Me, too. Ideally, I would like to have one model as a "single source of truth/authority". Then we could use alternative models that can be converted to this one, as I tried with the Aachener parser.

I'm still not happy with how that alternative model turned out, though. Maybe we can figure out something elegant.

PR Separation

I think it makes sense to separate this PR into one about the model and one about the Aachener website change.

I might be able to tackle this tomorrow.

This was referenced Feb 8, 2018
@j-maas
Copy link
Collaborator Author

j-maas commented Feb 8, 2018

Split into #71 and #72.

@j-maas j-maas closed this Feb 8, 2018
@j-maas j-maas deleted the alternate-website branch March 2, 2018 22:12
@j-maas j-maas restored the alternate-website branch March 2, 2018 22:12
@j-maas j-maas deleted the alternate-website branch March 2, 2018 22:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants