
eGo Performance #124

Open
maltesc opened this issue Sep 5, 2018 · 12 comments

maltesc (Contributor) commented Sep 5, 2018

The recently performed eGo runs show that eGo has a performance problem. Especially the simulation of some MV grids is extremely time-consuming and delays the overall calculation.

Therefore ideas are welcome in order to improve eGo's performance...

maltesc (Contributor, Author) commented Sep 5, 2018

Currently I am working on improving the logging messages in order to (at least) give the user a better overview of where the calculation is stuck...
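A minimal sketch of what such per-grid log messages could look like. The logger name, message format, and `run_mv_grid` wrapper are illustrative assumptions, not eGo's actual implementation:

```python
import logging
import time

# Hypothetical logger setup; eGo's actual logging configuration may differ.
logger = logging.getLogger("ego.mv_grids")
logging.basicConfig(level=logging.INFO,
                    format="%(asctime)s %(name)s %(levelname)s: %(message)s")

def run_mv_grid(grid_id, calc_func):
    """Wrap a grid calculation so the log shows start, end, and duration."""
    logger.info("Grid %s: calculation started", grid_id)
    start = time.monotonic()
    result = calc_func(grid_id)
    logger.info("Grid %s: calculation finished after %.1f s",
                grid_id, time.monotonic() - start)
    return result
```

With start/end markers like these, it becomes possible to see from the log alone which grid a stalled run is stuck on and how long each grid took.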

ulfmueller (Member) commented Sep 5, 2018

Do you see a possibility to exclude certain long-running, problematic MV grids during the run (after a threshold of calculation hours) and create an overall result from the remaining MV grids? Or would a new MV clustering be necessary to get the correct weighting for the reduced number of MV grids considered?

maltesc (Contributor, Author) commented Sep 5, 2018

> Do you see a possibility to exclude certain long-running, problematic MV grids during the run (after a threshold of calculation hours) and create an overall result from the remaining MV grids?

I think this is possible. For understanding: in case one or more MV grids fail to calculate during an eGo run, the overall results are extrapolated. The extrapolation takes the weighting of the grids into account. This means that if a grid with a very low weighting fails to calculate, it has only very little effect on the final results.

Based on this idea, I would say it is OK to define maximum calculation times per MV grid. However, instead of defining fixed values, I suggest determining a maximum calculation time per grid based on the grid weightings. In my opinion, a grid with a small weighting should get a smaller maximum calculation time than grids with many representatives.
I assume that the most time-consuming grids are rare grids with very few representatives (@ulfmueller can you check this?). In that case they have little effect on the final results and could be skipped without any problems...

maltesc (Contributor, Author) commented Sep 5, 2018

What could be a reasonable maximum calculation time per grid? Or rather, a reasonable calculation time per cluster percentage, in case you agree with the suggested method?

> I suggest determining a maximum calculation time per grid based on the grid weightings.

A value could be something like 100 % cluster percentage in 24 hours; the time is then allocated among the clusters accordingly...
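This allocation could be sketched roughly as follows. The function name, the 24-hour default, and the example weightings are assumptions for illustration, not values from the actual eGo code:

```python
def allocate_calc_times(weightings, total_budget_h=24.0):
    """Split a total calculation-time budget (in hours) among MV grids in
    proportion to their cluster weightings (number of represented grids)."""
    total_weight = sum(weightings.values())
    return {grid: total_budget_h * w / total_weight
            for grid, w in weightings.items()}

# Hypothetical example: three representative grids weighted 160/64/16,
# together covering 100 % of the clustered grids.
budgets = allocate_calc_times({"grid_a": 160, "grid_b": 64, "grid_c": 16})
# grid_c represents only 16 of 240 grids, so it gets 24 * 16/240 = 1.6 hours.
```

A grid with a low weighting thus gets cut off early, which matches the reasoning above that skipping it hardly changes the extrapolated overall result.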

ulfmueller (Member) commented

Difficult question. It also depends on the problem size, i.e. the size of the given eTraGo problem. I would leave this decision to the user and write max_calc_time_per_grid into the settings.

ulfmueller (Member) commented Sep 5, 2018

> I assume that the most time-consuming grids are rare grids with very few representatives (@ulfmueller can you check this?)

I can try. But only for a sample...

ulfmueller (Member) commented

Your assumption seems to be valid. At least for grid 2583, which took over 30 hours in one of my calculations, it was representing only 16 grids. For the others I have trouble saying, because the logs do not really tell me when which grid started and ended its calculation.

ulfmueller (Member) commented

Thinking of it, I really like this feature. When do you think you can have this ready?

maltesc (Contributor, Author) commented Sep 5, 2018

I think I can do this by tomorrow (Thursday) at 12:00.
The question remains whether the time should be defined per grid (assuming that every grid is equally important) or rather per cluster_percentage (in order to give more time to more important grids)...

ulfmueller (Member) commented

Difficult question. Both are interesting. If it is defined relatively, you might lose a grid because its low weighting gives it a ridiculously low calculation time, while other grids are anyhow still running, so you have a CPU available and could still give this less important grid some time. On the other hand, a long-running grid could occupy a CPU for a long time without having much effect. Would it be possible to start with the grids which are most important and give the less important ones more time only if important ones are still running anyway? Maybe too complicated...

ulfmueller (Member) commented

Sorry for not giving you a clear answer. Choose the one you like most (or do both). I can find pros and cons for both approaches...

maltesc (Contributor, Author) commented Sep 6, 2018

Unfortunately, I didn't manage to implement exactly the desired functionality. It seems to be difficult to add further sub-processes (that could be equipped with an individual timeout) while using SQLAlchemy sessions. However, I think I found a reasonable solution:

First, grids are now processed in sorted order of descending importance. Thus, the most representative grids (highest cluster weighting) are processed first.

Furthermore, the scenario_settings now contain a parameter called "max_calc_time". With this parameter the total calculation time of all MV grids can be limited. Actually, in the current version, the time only starts "counting" as soon as no more grids are waiting in the queue to be processed. It thereby targets the very last (and least important) grids in order to reduce their available time. However, I can also change this to be the total calculation time...
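A rough sketch of the combined behavior, sorting grids by descending weighting and cutting off the least important tail once a time budget runs out. This is a simplification (here the budget counts from the start of processing, whereas the version described above only starts counting once the queue is empty), and the function and parameter names are illustrative, not eGo's actual API:

```python
import time

def process_grids(grids, weightings, calc_func, max_calc_time_h):
    """Process MV grids in order of descending cluster weighting and skip
    the remaining (least important) grids once the budget is exhausted."""
    order = sorted(grids, key=lambda g: weightings[g], reverse=True)
    deadline = time.monotonic() + max_calc_time_h * 3600
    results, skipped = {}, []
    for grid in order:
        if time.monotonic() > deadline:
            skipped.append(grid)  # skipped grids are covered by extrapolation
            continue
        results[grid] = calc_func(grid)
    return results, skipped
```

Because the list is sorted by weighting, any grids that hit the cutoff are exactly the ones whose omission affects the extrapolated overall result the least.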

Furthermore, eGo now writes status files containing an overview of the running calculation. This should make it easier to trace a running calculation.
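One simple way such a status file could look; the file name, field names, and example grid IDs here are assumptions, not the actual eGo status format:

```python
import json
from datetime import datetime, timezone

def write_status_file(path, done, running, queued):
    """Dump a snapshot of the running calculation to a JSON status file."""
    status = {
        "updated": datetime.now(timezone.utc).isoformat(),
        "grids_done": sorted(done),
        "grids_running": sorted(running),
        "grids_queued": sorted(queued),
    }
    with open(path, "w") as f:
        json.dump(status, f, indent=2)

# Hypothetical snapshot: grid 2583 (the slow one mentioned above) still running.
write_status_file("ego_status.json", done=[1811], running=[2583], queued=[176, 1729])
```

Re-writing such a file after every grid finishes would let the user check progress at any time without digging through the full log.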
