Ostrich process hangs for larger calibrations #219
Comments
Are you using
This should be better documented in our docs.
Hmmm, I thought I did in all my notebooks; turns out this one did not. I'll relaunch with a larger size and keep you posted. Thanks!
I just get this error in the notebook when I try to run the process with progress=True: The save operation succeeded, but the notebook does not appear to be valid. The validation error was: Notebook validation failed: {'version_major': 2, 'version_minor': 0, 'model_id': '817877db674542ad8586f18de20711ba'} is not valid under any of the given schemas:
After running the code on the Ouranos JupyterLab instance (on pavics.ouranos.ca/jupyter) with progress=True, OSTRICH does the same thing: it does not crash, but it has been running for over 12 hours with still no response, whereas I expected the code to take ~1 h 20 min.
Ok. Could you confirm that Raven works when you run it on your own machine (from the terminal, with no Python wrapper or WPS server)?
Hi @huard. I was testing Richard's setup on my machine. For a 4747-day simulation, OSTRICH with different budgets took: budget of 100 iterations: 10 s. I will contact Shawn Matott (the developer of Ostrich) to see if he has an idea why the runtime is not linear.
OK, glad to see I'm not going crazy. Thanks for the info!
Yeah, I'm sorry about that. I normally don't use such large budgets and never realized. I am guessing that it has something to do with the increased memory allocation Ostrich needs to hold all the statistics etc. of the previous runs, but let's see what Shawn says. Just sent out the email with the runtime stats and the example setup. :)
Follow-up: It would seem that the process hanging also affects other birds that demand long run times. |
There is a known issue with PyWPS queue management. I'm hoping to make some progress on this front over the next months. |
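For anyone hitting this while that fix is pending, the relevant knobs live in the PyWPS server configuration (pywps.cfg). A minimal sketch, assuming PyWPS 4; the option names come from the PyWPS configuration documentation, and the values below are illustrative only, not a recommended setting:

```ini
[server]
# Maximum number of requests PyWPS will accept into its queue.
maxprocesses = 10
# Number of asynchronous processes allowed to run concurrently.
# Requests beyond this limit wait in the queue, which is one way a
# long-running job (e.g. a large Ostrich calibration) can appear to
# hang without ever returning an error.
parallelprocesses = 2
```

If the queue itself is stuck, raising these values will not help; that is the queue-management bug mentioned above.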
I think this has actually nothing to do with Ostrich or Raven. Didn't we find that it is actually hanging in the WPS? |
I think the comment here refers to the Raven build that includes DDS internally, so calibrations can run much faster and we avoid this problem altogether.
Ok. James has implemented DDS functionality in Raven. But:
It is a longer discussion, I think, whether we really want to make use of this, since all calibration settings would need to be divided into "Is Raven doing the calibration internally?" or "Is Ostrich doing the calibration?" The runtime of ALL Raven runs, and hence also calibration runs, can be significantly improved when the input data (forcings) are aggregated from gridded inputs to HRU aggregates using
Thanks for the update. I suggest we close this issue here, since the PyWPS problem is described elsewhere. |
@huard Can you link to the PyWPS issue for posterity, please? Then we can close this one. Thanks!
Ostrich seems to hang if we ask for a larger number of model evaluations. For example:
Calibrating on 100 model evaluations = 38 seconds.
Calibrating on 1000 model evaluations = 440 seconds.
Calibrating on 10000 model evaluations = Was still incomplete (no error, just running) after 16 hours.
This leads me to believe there may be some sort of config limiting the duration of processes, a timeout of sorts?
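For reference, a quick back-of-the-envelope check (pure Python, using only the timings reported above) shows how far the 10000-evaluation run deviates from linear scaling:

```python
# Observed Ostrich wall-clock times reported above: evaluations -> seconds.
timings = {100: 38.0, 1000: 440.0}

# If runtime scaled linearly with the evaluation budget, extrapolating
# from the 1000-evaluation run gives the expected time for 10000:
per_eval = timings[1000] / 1000      # ~0.44 s per evaluation
expected_10k = per_eval * 10000      # ~4400 s, i.e. about 1.2 hours

print(f"expected for 10000 evaluations: {expected_10k / 3600:.1f} h")
print(f"observed: still running after 16 h, "
      f"over {16 * 3600 / expected_10k:.0f}x the linear estimate")
```

So even allowing for some super-linear growth (100 -> 1000 evaluations already took ~11.6x rather than 10x the time), 16+ hours for 10000 evaluations is far beyond what the smaller runs predict.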