This error does not interrupt the whole process, right? And since it's rare, it is probably not something we can fix on our side.
I suggest we do the following about it:
Make it look a bit more accurate (not as a traceback):
in case of "unexpected response", (try to) extract the 'error' response field and output its value instead of a full traceback.
Pause the process and retry a little later (after 30-60 seconds).
In this case the error appeared only within a short interval (2020-10-07 10:24:41 to 2020-10-07 10:24:52), so I believe it was caused by a restart of some service on the AMI server side.
BTW, if we get a response from the AMI server, the pyAMI client doesn't try to query another instance, while in this case that might be our rescue :(
If the retry failed -- (properly) skip this message processing and go on:
mark the message as 'incomplete'.
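The steps above could be sketched roughly as follows. This is only an illustration, not the actual stage code: `UnexpectedResponse`, `process_message`, the `query` callable and the `_incomplete` key are hypothetical names, and the sketch assumes the server response is available as a dict with an `error` key, which I have not checked against pyAMI.

```python
import logging
import time

log = logging.getLogger("ami_stage")


class UnexpectedResponse(Exception):
    """Hypothetical exception carrying the raw server response."""

    def __init__(self, response):
        super().__init__("unexpected response")
        self.response = response


def error_text(exc):
    """Return the 'error' field of the response if present, else str(exc)."""
    response = getattr(exc, "response", None)
    # Assumption: the response is a dict-like object with an 'error' key.
    if isinstance(response, dict) and response.get("error"):
        return str(response["error"])
    return str(exc)


def process_message(message, query, retry_delay=30, sleep=time.sleep):
    """Query once, retry once after a pause, otherwise mark as incomplete."""
    for attempt in (1, 2):
        try:
            message["ami"] = query(message)
            return message
        except UnexpectedResponse as exc:
            # One short log line instead of a full traceback.
            log.warning("AMI query failed (attempt %d): %s",
                        attempt, error_text(exc))
            if attempt == 1:
                sleep(retry_delay)
    # Both attempts failed: skip the message but keep the pipeline going.
    message["_incomplete"] = True
    return message
```

Passing `sleep` as a parameter is just a convenience so the pause can be stubbed out in tests.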
The only problem here is that if the issue isn't fixed within a couple of minutes, and we wait 30 seconds for every message passing through this stage -- the whole process will take a very long time. Is that OK, or do we need a more elaborate scenario, like "if it fails N times in a row, stop retrying; just query AMI once for each new message and skip it if the problem's still there"?
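To make the "N times in a row" scenario more concrete, here is a minimal sketch of how it might look. `RetryGate`, `query_with_gate` and the `query` callable are hypothetical names, the defaults (3 failures, 30 seconds) are placeholders, and catching a bare `Exception` is a simplification for the sketch.

```python
import time


class RetryGate:
    """Decide whether a failed query deserves a delayed retry."""

    def __init__(self, max_failures=3, delay=30):
        self.max_failures = max_failures
        self.delay = delay
        self.failures = 0  # number of messages that failed in a row

    def should_retry(self):
        return self.failures < self.max_failures

    def record(self, ok):
        # Any success resets the counter, so retries are enabled again.
        self.failures = 0 if ok else self.failures + 1


def query_with_gate(message, query, gate, sleep=time.sleep):
    """Return (result, ok); retry once only while the gate allows it."""
    try:
        result = query(message)
        gate.record(True)
        return result, True
    except Exception:  # simplification: real code should be more specific
        if gate.should_retry():
            sleep(gate.delay)
            try:
                result = query(message)
                gate.record(True)
                return result, True
            except Exception:
                pass
        gate.record(False)
        return None, False
```

With this shape, a long outage costs at most `max_failures * delay` seconds of waiting; after that every message is queried once and skipped on failure, and the first successful query switches the retries back on.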
root@aiatlas171:/var/log/dkb/data4es-hourly.log