
Retries on Network Error

Stephan Bösebeck edited this page Aug 23, 2013 · 1 revision

Write Concern is not enough

The write concern (the WriteSafety annotation in Morphium) is not enough to be on the safe side. WriteSafety only makes sure that, if everything is working, data is written to the number of nodes you want it written to. You define the safety level more or less from the application's point of view. It does not cover network outages or other problems. Hence, you can also configure several retry settings.

Retry settings in writers

Morphium has 3 different types of writers:

  • the normal writer: supports asynchronous and synchronous writes
  • the async writer: forces asynchronous writes
  • the buffered writer: stores write requests in a buffer and executes them as a block

This has some implications. As the core of Morphium is asynchronous, we need to make sure there are not too many pending writes. (The size of the "pile" is determined by the maximum number of connections to mongo, so this is something you won't need to configure.) This is where the retry settings for writers come in. When writing data, the data is written either synchronously or asynchronously. In the latter case, requests tend to pile up under heavy load, and we need to handle the case where this pile gets too high. This is the retry: when the pile of pending requests is too high, Morphium waits for a specified amount of time and tries again to queue the operation. If that fails for all retries, an exception is thrown.
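The queue-and-retry behaviour described above can be sketched roughly as follows. This is only an illustration and not Morphium's actual writer code: the class `PendingWriteQueue`, its methods, and the parameters `capacity`, `maxRetries` and `sleepMillis` are all hypothetical names, and the sketch assumes one initial attempt followed by `maxRetries` retries.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical sketch of "retry when the pile is too high"; not Morphium's real writer code.
class PendingWriteQueue {
    private final BlockingQueue<Runnable> pending; // the "pile" of pending writes
    private final int maxRetries;                  // assumed setting: retries after the first attempt
    private final long sleepMillis;                // assumed setting: pause between attempts

    PendingWriteQueue(int capacity, int maxRetries, long sleepMillis) {
        this.pending = new ArrayBlockingQueue<>(capacity);
        this.maxRetries = maxRetries;
        this.sleepMillis = sleepMillis;
    }

    // Queues a write. If the pile is full, waits and tries again;
    // throws once the first attempt and all retries have failed.
    void submit(Runnable write) {
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            if (pending.offer(write)) {
                return; // there was room, the write is now pending
            }
            if (attempt < maxRetries) {
                pause();
            }
        }
        throw new IllegalStateException("Write queue still full after " + maxRetries + " retries");
    }

    // Stands in for a free connection picking up one pending write.
    boolean executeNext() {
        Runnable write = pending.poll();
        if (write == null) {
            return false;
        }
        write.run();
        return true;
    }

    int pendingCount() {
        return pending.size();
    }

    private void pause() {
        try {
            Thread.sleep(sleepMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting to queue a write", e);
        }
    }
}
```

With a capacity of 1, a second `submit` call fails with an exception after its retries unless `executeNext()` has drained the queue in the meantime. In Morphium itself the capacity is derived from the maximum number of connections, as noted above, so only the retry count and the pause would be settings.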

Retry settings for Network errors

As we have a really sh... network that causes problems more than once a day, I needed to come up with a solution for this as well. Since our network does not fail for more than a couple of requests at a time, the idea is to detect network problems and retry the operation after a certain amount of time. This setting is specified globally in the Morphium config:

```java
morphium.getConfig().setRetriesOnNetworkError(10);
morphium.getConfig().setSleepBetweenNetworkErrorRetries(500);
```

This causes Morphium to try any operation on mongo up to 10 times (if a network-related error occurs), pausing 500 ms between tries. This includes reads, writes, updates, index creation and aggregation. If the access still fails after the (in this case) 10th try, the networking error is rethrown to the caller.
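To make the retry semantics concrete, here is a minimal sketch of such a retry loop. It is not Morphium's internal implementation: `NetworkRetry` and `withRetries` are hypothetical names, and `java.io.UncheckedIOException` merely stands in for whatever Morphium treats as a network-related error. The parameters `tries` and `sleepMillis` play the roles of the two config values.

```java
import java.io.UncheckedIOException;
import java.util.function.Supplier;

// Hypothetical sketch of "retry on network error"; not Morphium's internal code.
class NetworkRetry {
    // Runs the operation up to `tries` times, sleeping `sleepMillis` between tries.
    // Only UncheckedIOException (assumed stand-in for a network error) triggers a retry;
    // the last such error is rethrown once all tries are used up.
    static <T> T withRetries(Supplier<T> operation, int tries, long sleepMillis) {
        if (tries < 1) {
            throw new IllegalArgumentException("tries must be at least 1");
        }
        UncheckedIOException lastError = null;
        for (int attempt = 1; attempt <= tries; attempt++) {
            try {
                return operation.get(); // success: no further tries needed
            } catch (UncheckedIOException e) {
                lastError = e;
                if (attempt < tries) {
                    pause(sleepMillis);
                }
            }
        }
        throw lastError; // every try hit a network error
    }

    private static void pause(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting to retry", e);
        }
    }
}
```

A call such as `NetworkRetry.withRetries(() -> runQuery(), 10, 500)` (with `runQuery` being any placeholder operation of yours) would mirror the settings shown above. Errors that are not network-related pass straight through without a retry, which matches the idea that only outages should be retried.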