Akka.NET v1.5.27
1.5.27 July 25th 2024
Akka.NET v1.5.27 is a significant release that includes new features, mission-critical bug fixes, and some performance improvements.
Major Akka.Cluster.Sharding and Akka.Cluster.Tools.Singleton Bug Fixes
All prior versions of Akka.NET contain two high-impact distributed systems bugs:
- Akka.Cluster.Tools.Singleton: singleton moves earlier than expected - as soon as new node joins
- Akka.Cluster.Sharding: duplicate shards / entities
As we discovered during the course of our painstaking bug investigation, these were, in fact, the same issue:
- The `ClusterSingletonManager` is supposed to always belong on the oldest node of a given role type, but an original design error from the time Akka.Cluster.Tools was first introduced to Akka.NET meant that nodes were always sorted in descending order of `UpNumber`. This is backwards: nodes should always be sorted in ascending order of `UpNumber`, which means that the oldest possible node is always at the front of the "who is oldest?" list held by the `ClusterSingletonManager`. This explains why the singleton could appear to move early during deployments and restarts.
- The `ClusterSingletonManager` was susceptible to a race condition: if nodes were shut down and restarted with the same address in under 20 seconds (the default "down removal margin" used by the `ClusterSingletonManager` to tolerate dirty exits), it was possible after multiple successive, fast restarts for multiple instances of the singleton to be alive at the same time for a short period.
Both varieties of this problem, duplicate singletons, are what led to duplicate shards.
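The ordering mistake described above can be illustrated with a toy model. The sketch below is plain Python rather than Akka.NET code, and the `Member` class, the node names, and both helper functions are hypothetical stand-ins; only the idea of sorting members by `UpNumber` comes from these notes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Member:
    """Hypothetical stand-in for a cluster member (not the Akka.NET type)."""
    address: str
    up_number: int  # lower value = the node became Up earlier = older node


def oldest_first(members):
    """Ascending UpNumber: the oldest member sits at index 0."""
    return sorted(members, key=lambda m: m.up_number)


def newest_first(members):
    """Descending UpNumber: the ordering these notes describe as backwards."""
    return sorted(members, key=lambda m: m.up_number, reverse=True)


cluster = [Member("node-a", 1), Member("node-b", 2)]

# A new node joins with the highest UpNumber seen so far.
cluster.append(Member("node-c", 3))

# With ascending order the head of the list is unchanged by the join,
# so a singleton pinned to the head stays where it is.
print(oldest_first(cluster)[0].address)

# With descending order the newcomer becomes the head of the list,
# so a singleton pinned to the head appears to move as soon as a node joins.
print(newest_first(cluster)[0].address)
```

In this model the "who is oldest?" question is answered by the first element of the sorted list, which is why the sort direction alone decides whether a join disturbs the singleton.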
As a result, we've made the following fixes:
- Akka.Cluster.Tools: deprecate ClustersSingletonManagerSettings.ConsiderAppVersion - `AppVersion` is no longer considered for singleton placement as it could easily result in split brains.
- Akka.Cluster.Tools: fix mutability and oldest state bugs with `ClusterSingletonManager` - resolves the issue with rapid rolling restarts creating duplicates. We've tested this fix in our test lab across thousands of coordinator restarts and haven't been able to reproduce the issue since (we could easily do it before).
- Akka.Cluster.Tools.Singleton / Akka.Cluster.Sharding: fix duplicate shards caused by incorrect `ClusterSingletonManager` `HandOver` - we fixed the member age problem here, which could cause a second singleton to start at inappropriate times.
Akka.Discovery and ClusterClient Discovery Support
In Akka.NET v1.5.27 we've added support for using Akka.Cluster.Tools.ClusterClient alongside Akka.Discovery plugins to automatically discover the initial contacts you need for `ClusterClientReceptionist` instances in your environment.
You can read the documentation for how this works here: https://getakka.net/articles/clustering/cluster-client.html#contact-auto-discovery-using-akkadiscovery
Related PRs and issues:
- Akka.Discovery: Add multi-config support to config-based discovery
- Cluster.Tools: Fix missing VerboseLogging in ClusterClientSettings.Copy method
- Cluster.Tools: Improve ClusterClientDiscovery to avoid thundering herd problem
- Cluster.Tools: Change ClusterClientDiscovery to use the new Akka.Management "/cluster-client/receptionist" endpoint
Other Bug Fixes and Improvements
- Akka.Cluster: improve gossip serialization performance
- Akka.Streams: Fix `ActorMaterializerImpl` `null` `LogSource`
- Akka.Streams: `AlsoTo` may not be failing graph when its sink throws exception
- Akka.DistributedData: if `lmdb.dir` is null or empty, log a warning and set to default
To see the full set of changes in Akka.NET v1.5.27, click here.
COMMITS | LOC+ | LOC- | AUTHOR |
---|---|---|---|
15 | 835 | 1001 | Aaron Stannard |
12 | 1123 | 207 | Gregorius Soedharmo |
Changes:
- b34d645 V1.5.27 release notes (#7303)
- 6a81fd3 Akka.Cluster.Tools: deprecate ClustersSingletonManagerSettings.ConsiderAppVersion (#7302)
- 7180810 Akka.Cluster.Tools: fix mutability and oldest state bugs with `ClusterSingletonManager` (#7298)
- 6b6afe4 Add AlsoTo downstream failure propagation support (#7301)
- 7811504 Fix ActorMaterializerImpl null LogSource (#7300)
- f2e81c5 Akka.Cluster.Tools.Singleton / Akka.Cluster.Sharding: fix duplicate shards caused by incorrect `ClusterSingletonManager` `HandOver` (#7297) [ #6973, #6793, #7196 ]
- f4501e8 migrated `ClusterSingletonManager` to `switch` statements (#7296)
- 9eb670a `ShardCoordinator`: misc cleanup (#7295)
- a6daed7 remove unused `reference.confg` from DistributedData.LightningDb (#7294)
- d3b3613 DData: if `lmdb.dir` is null or empty, log a warning and set to default (#7292)
- 50ef7b9 Akka.Cluster: added sanity check for `Member.AgeComparer` (#7291) [ #6973 ]
- f7f9078 Akka.Cluster.Sharding: enable `prefer-oldest` by default on `Replicator` (#7290)
- 4b7cff3 Akka.Cluster.Sharding: added tests for DData consistency settings (#7288)
- 5176dfb Akka.Cluster: improve gossip serialization performance (#7281)
- 9981d83 Fix busted Windows Build stage (#7283)
- 7947c1f Optimize cluster message serializer benchmarks (#7282)
- 7fb0c2b Add `ActorBase` lifecycle flow unit tests (#7128)
- 8843a21 Akka.Cluster: `ClusterMessageSerializer` benchmarks (#7280)
- 9f5de84 Akka.Cluster: harden ignored gossip messages (#7278)
- 8f200ef Update RELEASE_NOTES.md for 1.5.27-beta2 release (#7277)
- 03a469f Fix ClusterClientDiscovery, missing port name in lookup (#7276)
- 67e7754 Update RELEASE_NOTES.md for 1.5.27-beta1 release (#7275)
- e4e4ca4 Improve ClusterClientDiscovery feature (#7274)
- 7f37465 Fix possible problems with ClusterClient Discovery (#7270)
- dab2a5e Update failure message syntax (#7273)
- 000d8fe Fix missing ClusterClientSettings.VerboseLogging in Copy method (#7272)
- 399c548 Update ConfigServiceDiscovery to support multi-config (#7271)
- 7136919 Added v1.5.27 placeholder
- f797f8d Update RELEASE_NOTES.md for 1.5.26 release (#7268)
This list of changes was auto generated.