Version Information
Version of Akka.NET? v1.5.37
Which Akka.NET Modules? Akka.Cluster.Sharding
Describe the bug
This is a pretty rare bug as far as I can tell - today was the first time in 12 years of working with Akka.NET that I've seen this log message get logged (akka.net/src/contrib/cluster/Akka.Cluster.Sharding/ShardCoordinator.cs, lines 1850 to 1853 in 1f7ffa7):
Log.Warning("{0}: Shard [{1}] deallocation didn't complete within [{2}].",
TypeName,
m.Shard,
Settings.TuningParameters.HandOffTimeout);
Looking more closely at the issue, we see A LOT of unhandled HandOff messages over the course of 10-30 minutes:
2025-02-11 12:49:24.376 [INFO][02/11/2025 18:49:24.376Z][Thread 0003][akka://[REDACTED_SYSTEM]/system/sharding/clientsessions/[REDACTED_SESSION_ID]] Message [HandOff] from [akka.tcp://[REDACTED_SYSTEM]@[REDACTED_HOST]:[REDACTED_PORT]/system/sharding/clientsessionsCoordinator/singleton/coordinator/[REDACTED_ACTOR_ID]] to [akka://[REDACTED_SYSTEM]/system/sharding/clientsessions/[REDACTED_SESSION_ID]#[REDACTED_ACTOR_ID]] was unhandled. [86] dead letters encountered. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'. Message content: HandOff([REDACTED_SESSION_ID])
2025-02-11 12:48:14.376 [INFO][02/11/2025 18:48:14.376Z][Thread 0007][akka://[REDACTED_SYSTEM]/system/sharding/clientsessions/[REDACTED_SESSION_ID]] Message [HandOff] from [akka.tcp://[REDACTED_SYSTEM]@[REDACTED_HOST]:[REDACTED_PORT]/system/sharding/clientsessionsCoordinator/singleton/coordinator/[REDACTED_ACTOR_ID]] to [akka://[REDACTED_SYSTEM]/system/sharding/clientsessions/[REDACTED_SESSION_ID]#[REDACTED_ACTOR_ID]] was unhandled. [44] dead letters encountered. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'. Message content: HandOff([REDACTED_SESSION_ID])
2025-02-11 12:47:54.367 [INFO][02/11/2025 18:47:54.367Z][Thread 0033][akka://[REDACTED_SYSTEM]/system/sharding/clientsessions/[REDACTED_SESSION_ID]] Message [HandOff] from [akka.tcp://[REDACTED_SYSTEM]@[REDACTED_HOST]:[REDACTED_PORT]/system/sharding/clientsessionsCoordinator/singleton/coordinator/[REDACTED_ACTOR_ID]] to [akka://[REDACTED_SYSTEM]/system/sharding/clientsessions/[REDACTED_SESSION_ID]#[REDACTED_ACTOR_ID]] was unhandled. [74] dead letters encountered. Message content: HandOff([REDACTED_SESSION_ID])
2025-02-11 12:46:44.371 [INFO][02/11/2025 18:46:44.371Z][Thread 0033][akka://[REDACTED_SYSTEM]/system/sharding/clientsessions/[REDACTED_SESSION_ID]] Message [HandOff] from [akka.tcp://[REDACTED_SYSTEM]@[REDACTED_HOST]:[REDACTED_PORT]/system/sharding/clientsessionsCoordinator/singleton/coordinator/[REDACTED_ACTOR_ID]] to [akka://[REDACTED_SYSTEM]/system/sharding/clientsessions/[REDACTED_SESSION_ID]#[REDACTED_ACTOR_ID]] was unhandled. [46] dead letters encountered. Message content: HandOff([REDACTED_SESSION_ID])
2025-02-11 12:45:34.369 [INFO][02/11/2025 18:45:34.369Z][Thread 0033][akka://[REDACTED_SYSTEM]/system/sharding/clientsessions/[REDACTED_SESSION_ID]] Message [HandOff] from [akka.tcp://[REDACTED_SYSTEM]@[REDACTED_HOST]:[REDACTED_PORT]/system/sharding/clientsessionsCoordinator/singleton/coordinator/[REDACTED_ACTOR_ID]] to [akka://[REDACTED_SYSTEM]/system/sharding/clientsessions/[REDACTED_SESSION_ID]#[REDACTED_ACTOR_ID]] was unhandled. [7] dead letters encountered. Message content: HandOff([REDACTED_SESSION_ID])
2025-02-11 12:44:34.369 [INFO][02/11/2025 18:44:34.368Z][Thread 0016][akka://[REDACTED_SYSTEM]/system/sharding/clientsessions/[REDACTED_SESSION_ID]] Message [HandOff] from [akka.tcp://[REDACTED_SYSTEM]@[REDACTED_HOST]:[REDACTED_PORT]/system/sharding/clientsessionsCoordinator/singleton/coordinator/[REDACTED_ACTOR_ID]] to [akka://[REDACTED_SYSTEM]/system/sharding/clientsessions/[REDACTED_SESSION_ID]#[REDACTED_ACTOR_ID]] was unhandled. [48] dead letters encountered. Message content: HandOff([REDACTED_SESSION_ID])
2025-02-11 12:43:24.370 [INFO][02/11/2025 18:43:24.370Z][Thread 0025][akka://[REDACTED_SYSTEM]/system/sharding/clientsessions/[REDACTED_SESSION_ID]] Message [HandOff] from [akka.tcp://[REDACTED_SYSTEM]@[REDACTED_HOST]:[REDACTED_PORT]/system/sharding/clientsessionsCoordinator/singleton/coordinator/[REDACTED_ACTOR_ID]] to [akka://[REDACTED_SYSTEM]/system/sharding/clientsessions/[REDACTED_SESSION_ID]#[REDACTED_ACTOR_ID]] was unhandled. [17] dead letters encountered. Message content: HandOff([REDACTED_SESSION_ID])
2025-02-11 12:42:24.367 [INFO][02/11/2025 18:42:24.367Z][Thread 0003][akka://[REDACTED_SYSTEM]/system/sharding/clientsessions/[REDACTED_SESSION_ID]] Message [HandOff] from [akka.tcp://[REDACTED_SYSTEM]@[REDACTED_HOST]:[REDACTED_PORT]/system/sharding/clientsessionsCoordinator/singleton/coordinator/[REDACTED_ACTOR_ID]] to [akka://[REDACTED_SYSTEM]/system/sharding/clientsessions/[REDACTED_SESSION_ID]#[REDACTED_ACTOR_ID]] was unhandled. [62] dead letters encountered. Message content: HandOff([REDACTED_SESSION_ID])
2025-02-11 12:39:44.364 [INFO][02/11/2025 18:39:44.364Z][Thread 0010][akka://[REDACTED_SYSTEM]/system/sharding/clientsessions/[REDACTED_SESSION_ID]] Message [HandOff] from [akka.tcp://[REDACTED_SYSTEM]@[REDACTED_HOST]:[REDACTED_PORT]/system/sharding/clientsessionsCoordinator/singleton/coordinator/[REDACTED_ACTOR_ID]] to [akka://[REDACTED_SYSTEM]/system/sharding/clientsessions/[REDACTED_SESSION_ID]#[REDACTED_ACTOR_ID]] was unhandled. [83] dead letters encountered. Message content: HandOff([REDACTED_SESSION_ID])
2025-02-11 12:38:34.373 [INFO][02/11/2025 18:38:34.373Z][Thread 0023][akka://[REDACTED_SYSTEM]/system/sharding/clientsessions/[REDACTED_SESSION_ID]] Message [HandOff] from [akka.tcp://[REDACTED_SYSTEM]@[REDACTED_HOST]:[REDACTED_PORT]/system/sharding/clientsessionsCoordinator/singleton/coordinator/[REDACTED_ACTOR_ID]] to [akka://[REDACTED_SYSTEM]/system/sharding/clientsessions/[REDACTED_SESSION_ID]#[REDACTED_ACTOR_ID]] was unhandled. [47] dead letters encountered. Message content: HandOff([REDACTED_SESSION_ID])
2025-02-11 12:37:34.361 [INFO][02/11/2025 18:37:34.361Z][Thread 0024][akka://[REDACTED_SYSTEM]/system/sharding/clientsessions/[REDACTED_SESSION_ID]] Message [HandOff] from [akka.tcp://[REDACTED_SYSTEM]@[REDACTED_HOST]:[REDACTED_PORT]/system/sharding/clientsessionsCoordinator/singleton/coordinator/[REDACTED_ACTOR_ID]] to
This continues indefinitely.
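For reference, the settings involved here can be sketched as a HOCON fragment. The two `log-dead-letters` keys are the ones named in the log output above; `akka.cluster.sharding.handoff-timeout` is assumed to be the key that populates `Settings.TuningParameters.HandOffTimeout`. The values are illustrative only, not recommendations and not necessarily the defaults:

```hocon
akka {
  # Keys named in the "was unhandled" log lines above (illustrative values).
  log-dead-letters = 10
  log-dead-letters-during-shutdown = off

  cluster.sharding {
    # Assumption: this is the timeout reported in the
    # "Shard [{1}] deallocation didn't complete within [{2}]" warning.
    handoff-timeout = 60s
  }
}
```

Changing these only affects how much gets logged and how long the coordinator waits; it would not address the unhandled HandOff messages themselves.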
To Reproduce
Not sure how to reproduce it yet.
Expected behavior
Shards should terminate their entities during a handoff and deallocate all entity actors.
Actual behavior
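Not only did the shard not deallocate, but it looks like it didn't attempt to kill off any of its entity actors - otherwise the fail safe from the HandoffStopper should have kicked in (akka.net/src/contrib/cluster/Akka.Cluster.Sharding/ShardRegion.cs, lines 315 to 324 in 6ffd304).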
This didn't happen, so it makes me think that the Shard got behavior-switched to a state where it couldn't receive HandOff messages long before actually attempting to hand off.
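To illustrate the hypothesis only (this is not the actual Shard implementation), here is a minimal sketch of how a behavior switch can leave a message type without a handler. `ToyShard`, `EnterOtherState`, `UnhandledProbe`, `Demo` and the local `HandOff` class are hypothetical stand-ins, not the real Akka.Cluster.Sharding types:

```csharp
using System;
using System.Threading.Tasks;
using Akka.Actor;
using Akka.Event;

// Hypothetical stand-in; NOT the real Akka.Cluster.Sharding HandOff message.
public sealed class HandOff
{
    public HandOff(string shard) => Shard = shard;
    public string Shard { get; }
}

// Hypothetical trigger that moves the toy actor into its second behavior.
public sealed class EnterOtherState
{
    public static readonly EnterOtherState Instance = new EnterOtherState();
    private EnterOtherState() { }
}

public sealed class ToyShard : ReceiveActor
{
    public ToyShard()
    {
        // Initial behavior: HandOff is understood and stops the actor.
        Receive<HandOff>(_ => Context.Stop(Self));
        Receive<EnterOtherState>(_ => Become(OtherState));
    }

    // After the switch no HandOff handler is registered, so HandOff is
    // unhandled and published to the event stream as UnhandledMessage.
    private void OtherState()
    {
        Receive<string>(_ => { /* unrelated work */ });
    }
}

// Captures the first UnhandledMessage published on the event stream.
public sealed class UnhandledProbe : ReceiveActor
{
    public UnhandledProbe(TaskCompletionSource<UnhandledMessage> seen)
    {
        Receive<UnhandledMessage>(m => { seen.TrySetResult(m); });
    }
}

public static class Demo
{
    public static async Task<UnhandledMessage> RunAsync()
    {
        var system = ActorSystem.Create("demo");
        var seen = new TaskCompletionSource<UnhandledMessage>();
        var probe = system.ActorOf(Props.Create(() => new UnhandledProbe(seen)));
        system.EventStream.Subscribe(probe, typeof(UnhandledMessage));

        var shard = system.ActorOf(Props.Create(() => new ToyShard()), "toy-shard");
        shard.Tell(EnterOtherState.Instance); // switch behavior first
        shard.Tell(new HandOff("shard-1"));   // now nobody handles this

        var unhandled = await seen.Task;
        await system.Terminate();
        return unhandled;
    }
}
```

In this sketch the probe receives an `UnhandledMessage` wrapping the `HandOff`, which is the same kind of event behind the "was unhandled" log lines above. Whether the real Shard reached such a state, and through which transition, is still unknown.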
Screenshots
If applicable, add screenshots to help explain your problem.
Environment
Are you running on Linux? Windows? Docker? Which version of .NET?
Additional context
Happened while scaling the sharding system up to double its original node count.
A custom entity handoff message was used.