
Neo connection handler slowdown with high number of scheduler worker fibers #376

Open
gavin-norman-sociomantic opened this issue Dec 13, 2018 · 0 comments


gavin-norman-sociomantic commented Dec 13, 2018

(Moved to swarm from sociomantic-tsunami/dhtproto#183.)

An application that was recently converted to use DHT neo Put requests instead of legacy Put requests is reportedly performing noticeably worse than before.

  • The app works by periodically assigning a large batch of Put requests all at once.
  • In the legacy client, these would be pushed into the request queues (which were configured to be exceptionally large).
  • A Task is associated with each Put request, leading to a large pool of tasks.
  • When using neo Put, the application's Task pool overflows.

Tests with the affected app indicate that the task scheduler's number of worker fibers is the key factor here: it was set high, and reducing it to 10 or 20 greatly reduces the time per Put request. The likely reason is that with too many worker fibers, the send and receive fibers of the neo swarm client are rarely resumed, so data throughput drops sharply.

A simple test that spawns a Task for each request confirms this: the number of worker fibers definitely impacts the speed at which neo requests complete:

  • 5 worker fibers: average Put completion time ~85μs.
  • 10 worker fibers: average Put completion time ~175μs.
  • 20 worker fibers: average Put completion time ~400μs.
  • 100 worker fibers: average Put completion time ~2,000μs.
  • 1,000 worker fibers: average Put completion time ~18,000μs.

This problem is not specific to the DHT or to Put requests, of course, but indicates a general interaction between the connection send/receive fibers in swarm and the scheduler's worker fibers.
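To make the interaction concrete, here is a toy model (a sketch only; it assumes the scheduler behaves like a simple round-robin over fibers, which is a simplification of swarm's actual scheduler, and the function name `steps_per_request` is hypothetical). A single "connection fiber" can only complete one pending request each time it is resumed, but it must wait for every worker fiber to take a turn first, so per-request latency grows linearly with the number of worker fibers:

```python
# Toy round-robin model (assumption, not swarm's real scheduler):
# each scheduler "step" resumes one fiber; the connection fiber
# completes one pending request per resume, after all N worker
# fibers have had their turn.

def steps_per_request(n_workers: int, n_requests: int = 100) -> float:
    """Average scheduler steps elapsed per completed request."""
    steps = 0
    for _ in range(n_requests):
        steps += n_workers  # every worker fiber resumes once
        steps += 1          # connection fiber resumes, completes one request
    return steps / n_requests

for n in (5, 10, 20, 100, 1000):
    print(n, steps_per_request(n))
```

Under this model the cost per request is exactly `n_workers + 1` steps, i.e. linear in the worker fiber count, which is broadly consistent with the roughly linear growth in the measured completion times above.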

In practice, this only has a significant impact in test systems, where the number of nodes in the DHT (and hence the number of connections and send/receive fibers) is small relative to the number of worker fibers.
