
🧵 Threadless network transports #119

Merged on Apr 20, 2021 (40 commits)

Conversation

Collaborator

@danstiner danstiner commented Apr 6, 2021

This makes the network transport layer very dumb: the media and control connection classes now own the threads that monitor and drive the streams.

This has a few benefits:

  1. Hopefully easier to test and debug
  2. Stopping a stream is now non-blocking
    • Stop() methods now ask the jthread to stop; it does so asynchronously and then calls the onClose callbacks. This should be more reliable than making the thread that originally asked for the stop do all that work (which had often led to deadlocks)
  3. Destructors on the transport and connection classes tear down gracefully

I also did a bit of cleanup to use the more modern scoped_lock, etc.
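For readers unfamiliar with it, here is a small sketch of what `std::scoped_lock` (C++17) offers over `std::lock_guard`. The `SketchPacketQueue` class is hypothetical and only illustrates the standard library type; it does not reflect the code touched in this PR:

```cpp
#include <cstddef>
#include <mutex>
#include <vector>

// Hypothetical queue used only to illustrate std::scoped_lock.
class SketchPacketQueue
{
public:
    void Push(int packet)
    {
        std::scoped_lock lock(mutex); // template arguments are deduced
        packets.push_back(packet);
    }

    std::size_t Size()
    {
        std::scoped_lock lock(mutex);
        return packets.size();
    }

    void MoveAllTo(SketchPacketQueue& other)
    {
        if (this == &other)
        {
            return;
        }
        // Locks both mutexes using a deadlock-avoidance algorithm, so two
        // threads moving in opposite directions cannot deadlock each other.
        std::scoped_lock lock(mutex, other.mutex);
        other.packets.insert(other.packets.end(), packets.begin(), packets.end());
        packets.clear();
    }

private:
    std::mutex mutex;
    std::vector<int> packets;
};
```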

Ready for initial reviews, but there are still a couple of TODOs and I'd like to do more testing before declaring it ready to merge.

@danstiner danstiner requested a review from haydenmc April 8, 2021 06:52
@danstiner danstiner marked this pull request as ready for review April 8, 2021 06:56
Member

@haydenmc haydenmc left a comment

Looks great to me so far :)

src/FtlStream.cpp (resolved)
Collaborator Author

danstiner commented Apr 14, 2021

There's definitely a regression here: the first connect fails, and if I connect again quickly it works, but if I wait a bit it fails again. Unsure why.

Collaborator Author

danstiner commented Apr 18, 2021

The connect failure was a race condition: the jthread member was declared before the regex members, so I think the thread would sometimes start before the CONNECT regex was initialized, and in that case the match would always fail.

There is another issue with the shutdown order; it SEGVs:

Apr 17 23:22:30 ftl-ingest-2 janus[669131]: [2021-04-17 23:22:30.922] [debug] Stopping control connection thread for Channel 1
Apr 17 23:22:30 ftl-ingest-2 janus[669131]: [2021-04-17 23:22:30.922] [debug] FtlServer::onStreamClosed queueing StreamClosedEvent event
Apr 17 23:22:30 ftl-ingest-2 janus[669131]: [2021-04-17 23:22:30.922] [debug] FtlServer::eventStreamClosed processing StreamClosed event...
Apr 17 23:22:31 ftl-ingest-2 janus[669131]: [2021-04-17 23:22:31.107] [debug] Stopping media connection thread for Channel 1 / Stream 610
Apr 17 23:22:31 ftl-ingest-2 janus[669131]: [2021-04-17 23:22:31.107] [error] Media connection closed unexpectedly for channel 1 / stream 610
Apr 17 23:22:31 ftl-ingest-2 systemd[1]: janus.service: Main process exited, code=killed, status=11/SEGV

@danstiner
Collaborator Author

Fixed the SEGV; I think this might be ready to merge.
It's a big enough change that it might be worth baking for a while on the test ingest server first, though?

src/ConnectionCreators/UdpConnectionCreator.cpp (outdated, resolved)
src/ConnectionListeners/TcpConnectionListener.cpp (outdated, resolved)
src/ConnectionTransports/ConnectionTransport.h (outdated, resolved)
test/unit/FtlControlConnectionUnitTests.cpp (outdated, resolved)
test/unit/FtlControlConnectionUnitTests.cpp (resolved)
test/unit/Utilities/UtilTest.cpp (outdated, resolved)
@danstiner
Collaborator Author

Thanks for working through this big change with me. It works great on my home lab; I think it's time to merge!

@danstiner danstiner merged commit 133c867 into master Apr 20, 2021
@danstiner danstiner deleted the danstiner/threadless-transports branch April 20, 2021 23:52
2 participants