Provide runProcess' that allows returning a result? #306
Comments
If each handler in the wai app is single threaded (which it must be, I'm guessing), then a more idiomatic approach would be to spawn a sibling process that is linked (as in thread linked, not process linked) to the handler thread. Spawn it with a TChan (or stm equivalent), or a TQueue if you want to handle a backlog of requests, and use this to communicate with the Cloud Haskell process. If you need replies then pass two queues/channels. The sibling process would just sit around in a receive loop. You can tag messages if you need to sort the replies out. Cloud Haskell is fundamentally a message passing paradigm, so try not to fight that by making everything a synchronous call.
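To make the shape of that concrete, here is a minimal sketch of the sibling-plus-queues pattern, using base's Chan for brevity. A plain forkIO thread stands in for the linked Cloud Haskell process, and the message types and function names are made up for the example, not taken from any CH API:

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.Chan

-- Illustrative message types (not part of any CH API).
newtype Request  = Request Int
newtype Response = Response Int

-- The sibling worker: in real code this would be a spawned,
-- thread-linked Cloud Haskell process; here it is a plain thread.
sibling :: Chan Request -> Chan Response -> IO ()
sibling reqQ respQ = loop
  where
    loop = do
      Request n <- readChan reqQ
      -- stand-in for a round trip into Cloud Haskell
      writeChan respQ (Response (n * 2))
      loop

-- A handler communicates with the sibling over the two queues.
handleRequest :: Chan Request -> Chan Response -> Int -> IO Int
handleRequest reqQ respQ n = do
  writeChan reqQ (Request n)
  Response r <- readChan respQ
  pure r
```

With a backlog of concurrent handlers you would tag each Request so replies can be matched back up, as the comment above suggests.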
So here is an example of a similar pattern, from the distributed-process-async package. This code is the heart of the implementation, and simply spawns a process that it happens to share a channel with. A similar example comes from ManagedProcess. We start with an enclosing type that abstracts the underlying communication pattern:

```haskell
-- | Provides a means for servers to listen on a separate, typed /control/
-- channel, thereby segregating the channel from their regular
-- (and potentially busy) mailbox.
newtype ControlChannel m =
  ControlChannel {
    unControl :: (SendPort (Message m ()), ReceivePort (Message m ()))
  }
```
```haskell
-- | Creates a new 'ControlChannel'.
newControlChan :: (Serializable m) => Process (ControlChannel m)
newControlChan = newChan >>= return . ControlChannel

-- | The writable end of a 'ControlChannel'.
--
newtype ControlPort m =
  ControlPort {
    unPort :: SendPort (Message m ())
  } deriving (Show)

deriving instance (Serializable m) => Binary (ControlPort m)

instance Eq (ControlPort m) where
  a == b = unPort a == unPort b
```
You can of course embed a non-cloud-haskell thing in there, such as an STM channel. We need to pass this to our server process, but we also want the call site to have access to the write end of the channel (but not the read end, obviously!), so we take an expression from the control channel to a process definition:

```haskell
chanServe :: (Serializable b)
          => a
          -> InitHandler a s
          -> (ControlChannel b -> Process (ProcessDefinition s))
          -> Process ()
chanServe argv init mkDef = do
  pDef <- mkDef . ControlChannel =<< newChan
  runProcess (recvLoop pDef) argv init
```

The canonical pattern I've come across involves using vanilla concurrency constructs like MVar, TMVar, TChan, etc., to coordinate between code that runs in the Process monad and code that runs outside it. Another simplifying factor can sometimes be to spawn your non-cloud-haskell threads/work/etc. from cloud haskell, instead of the other way around. There's no rule that says a cloud haskell process can't lift to IO. Here are some examples from various interesting places (which you might find instructive in the various cloud haskell code bases):
There are many, many more in the various test suites. Also, you can always write a custom backend for managed process (or perhaps more simply, a pool backend in the vein of this sort of thing). The point I am making is that there are ways to cohesively integrate your CH and non-CH code, without coupling them in the wrong ways, and without always needing to spin up a process to make synchronous calls to a single server. I would work through the various code bases and crib them for ideas, then experiment and benchmark your approaches. If you come up with ways to integrate and supervise processes from non-CH code, so much the better. This has been raised before too - I'll look up the original thread.
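As an illustration of that canonical pattern, here is a minimal sketch where each request carries its own reply MVar, so replies can never be delivered to the wrong caller. Plain IO and forkIO stand in for the Process monad and spawn, and all the names here are invented for the example:

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar
import Control.Monad (forever)

-- Each request carries the MVar its reply should be written to.
type Request a b = (MVar b, a)

-- Worker loop: in real code this would run in the Process monad,
-- servicing an inbox shared with the IO side.
startWorker :: (a -> b) -> IO (MVar (Request a b))
startWorker f = do
  inbox <- newEmptyMVar
  _ <- forkIO $ forever $ do
         (reply, x) <- takeMVar inbox
         putMVar reply (f x)
  pure inbox

-- Synchronous call from the IO side: post a request, wait on its
-- private reply MVar.
rpc :: MVar (Request a b) -> a -> IO b
rpc inbox x = do
  reply <- newEmptyMVar
  putMVar inbox (reply, x)
  takeMVar reply
```

The reply-MVar-per-request shape is what makes tagging unnecessary here; with a shared reply channel instead, you would need the tagging mentioned earlier.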
What an amazing response, thank you so much! I feel that this could be made into a tutorial, "interacting with non-CH code", almost verbatim. I certainly would have found a section on that very useful. I can start a PR for https://github.com/haskell-distributed/haskell-distributed.github.com if you want?
I did end up with a single channel. In practice I will just pretend that errors don't happen:

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Control.Concurrent.STM
import Control.Distributed.Process
import Control.Monad (forever)
import Network.Transport (EndPointAddress (..))
-- assuming DClient is the managed-process client module
import qualified Control.Distributed.Process.ManagedProcess.Client as DClient

type ProcessChannel = TMVar (TMVar HttpResponse, HttpRequest)

initialProcessChannel :: IO ProcessChannel
initialProcessChannel = newEmptyTMVarIO

rpcProcess :: ProcessChannel -> Process ()
rpcProcess processChannel = forever $ do
  (replyVar, httpRequest) <- liftIO $ atomically $ takeTMVar processChannel
  let nid = NodeId (EndPointAddress "127.0.0.1:8081:0")
  response <- DClient.call (nid, "http" :: String) httpRequest
  liftIO $ atomically $ putTMVar replyVar response
```
It's tough to know how to improve on that without better understanding how your code works. The way web servers typically work in Erlang is that each request is mapped to a spawned process, and of course you can pool them to minimise allocations etc. Obviously you don't want to be fighting against whatever threading/muxing model wai uses, so that might not be practical.
Having read up on the architecture of WAI, I don't think there's a neat way to make the handler a process, nor do I think there's a point in doing that either. The user-thread-per-request model clearly works well (with the parallel I/O manager in play) and I don't see the point in fighting what's clearly good. I'm really unconvinced by that code snippet though - it just looks wrong. Why are you making a blocking remote call? I mean, sure, lots of code goes off and talks to a database or whatever mid-request, so in principle it's fine, but I suspect there are neater ways of doing this. Also, is that …
That would be great, thank you! We will probably need to add quite a lot to it. I did start looking at the snap http library, since it uses streams, and I wondered if it would be easier to fork a process and incrementally give back the results. The way Erlang's cowboy web server works is that it uses an underlying TCP socket acceptor pool, ranch, which has supervised acceptor processes handle incoming connections, spawning a new process to handle the protocol layers for each one. I guess the thing is that these kinds of design/architectural decisions tend to vary a lot depending on lots of factors. There is, for example, a network-transport-websockets implementation, which means it's quite possible to front a web server, use websockets to communicate with cloud haskell, and back again - since the n-t layer handles that, it should work fine for CH code. Also, introducing brokers and routing processes is often very useful intra-node. When we get into distributed stuff then it gets much more complex. Fun, but complex. :)
OK, yeah, my code snippet isn't good. The main problem (other than it being synchronous) though is that error handling in CH is quite different, e.g. around monitoring and node failures. After going around in circles a few times I once again think some form of synchronous communication API that lives in IO would be useful.
That's absolutely wrong @teh.
When you monitor a process, we absolutely fire that monitor if/when we detect the death of a peer node. The only issue you might have is that if no data is written between the two nodes, that disconnect might not be seen for quite some time. One solution to that (as per Erlang) is to have a heartbeat, which is very easy to implement in theory, and I started adding an implementation in distributed-process-extras - however there are some complexities.

The way Erlang's net_kernel works, you have a tick sent to all connected nodes on the normal communication backbone every interval N. If a tick isn't seen from a peer within N*4 then the node is considered down. One problem with this approach is that the tick messages can get stuck behind other, arbitrarily large payloads. That is not good. One approach to solving that problem could potentially be to open up a second connection between the nodes and use that in isolation for the tick messages, but now we're just papering over a design issue - whether you consider it to be a network-transport issue or a cloud haskell one. At the end of the day, if it's the interface (rather than just the socket between two node controllers) that gets saturated, then you're potentially in for a bad day.

I don't think the node controller should do any ticking at all, and ideally you'd want the network-transport layer to handle this kind of thing (and smartly, separating its control plane from data transfer). There is currently no sane way for a process to tell its own node controller that it thinks a node is down either. Ultimately, the simple approach right now is a heartbeat process that simply keeps track of connected nodes and regularly calls out to them. Please let me know if you run into issues.
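A toy version of that net_kernel-style tick scheme can be sketched with plain threads and an IORef. The interval value and all names are made up for the example; a real implementation would live at the network-transport or node-controller level, as discussed above:

```haskell
import Control.Concurrent (ThreadId, forkIO, killThread, threadDelay)
import Control.Monad (forever)
import Data.IORef

tickInterval :: Int
tickInterval = 10000  -- 10ms here, purely illustrative

-- The peer's heartbeat: bump a tick counter every interval N.
startTicker :: IORef Int -> IO ThreadId
startTicker lastTick =
  forkIO $ forever $ do
    modifyIORef' lastTick (+ 1)
    threadDelay tickInterval

-- The local check: if no tick has arrived within N*4, consider
-- the peer down (mirroring net_kernel's rule).
peerAlive :: IORef Int -> IO Bool
peerAlive lastTick = do
  before <- readIORef lastTick
  threadDelay (4 * tickInterval)
  after <- readIORef lastTick
  pure (after > before)
```

The saturation problem described above shows up here too: if the ticker shares a channel with bulk data, a large payload can delay the tick past the N*4 deadline and produce a false "down".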
@teh - see haskell-distributed/distributed-process-extras@48984ef, which should ensure that when resolving a remote name fails, it doesn't break the calling code. I will probably pull the exception handling up, since any such call can fail the same way. Can I suggest we close this ticket and open a new one to discuss making interactions between non-CH and CH code easier? Also please see haskell-distributed/distributed-process-client-server#9.
I'm new to h-d so I'm raising an issue for discussion before starting to submit PRs. Maybe I missed something obvious.
I propose to add a new function:

```haskell
runProcess' :: LocalNode -> Process a -> IO a
```

My main use case is interacting with the h-d ecosystem from other places, e.g. I have a wai `Application` that then runs `call` to call some stuff. In this specific case I use an `MVar` to get the result, but I have this pattern repeating in a few other places as well.