[Server] Explicitly sequence I/O operations on the underlying stream. #179
base: main
Conversation
e098f5b → 61d921e
Hey, thanks for opening this PR! As per #176 (comment), I'm marking this as "draft". As with any large PR, there's quite a bit to take in here, especially since our team has somewhat limited resources for what we can review and implement, and we want to make sure we fully understand all aspects of it. For now I'm proposing we mostly treat it as a tool to help progress the discussion on #176, rather than as something we're looking to merge as-is. This is still incredibly valuable since it helps us work towards a solution; but I want to make sure we're setting the right expectations.

**Refactoring**

It seems part of this PR is refactors: re-ordering traits, changing

**Testing**

Given this PR is intended to make progress towards #176, I was expecting there to be a new test case for this. Reading this PR, there appears to be a test case for pipelined requests, but I don't believe there is one for the mid-way sending of

Another thing that stood out to me was that other tests had been edited; in particular the removal of the closing newlines. I'm assuming that's because of a refactor? I wasn't expecting this to be changed at all, so I want to make sure it hasn't accidentally broken the protocol.

**Pipeline Support**

We explicitly don't support pipelining right now. We're aware that this is required by the HTTP/1.1 spec, but almost every client requires enabling this feature, and some don't support it at all (like browsers). Most of this has been replaced by newer versions of the HTTP protocol instead. The only place I know it's still actively being used is in benchmarks like TechEmpower, which aren't a priority for us.

Could you perhaps explain more about how you're addressing pipelining? We definitely would like to support it eventually, and better understanding the general architecture here would be useful. Is it required to split the reader and the writer? Or was that a convenience for the purposes of this patch?
**Removing the `Clone` bound**

I'm still unclear on how/why the

**Conclusion**

Thanks so much for putting the effort into this PR! I feel we're coming back here with quite a few questions, but I'm hoping this can better help us understand what you're proposing, and in turn help improve the library.
No problem, I wrote this mainly to prove that my
I did not intend to do any unnecessary refactoring, although my editor auto-formats on save, so it's possible a change crept in there.
Yeah, I definitely need to add a bunch more tests.
Most of these issues do not rely on pipelining to occur, but they are much easier to reproduce reliably with pipelining (since otherwise you rely on getting the timing just right).
These test inputs were incorrect: the tests previously passed due to the "over-reading" bug which this PR fixes. When not using chunked encoding, there should not be a newline after the request body. Once I removed the "over-reading" behaviour, the tests started failing since they expected either another request, or for the stream to be closed, but instead encountered a newline. There was also a case (in server_decode::chunked) where I had to add a newline that was missing.
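To illustrate the framing point, here is a small hypothetical sketch (not one of the PR's actual tests) of two pipelined requests on the wire: after reading exactly `Content-Length` bytes of the first body, the very next byte is the start of the second request line, with no intervening newline.

```rust
/// Find the end of the header block (the first blank line) in a raw request,
/// returning the offset just past the "\r\n\r\n" terminator.
fn headers_end(buf: &[u8]) -> Option<usize> {
    buf.windows(4).position(|w| w == b"\r\n\r\n").map(|i| i + 4)
}

fn main() {
    // Two pipelined requests. Note: no newline after "hello" -- the second
    // request line starts immediately after the 5 body bytes.
    let wire: &[u8] =
        b"POST /a HTTP/1.1\r\nContent-Length: 5\r\n\r\nhelloGET /b HTTP/1.1\r\n\r\n";
    let body_start = headers_end(wire).unwrap();
    let body_end = body_start + 5; // Content-Length: 5
    assert_eq!(&wire[body_start..body_end], b"hello");
    // A decoder that reads exactly Content-Length bytes lands on the start
    // of the next request; reading even one byte further would corrupt it.
    assert!(wire[body_end..].starts_with(b"GET /b"));
}
```

A decoder that appended a trailing newline expectation here (or buffered past `body_end`) would either fail or swallow part of the second request, which is the class of bug the changed test inputs exposed.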
Gaining pipelining support is more a side-effect of this PR: fixing the sequencing issues means that pipelining "just works". Nevertheless, the sequencing issues are a problem on their own regardless of pipelining.
Splitting the reader/writer is necessary to be able to sequence them separately: we want reads to be ordered with respect to each other, and we want writes to be ordered with respect to each other, but there is no need to enforce an ordering between reads and writes.
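The "split along the time dimension" idea can be sketched as follows. This is a simplified, blocking analogue with hypothetical names; the PR's actual `Sequenced<T>` is async, but the core mechanism is the same: each half holds a turn number, and the later half waits until the earlier half is released.

```rust
use std::sync::{Arc, Condvar, Mutex};

// Hypothetical sketch of a "sequenced" wrapper: splitting yields an earlier
// and a later half; the later half blocks until the earlier one is released,
// so there is only ever one active user of the inner stream at a time.
struct Shared<T> {
    inner: Mutex<(T, usize)>, // (stream, turn currently allowed to run)
    turn_done: Condvar,
}

pub struct Sequenced<T> {
    shared: Arc<Shared<T>>,
    turn: usize,
}

impl<T> Sequenced<T> {
    pub fn new(inner: T) -> Self {
        Sequenced {
            shared: Arc::new(Shared {
                inner: Mutex::new((inner, 0)),
                turn_done: Condvar::new(),
            }),
            turn: 0,
        }
    }

    /// Split into an "earlier" and a "later" half. The later half's `with()`
    /// blocks until the earlier half calls `release()`.
    pub fn split(self) -> (Sequenced<T>, Sequenced<T>) {
        let later = Sequenced { shared: self.shared.clone(), turn: self.turn + 1 };
        (self, later)
    }

    /// Run `f` with exclusive access to the inner value, waiting for our turn.
    pub fn with<R>(&self, f: impl FnOnce(&mut T) -> R) -> R {
        let mut guard = self.shared.inner.lock().unwrap();
        while guard.1 < self.turn {
            guard = self.shared.turn_done.wait(guard).unwrap();
        }
        f(&mut guard.0)
    }

    /// Release this half, allowing the next half in sequence to proceed.
    pub fn release(&self) {
        let mut guard = self.shared.inner.lock().unwrap();
        guard.1 = guard.1.max(self.turn + 1);
        self.shared.turn_done.notify_all();
    }
}

fn main() {
    let (first, second) = Sequenced::new(Vec::new()).split();
    first.with(|v: &mut Vec<u8>| v.push(b'a'));
    first.release();
    // `second` may live on another task/thread; it only proceeds once
    // `first` has been released.
    let body = second.with(|v: &mut Vec<u8>| { v.push(b'b'); v.clone() });
    assert_eq!(body, b"ab");
}
```

Splitting the reader and the writer into separate `Sequenced` chains is what lets reads be ordered among themselves and writes among themselves, without imposing any ordering between the two.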
Absolutely! I'll be adding more tests for these, and I'll create a branch to show why these fail without the changes too.
ddbd432 → 1e4f71e
Argh, the
With the
I also had a look at how hyper solves this issue. In hyper, there is no public equivalent to the standalone

Since hyper is calling into user code rather than the other way around, the user doesn't need to be aware of the underlying complexities around when the streams can be read/written.
Hi, from my point of view, this PR first and foremost fixes chunked decoding in general: when I switched ipp.rs to surf/async_std, streams received from the printer started to be truncated in approximately 8 cases out of 10. The printer sends a chunked response, so I debugged it and found that async-h1's ChunkedDecoder prematurely ends at:

Evidently, poll_read at line 277 would read more, but it cannot, as the buffer is exhausted (

Then I found this PR, so I stopped the debugging and fixing, because the refactored ChunkedReader decodes the stream differently, and presumably in a better way.

Also, when I was looking around for how to resolve my issues, I first found http-rs/surf#53. Maybe that issue would be resolved by this PR as well.
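The over-reading problem described in this thread comes from the decoder buffering past the end of the message. Parsing directly from the reader's own buffer avoids it. Here is a minimal sketch using std's `BufRead` (hypothetical, not the PR's actual decoder; it ignores chunk extensions and trailers): because it reads exactly the bytes of each chunk, everything after the terminating `0\r\n\r\n` stays unread for the next request.

```rust
use std::io::{self, BufRead, Read};

// Sketch: decode a chunked body from a BufRead without consuming any bytes
// past the end of the message. Chunk extensions and trailers are ignored
// for brevity.
fn read_chunked(r: &mut impl BufRead) -> io::Result<Vec<u8>> {
    let mut body = Vec::new();
    loop {
        // Chunk header: hex size followed by CRLF.
        let mut size_line = String::new();
        r.read_line(&mut size_line)?;
        let size = usize::from_str_radix(size_line.trim_end(), 16)
            .map_err(|e| io::Error::new(io::ErrorKind::InvalidData, e))?;
        if size == 0 {
            // Final chunk: consume the terminating CRLF and stop.
            let mut end = String::new();
            r.read_line(&mut end)?;
            return Ok(body);
        }
        let mut chunk = vec![0; size];
        r.read_exact(&mut chunk)?;
        body.extend_from_slice(&chunk);
        let mut crlf = [0; 2]; // CRLF after each chunk's data
        r.read_exact(&mut crlf)?;
    }
}

fn main() -> io::Result<()> {
    let mut input: &[u8] = b"5\r\nhello\r\n0\r\n\r\nNEXT REQUEST";
    let body = read_chunked(&mut input)?;
    assert_eq!(body, b"hello");
    // The bytes after the chunked terminator are still unread:
    assert_eq!(input, b"NEXT REQUEST");
    Ok(())
}
```

A decoder with its own internal buffer, by contrast, may pull `NEXT REQUEST` into that buffer while filling it, and those bytes are then lost to whoever reads the stream next, which matches the truncation described above.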
The `Clone` bound is both problematic for users, and leads to bugs internally as well (#175, #176).

This PR keeps existing public interfaces (mostly) unchanged, but adds two new functions, `Server::new_rw` and `decode_rw`, which operate on split, "sequenced" streams. `Server::new` and `decode` are altered to be simple wrappers around these `_rw` variants.

The `_rw` methods require `BufRead` instead of `Read`, which prevents several bugs where, due to the use of a local `BufReader`, the code could accidentally read further than it meant to. This was obviously a blocker for supporting pipelined requests, but could also be an issue if the client sends a new request before the background task which reads the body has finished.

The `decode` function is still affected by this bug and cannot be fixed, so I would suggest encouraging a move to `decode_rw` instead. (A half-way move of just adding the `BufRead` bound could work, but it may be difficult for users to satisfy both `BufRead` & `Clone` bounds, as these are typically mutually exclusive.)

The `Sequenced<T>` wrapper allows a reader/writer to be split into two along the time dimension, so that operations on the later half block waiting for the first half to be released. This ensures that at any given time there is a single reader and a single writer, no matter how many times the stream is split.

Whilst implementing these changes I found an issue with the decoder for chunked streams: it had internal buffering, meaning that it would happily read past the end of the stream if more data was available. This was also a blocker for pipelined requests, so I re-implemented the decoder to use `BufRead` instead. The new decoder can rely on the buffering of the underlying stream, and so also avoids needing to do any memory allocation (except for sending trailers).

Finally, there was a bug when a client sends the `Expect` header, but the server does not request the body. In that case, the server would still try to skip over the body even though the client will be starting the next request. I added a test for this, and also a test for pipelining.

The dependencies added to `Cargo.toml` already existed as transitive dependencies, so should not increase the total number of dependencies.
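The `Expect` bug can be illustrated with raw bytes (a hypothetical sketch, not the PR's actual test). With `Expect: 100-continue`, the client waits for the server's go-ahead before transmitting the body; if the server never requests the body, the declared `Content-Length` bytes are simply never sent, and the client's next request follows the headers directly.

```rust
fn main() {
    // Headers declare a 100-byte body, but because the server never asked
    // for it, those bytes were never transmitted; the next request line
    // follows the blank line immediately.
    let wire: &[u8] =
        b"POST /up HTTP/1.1\r\nContent-Length: 100\r\nExpect: 100-continue\r\n\r\n\
          GET /next HTTP/1.1\r\nHost: x\r\n\r\n";
    let headers_end = wire.windows(4).position(|w| w == b"\r\n\r\n").unwrap() + 4;
    // Blindly skipping Content-Length bytes would over-read: fewer than 100
    // bytes remain, and all of them belong to the next request.
    assert!(wire.len() - headers_end < 100);
    // Correct behaviour: parsing resumes right after the headers.
    assert!(wire[headers_end..].starts_with(b"GET /next"));
}
```

This is why a server that unconditionally "skips the body" after an `Expect` request ends up consuming the start of the following pipelined request.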