Add a "streaming API" for incoming frames #296
Comments
Take 1: I'm not sure how well that use case fits the websocket protocol. HTTP was designed to transfer documents, that is, HTML files, and is widely used to transfer files. Fetching a file and storing it locally is a fairly reasonable use case. If you're transferring a single file over the lifetime of a websocket connection, you could just as well use an HTTP GET (for download) or POST (for upload). If you're transferring multiple files, you're going to need a way to carry metadata across the wire and to delimit file transfers. The best fit for websocket I can think of is appending some data to a file, i.e. writing logs. In that case, writing a loop that reads messages and writes them seems reasonable. Take 2: The websocket protocol provides a way to split a large message across multiple frames: fragmentation. Currently I'm not coming up with an actionable way to frame this discussion... to be continued! |
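The log-appending loop mentioned in "Take 1" could look roughly like the sketch below. It is only an illustration: `append_messages` is a hypothetical helper, not part of websockets, and it accepts any async iterable of messages. A websockets connection can be passed in directly in releases that support `async for`; otherwise the same loop can be written around `recv()`.

```python
import asyncio
from typing import AsyncIterable, Union


async def append_messages(
    messages: AsyncIterable[Union[str, bytes]], path: str
) -> int:
    """Append every incoming message to `path`; return the bytes written.

    Hypothetical helper: `messages` is anything usable with `async for`.
    """
    written = 0
    with open(path, "ab") as log:
        async for message in messages:
            # Text messages arrive as str, binary messages as bytes.
            data = message.encode("utf-8") if isinstance(message, str) else message
            # Plain blocking write; acceptable for small log lines, but a
            # thread executor would be kinder to the event loop for big ones.
            log.write(data)
            written += len(data)
    return written
```

Each message is still fully assembled in memory before it is written, which is exactly the cost the rest of this thread discusses.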
Thanks for your thoughts. The use case I had in mind was--
Since a websocket connection was already established, I was thinking it would be easiest to re-use that connection. Your suggestion in "Take 1" is to receive the files "out of band." It seems that suggestion would require either running another server on a different port, as you suggest here, or else processing the POST "manually" in websocket's HTTP hook, which might be a pain (I'm not sure yet). Independent of the question of file uploads, I think it would still be worth discussing whether an API that exposes a stream over a bytes return value would be useful. Maybe there are reasons why such an approach wouldn't end up saving anything significant, or maybe not. |
I'll just add here that plain old HTTP already offers plenty of features aimed squarely at file transfer:
The beauty of websockets is that you already have HTTP. So really, as far as file transfer goes, I would really advise against reimplementing using websockets what HTTP already offers. Now about the streaming API you mention, I think it may be overkill. Not only that, but browsers (for which WebSockets was created in the first place) don't have any sort of streaming API: plain method calls for Cheers |
I think you may be misunderstanding what I'm suggesting. I'm not suggesting "streaming" in the sense of a use case. I'm suggesting the idea of the library's Here is one example in the code where a bytestring is being created in memory and could perhaps benefit from a file-like object:
Actually, if we're speaking of the websockets package, you don't really. For example, from websockets' documentation:
So, many of the features you have in mind likely aren't present in the library. What's driving this issue in part is the possibility of doing simple file transfers within the websockets package without having to add the complexity of a full-blown, heavy-weight HTTP server. |
In the "send" direction, one probably relatively easy thing to do would be to update the |
Yes, the line you're quoting above is in charge of reassembling fragmented messages. This is separate from the premise of this discussion, which is about "assembling frames". However I think that's the right level to discuss this. Let's not invent an additional fragmentation mechanism over multiple messages, there's already one over multiple frames. I'm interested in investigating smarter ways to handle reassembly of fragmented messages. In fact the RFC hints at this possibility:
That could be quite hard to fit into the current architecture, though. |
Yes, I wasn't suggesting adding anything to assist with multiple messages. I do need to familiarize myself with fragmented messages, though. But either way, wouldn't that line be affected by a "receive" API capable of returning a bytes-like object -- the idea being that you wouldn't need to join individual byte strings to create a larger one if you're dealing with bytes-like objects? |
Yes, that line would need to change if we provided a streaming API for incoming fragmented messages. Fragmentation in WebSockets is pretty simple:
The non-fragmented case follows the same rules; there's only one frame which is both the first and the last one. |
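For reference, the fragmentation rules from RFC 6455 (the first frame carries a text or binary opcode, later frames carry the continuation opcode, and the last frame has the FIN bit set) can be sketched as a generator. The `Frame` class and `iter_fragments` function are hypothetical names for illustration, not the library's internals, and control frames (ping, pong, close) that may be interleaved between fragments are left out.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator

# Opcodes defined by RFC 6455.
OP_CONT, OP_TEXT, OP_BINARY = 0x0, 0x1, 0x2


@dataclass
class Frame:
    """Hypothetical, simplified data frame (not the library's own type)."""
    fin: bool
    opcode: int
    data: bytes


def iter_fragments(frames: Iterable[Frame]) -> Iterator[bytes]:
    """Yield the payload of each frame of one message, first to last."""
    first = True
    for frame in frames:
        if first:
            if frame.opcode not in (OP_TEXT, OP_BINARY):
                raise ValueError("a message must start with a text or binary frame")
            first = False
        elif frame.opcode != OP_CONT:
            raise ValueError("expected a continuation frame")
        yield frame.data
        if frame.fin:
            return  # The FIN bit marks the last frame of the message.
    raise EOFError("frames ended before the final fragment")
```

Joining the fragments with `b"".join(iter_fragments(frames))` gives the buffered behaviour, while consuming them one at a time (for example, writing each to a file) avoids building the full message in memory.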
By the way, I could still be confused about what you have in mind because there is a bit of ambiguity in the phrase "assembling frames." It can be interpreted to mean either "assembling multiple frames to form a single message" or "assembling frames [from their parts]." (The latter interpretation could be what the implementation note is getting at where it refers to partial frames: "a receiver doesn't have to buffer the whole frame in order to process it.") What also makes it confusing is that the phrase "fragmented messages" has a similar ambiguity. It can be interpreted to mean either a single message fragmented into multiple frames, or an end-user dividing a single "message" (in the broad sense of the word) into multiple websocket messages. (The latter is what I was agreeing this issue shouldn't be about.) If I'm interpreting things correctly, the idea would be to possibly expose each frame as a bytes-like object, and also expose each message as a bytes-like object (which internally could be implemented by accessing the bytes-like objects of each frame in sequence). Also, with this approach, I believe |
Ugh, I can't make sense of what I wrote yesterday, I must have swapped some words, sorry :-( Let me try again:
Your interpretation is correct anyway. Since
|
Right, that's one of the approaches that occurred to me, too. |
Also, thanks for clarifying.
I just want to clarify / add that the "IMPLEMENTATION NOTE" you quoted above suggests this can be taken a step even further, namely by handling "partial frames" (i.e. as a frame is coming in). This is different in that it would also affect cases where the message isn't fragmented but is coming in as a single frame. So even the individual frame itself wouldn't need to be assembled in memory (if I'm interpreting that portion of the RFC correctly)... |
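To make the "partial frame" idea concrete: the frame header states the payload length up front, so a reader can parse the header and then consume the payload in bounded chunks. The sketch below follows the RFC 6455 frame layout but uses a blocking binary stream for brevity; `read_frame_header` and `iter_payload` are hypothetical names, and nothing here reflects how websockets itself reads frames.

```python
import struct
from typing import BinaryIO, Iterator, Tuple


def read_frame_header(stream: BinaryIO) -> Tuple[bool, int, int, bytes]:
    """Parse a frame header; return (fin, opcode, payload_length, mask_key)."""
    head = stream.read(2)
    if len(head) < 2:
        raise EOFError("stream ended inside the frame header")
    fin = bool(head[0] & 0x80)
    opcode = head[0] & 0x0F
    masked = bool(head[1] & 0x80)
    length = head[1] & 0x7F
    if length == 126:  # 16-bit extended payload length
        (length,) = struct.unpack("!H", stream.read(2))
    elif length == 127:  # 64-bit extended payload length
        (length,) = struct.unpack("!Q", stream.read(8))
    mask_key = stream.read(4) if masked else b""
    return fin, opcode, length, mask_key


def iter_payload(
    stream: BinaryIO, length: int, mask_key: bytes, chunk_size: int = 4096
) -> Iterator[bytes]:
    """Yield the payload in chunks of at most `chunk_size`, unmasking as we go."""
    offset = 0
    while offset < length:
        chunk = stream.read(min(chunk_size, length - offset))
        if not chunk:
            raise EOFError("stream ended inside the frame payload")
        if mask_key:
            # The mask repeats every 4 bytes, indexed from the payload start.
            chunk = bytes(
                b ^ mask_key[(offset + i) % 4] for i, b in enumerate(chunk)
            )
        offset += len(chunk)
        yield chunk
```

With this shape, memory use is bounded by `chunk_size` rather than by the frame size. An asyncio version would presumably read from a `StreamReader` instead of a blocking stream.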
By the way, this recent thread (Oct. 18 with subject "APIs for high-bandwidth large I/O?") on the async-sig list might be of interest: |
Changing the title to reflect where the discussion took us. |
The discussion here got quite long. In order to make it easier to move forwards, I split it into smaller issues.
I left aside partial read of frames (the IMPLEMENTATION NOTE I quoted from the RFC) because partial reads don't feel natural in I hope I didn't miss anything major. If I did, let's open additional issues. |
I'm wondering about the use case of transferring potentially large files over a websocket, where in the end the file would get written to the file system.
Currently, it seems the only way to do this with websockets's current API is to write a bunch of (large-ish) bytestrings (of size max_size) to the file system as they are read using recv(). I'm wondering if this approach seems perfectly fine, or if it would make sense for websockets to expose some kind of streaming API to bypass the creation of the intermediate bytestrings. This would be analogous to requests's "streaming" mode.