Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Upload Tag extension #163

Merged
merged 22 commits into from
Oct 26, 2020
Merged

Upload Tag extension #163

merged 22 commits into from
Oct 26, 2020

Conversation

felix-schwarz
Copy link

@felix-schwarz felix-schwarz commented Sep 13, 2020

This PR is a continuation of #157, with the following key differences:

  • removal of Challenge extension to limit scope to just Client-Tag. The Challenge extension is now tracked in Challenge Extension #162.
  • add language requiring servers to reject unauthenticated requests including a Client-Tag (temporary, until Challenge Extension #162 is also available)
  • reorder headers in the examples for better readability

The Upload Tag Extension to the tus protocol proposed here allows clients to resume an upload even if they did not receive the response to the creation or creation-with-upload request with which they initiated the upload.

The mechanism for this is relatively simple:

  • with the Upload Tag Extension, clients can generate a Upload Tag (typically a UUID) and send it alongside the initial creation or creation-with-upload request.
  • if the connection is interrupted before the client receives the response with the URL of the created resource, the client can use the Upload Tag as a "backup" to retrieve the URL of the created resource.

Mobile client apps benefit of this extension in various ways:

  • by taking advantage of creation-with-upload, complete uploads can be handed off to OS-managed background HTTP request queues (that continue to run when the apps themselves are already paused / suspended) for execution as soon as network connectivity is next available - without anxiety regarding the reliability of the connection
  • if a creation-with-upload request doesn't complete, the Upload Tag Extension can be used to retrieve all information necessary to resume the upload right from the point where it was interrupted - not wasting a single byte (of expensive cellular data) of bandwidth to retransmit data that has already been transmitted. The upload will finish eventually - even on a very unreliable connection. And on a reliable connection, there's the opportunity for the upload to finish with maximum efficiency - with a single request.
  • without the Upload Tag Extension, the only reliable way for mobile apps to ensure a piece of upload data is sent only once would be to limit itself to use only the creation extension. Using that extension, however, generates a usage pattern of an initial (creation) request and an immediate (data upload) follow-on request - a usage pattern that is detected and penalized (with long delays) for background HTTP queues, by at least iOS.

In my proposal, I tried to match the language and style of the existing specification as closely as possible.

Feedback welcome! 🙂

felix-schwarz and others added 13 commits June 8, 2020 16:30
…ther upload with the same ID is already ongoing.
- clarify format of the Client Reference is not limited to UUIDs
- add additional error condition for when a Client Reference is not accepted by the Server
- add additional headers that can be sent in response to the OPTIONS request to indicate the server's expectations and behaviour
- rename Tus-Client-Reference-Validity to Tus-Client-Reference-TTL
- addition of a length limitation for Upload-Tags
- removal of Tus-Client-Reference-Foramt
- removal of Tus-Client-Reference-Validity
…rement for authentication if these aren't used; minor formatting improvements.
- updates to status codes for various conditions
- clarify that Upload-Tag-Secret should only be sent for POST requests and an error be returned if sent for other requests
- rephrase Upload-Tag-Secret definition and add base64 encoding length information
…d rename the respective headers (remove the "Tag" part).
- add language requiring servers to reject unauthenticated requests including a Client-Tag
- reorder headers in the examples for better readibility
smatsson added a commit to tusdotnet/tusdotnet that referenced this pull request Sep 16, 2020
@smatsson
Copy link

I put together a POC/draft implementation of Client-Tag here: https://github.com/tusdotnet/tusdotnet/tree/POC/Client-Tag

One can test it live at https://tustemp.azurewebsites.net/files using Postman or similar. Disclaimer: There might be bugs since it's a POC.

Restrictions in place:

  • The site requires basic auth but any combination of user/pass is accepted. The authentication is only used to make sure that only the user that created the client tag has access to it.
  • Unauthenticated users are not allowed (as per the current spec).
  • File sizes are limited to 1 MB
  • All files not written to in 30 minutes will be deleted (regardless of if it's completed or not)

Here is a list of things I found doing this implementation and possible solutions (in no particular order):

  1. I found it a bit confusing that the extension is called client-tag but the header is called upload-tag. Should we rename the header? Or is it just something to get used to?

  2. Returning 409 conflic on create leaks data on what tags are already used. Might not be an issue as you cannot access the tag anyway?

  3. I took the liberty of also implementing this for concatenation when multiple files are merged. It might be a small use case but I think it makes sense for consistency. We should change the text below

This allows Clients to resume an upload initiated through the Creation With Upload
or Creation extensions for which they did not receive the response with
the newly created resource's URL (for example due to a broken connection).

into something like:

This allows Clients to resume a created upload for which they did not receive the response with
the newly created resource's URL (for example due to a broken connection).

  1. Should we define what should happen if you try to do a HEAD request to the upload url without the Upload-Tag? In my reference implementation I chose to return 404 Not Found.

@Acconut
Copy link
Member

Acconut commented Sep 16, 2020

Thanks for moving this into a new PR, @felix-schwarz!

@Acconut
Copy link
Member

Acconut commented Sep 18, 2020

@felix-schwarz Would it be OK if I add some commits to this PR to show what I mean with some of my comments?

@felix-schwarz
Copy link
Author

Wow, thanks for your feedback @Acconut & @smatsson! 😍

I'm now working through your feedback from top to bottom and will commit a revised version taking your feedback into account once done.

@felix-schwarz Would it be OK if I add some commits to this PR to show what I mean with some of my comments?

@Acconut Sure, that'd be fine! Preferably after I pushed my changes, though, to avoid version conflicts.

@felix-schwarz felix-schwarz changed the title Client Tag extension Upload Tag extension Sep 23, 2020
@felix-schwarz
Copy link
Author

I put together a POC/draft implementation of Client-Tag here: https://github.com/tusdotnet/tusdotnet/tree/POC/Client-Tag

Awesome, thanks a lot!

Here is a list of things I found doing this implementation and possible solutions (in no particular order):

  1. I found it a bit confusing that the extension is called client-tag but the header is called upload-tag. Should we rename the header? Or is it just something to get used to?

Agreed. I'm changing the name to Upload Tag.

  1. Returning 409 conflic on create leaks data on what tags are already used. Might not be an issue as you cannot access the tag anyway?

As long as Upload Tags are tracked on a per-user basis, this shouldn't be a concern:

  • if the same user tries to use the same Upload Tag twice, it's legitimate error reporting. And where an intruder with credentials is a scenario to protect against, the Challenge Extension could be used to ensure the intruder would only get a 404 response even when using a "live" Upload Tag - and no data on which Client Tags are already in use would be leaked.
  • if it's used by user A and B, no conflict would be reported - and both could be using the same tag for different uploads. 409 would not be returned in that case
  1. I took the liberty of also implementing this for concatenation when multiple files are merged. It might be a small use case but I think it makes sense for consistency. We should change the text below

This allows Clients to resume an upload initiated through the Creation With Upload
or Creation extensions for which they did not receive the response with
the newly created resource's URL (for example due to a broken connection).

into something like:

This allows Clients to resume a created upload for which they did not receive the response with
the newly created resource's URL (for example due to a broken connection).

Good catch! I'll update that to:

This allows Clients to resume an upload they initiated, but for which they did
not receive the response with the newly created resource's URL (for example
due to a broken connection).

  1. Should we define what should happen if you try to do a HEAD request to the upload url without the Upload-Tag? In my reference implementation I chose to return 404 Not Found.

Right now the proposal doesn't require the inclusion of Upload-Tag headers with all follow-up requests - and I'm hesitant to include one. Some implementations might choose to use the Upload-Tag solely to recover the Upload URL with a single HEAD request - and then continue to use the Upload URL instead of the Upload Tag.

@felix-schwarz
Copy link
Author

@felix-schwarz Would it be OK if I add some commits to this PR to show what I mean with some of my comments?

@Acconut Sure, that'd be fine! Preferably after I pushed my changes, though, to avoid version conflicts.

I've just pushed all of my existing changes, so please go ahead.

@smatsson
Copy link

  1. Should we define what should happen if you try to do a HEAD request to the upload url without the Upload-Tag? In my reference implementation I chose to return 404 Not Found.

Right now the proposal doesn't require the inclusion of Upload-Tag headers with all follow-up requests - and I'm hesitant to include one. Some implementations might choose to use the Upload-Tag solely to recover the Upload URL with a single HEAD request - and then continue to use the Upload URL instead of the Upload Tag.

I think I was sloppy in my question here. For a HEAD request for e.g. /files (note no file id), should we have a predefined response if the Upload-Tag is missing? If an upload tag is provided during creation the Client can do one of the two following requests to get the file info:

HEAD /files/24e533e02ec3bc40c387f1a0e460e216 HTTP/1.1
Host: tus.example.org
Tus-Resumable: 1.0.0

OR 

HEAD /files/ HTTP/1.1
Host: tus.example.org
Tus-Resumable: 1.0.0
Upload-Tag: myTag

If a request for the latter is received without a Upload-Tag header, what's the best way to handle it? 404? 400?

@Acconut
Copy link
Member

Acconut commented Sep 29, 2020

I made a few changes to the PR:

  • Adjusted the wording, formatting and phrases to match the rest of the document
  • Added instructions what error codes the server must return if the headers are missing or invalid (as suggested by @smatsson)
  • Adjusted the limit of the Upload-Tag header to be 256 characters
  • Made the authorization of users optional instead of mandatory

Let me explain reasons for the last change: In its current form, the tus protocol does not require authentication at any point, which was a decision we made intentionally. There are applications were authentication is not needed and I would like tus server in these circumstances to also be compatible with the standard. For example: The demo server at https://master.tus.io does not require authentication because it is a public demo server. Any data uploaded there should be considered publicly available, so authorization is not needed. In fact, authorization would probably make it harder for users to try the demo server with various tus clients.

I would consider this PR to be ready for merge in its current form, unless you have any feedback that should be incorporated. What do you think, @smatsson and @felix-schwarz?

@smatsson
Copy link

@Acconut Some minor things:

Clients MAY then use the returned upload URL in the Location header to resume the upload.

To me, this reads as there are other ways to continue the upload. Clients SHOULD use the returned upload URL in the Location header to resume the upload. would be a better fit IMO.

If the Upload-Tag header is invalid in a POST or HEAD request, the Server MUST respond with the 400 Bad Request status.

Would this imply being non-ascii and/or larger than 256 characters? Just want to double check.

I think the security considerations part is spot on and would be easily expanded to work with the challenge extension.

In the examples, should we remove Authorization? It's not used anywhere else in the spec.

Should we update the spec version? I know we are still discussing how to do this but this seems like a good opportunity if we want to increase the minor version?

- Security Consideration: change "requirement" to "recommendation" to bring the statement in line with the rest of the segment
@felix-schwarz
Copy link
Author

Sorry it took me a while to get back to you!

@Acconut Thanks for the feedback and making these changes. I agree with all of them.

In the "Security Considerations" part, I changed "to fulfill this requirement" to "to fulfill this recommendation" to match the rest of the segment.

@smatsson

Clients MAY then use the returned upload URL in the Location header to resume the upload.

To me, this reads as there are other ways to continue the upload. Clients SHOULD use the returned upload URL in the Location header to resume the upload. would be a better fit IMO.

Good point. I changed that accordingly.

If the Upload-Tag header is invalid in a POST or HEAD request, the Server MUST respond with the 400 Bad Request status.

Would this imply being non-ascii and/or larger than 256 characters? Just want to double check.

I think this would include any case where an Upload-Tag is not valid.

In the examples, should we remove Authorization? It's not used anywhere else in the spec.

Even though authorization is only recommended, not required, I'd prefer to leave these in so that the examples show best practice. But removing them would also be ok with me.


@Acconut From my side, the PR is ready to merge in its current form.

@smatsson
Copy link

+1 from me on the above. Ready to merge :)

@smatsson
Copy link

Do we wish to update the protocol version as per the discussion in #165 before merging this?

@Acconut
Copy link
Member

Acconut commented Oct 14, 2020

Thank you for the update, @felix-schwarz! Looking good now!

Only one thing came to my mind: We should restrict the Upload-Tag to printable ASCII characters to exclude control sequences such as the NUL character (see https://theasciicode.com.ar/ascii-control-characters/delete-ascii-code-127.html). I think these are codes 32 to 126. What do you think?

Do we wish to update the protocol version as per the discussion in #165 before merging this?

Not sure, I would like to add a few more things to 1.1 so I am not sure if it's better to bump the version in a separate PR.

@felix-schwarz
Copy link
Author

@Acconut Sorry for taking so long to follow up!

I think that clarification regarding printable ASCII codes is a good idea and have changed it accordingly - referencing Wikipedia (I couldn't find an RFC defining the "printable ASCII codes" term) and the 32-126 range as quick reference.

@Acconut Acconut changed the base branch from master to 1.1 October 26, 2020 17:46
@Acconut
Copy link
Member

Acconut commented Oct 26, 2020

Thank you very much for the final touches! I added the list of allowed ASCII characters, so it's easier to understand what we are talking about. Also, added an exclusion for spaces, to remove edge cases and potential failure.

All in all, this can be merged in the 1.1 branch 🎉

@Acconut Acconut merged commit 9900994 into tus:1.1 Oct 26, 2020
@felix-schwarz
Copy link
Author

@Acconut Thanks for the final touch improvements! Awesome to see it in the 1.1 branch! 🎉

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants