-
Notifications
You must be signed in to change notification settings - Fork 104
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Can the bep52 spec clearly explain what happens with non-powers of two. #128
Comments
Hrm, good question. At the very least the index must stay properly aligned to ensure we minimize the amount of proof hashes needed. Allowing non power-of-two lengths at the tail would at least make some spec phrasings and offset calculations in implementations a bit more complicated I think such as finding the proof layers in a @arvidn how does libtorrent handle this today?
Requesting N and replying with N-x doesn't seem sane to me. Either we allow shorter requests or the client should reply with the padding hashes, otherwise it makes offset calculation and matching requests to responses even more complicated. |
This is exactly how proof layers work, right? The expectation from the spec is that I request the full set of proof layers (typically covering the range from piece layer -> the layer which generates the piece root) and the responding client explicitly omits the ones which aren't required. As such the spec guarantees you'll receive between 0 and the requested amount of proof layers. You have to figure it out how many were sent by counting the number of hashes in the response. The invariant being maintained is that you can merkle hash the "hashes" you requested until you generate the "root" of that subtree, and then each proof layer can be merkle hashed, in order, until you end up with the actual piece root. Right? If so, it would be good to clearly call that out in the spec too. It was gonna be my follow up question ;) |
You got a point there. At least it's explicitly documented. I agree, it does strike me as silly that we optimized that part but not the number of layer hashes one can request.
Well, you don't really need to count them since the number of base layer hashes is given in the
Unless you request fewer than necessary because you already have them from other requests. |
Yup! Agree on both points there! I think the docs explain that reasonably clearly - what you described is how I interpreted it. The only actual question then is what should occur when the layer length is not an exact power of 2. I'll see what arvid chips in with :) Thanks for the responses so far! |
@arvidn gentle ping just in case you saw this and forgot! |
Yes, that is valid according to libtorrent's interpretation.
I don't know what you mean by the multiple equal signs here: libtorrent ensures that the range of hashes you request all fall within the tree. You're allowed to request padding hashes, but you don't have to.
Ah, Those would be valid requests. It would also be valid to just request a single hash in the last request. |
These test cases illustrate how libtorrent responds to hash requests: https://github.com/arvidn/libtorrent/blob/RC_2_0/test/test_merkle_tree.cpp#L402 The parameters to |
That could've been clearer - my bad!
Perfect. I proposed a minor wording change to the spec, re-using the same language used to describe how the final piece/blocks are requested. From my understanding of the the tests you linked, and the intent of the spec, this feels appropriate. #129 If you think the spec should be more permissive and clients should be allowed request hashes beyond the length of the layer (i.e. padding hashes), then let me know and I'll update as appropriate. |
bittorrent.org/beps/bep_0052.html
Lines 419 to 423 in 7b7b41f
The spec is not clear about what should happen in the event a bep52 hashed file does not have an exact multiple of 2 number of nodes in a given layer. For example, a file has 7 pieces in it's piece layer. Note: The omitted piece
h
is referred to asp
as it's a padding hash.Is it valid to send a request message with:
index = 0, length = 7
as that is the full layer even though length is not a power of 2 as the spec requires?If not, is it intended that a request message would be sent for
index =0, length = 8
. If so, should this request be fulfilled by the receiving client by including a padding hash as the 8th hash in the Hashes response message, or should the client who receives this only include the 7 actual hashes?Similarly, if this layer were retrieved in smaller chunks, would it be correct to send a request for:
index = 0, length = 4
+index = 4, length = 3
, or should the requesting client sendindex = 0, length = 4
+index = 4, length = 4
?If option 2 is the intention, as a worst case scenario would it be correct (albeit inefficient) to request a layer with 513 nodes using this combination of requests:
index = 0 length = 512
+index = 512, length = 512
. 511 of the hashes would be padding hashes in this case.The text was updated successfully, but these errors were encountered: