-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add extra API for offline/online phases in ECDSA signing #1610
Comments
Indeed, and this is the drawback of this. It's a major footgun, and I'm sure this is the reason why it's currently implemented this way. Personally, I'm a bit on the fence. We're about to add a MuSig2 module, which is also prone to nonce-reuse failures -- but in that case, it is unavoidable to split nonce generation and the actual signing into two separate steps. In the end, I don't think I'm against giving users advanced and dangerous features, but they should be clearly marked as such. I'd love to hear your use case and scenario here. Noone has asked for this before, and I doubt many users will benefit. (And anyway, ECDSA can be considered somewhat deprecated in Bitcoin, but okay, the same could be done for Schnorr signing). Do you personally need/want this, or is this is just a suggestion for improvement? It would be interesting to know what other contributors and maintainers think. |
I think we have ways to make this less of a footgun. For example, here goes a simple mitigation. We can add to the precomputed material data struct a flag
Like this, a user has to go out of their way to use this unsafely (by modifying fields in an opaque struct). This doesn't cover the contrived case where the user is reusing precomputed material that got persisted before the flag change; but I think that's a narrow corner case that can be dealt with a clear warning: do not reuse nonce material if you're using this API. I imagine you've considered similar things for MuSig2, curious what you think.
I think this is the right tradeoff. We can clearly mark the precomputed material as secret and that it cannot be re-used.
I'm asking others to chime in and give better context. Long story short is that we're hitting a performance bottleneck in Bitkey hardware wallet (https://github.com/proto-at-block/bitkey). We need to speed up (by a lot) the signing operation for large transactions if we want to use libsecp256k1. |
I don't think it's contrived. What if you run inside a VM and the VM is reset? What if the caller decides to memcpy the nonce, or write it to disk? (We can document that one must not do this, but this doesn't necessarily mean that people will adhere to the rules.) The second large concern is that one cannot use deterministic nonce generation when the message is not known. So you'll need a good source of randomness during signing. But yeah, we're apparently willing to accept that risk in the MuSig2 API, so it's not entirely crazy to add a similar API also for ordinary signatures.
In the MuSig2 PR, we simply overwrite the nonce with zeros after using it. That's a tiny bit safer and doesn't need a flag.
I'm curious why you use ECDSA at all in a modern wallet. Schnorr signing is 27% faster on my machine, I'd be curious about the difference on the hardware wallet. The difference is probably mostly due to the RFC6979 hashing in ECDSA, which is overkill. So switching to a faster nonce derivation function should already speed up things a lot without introducing risks. Also, have you tried with a recent version (>=0.5.0), and have you tried all precomputation table sizes on your platform? (See https://github.com/bitcoin-core/secp256k1/blob/master/CHANGELOG.md#changed-1, note that we changed the default in 0.5.1) |
Thanks for the great discussion.
I agree rollback is bad. I think in the VM case, one would not use the new proposed API. One would just use the existing API. The new proposed API is suitable for a specific use case we're hitting "in real life": an embedded device (=constrained computing device) that a) needs to sign "very fast" and b) has plenty of time for precomputation before the signing request comes in.
We cannot use RFC6979 to generate the secret nonce when the message is not known. But that doesn't mean we can use other approaches for deterministic nonce generation (see next paragraph).
Yes, that is one way, but not the only one. Alternatively, one can use a monotonic counter / unique (public) value as input to a keyed hash to construct the secret nonce. I'm not necessarily advocating for this; it all depends on the specific use case and definitely we're in the territory of "you need to know what you're doing". I think the new API should be generic enough to allow the user to do this kind of choices.
Sounds reasonable, thanks.
Sure, that would also work. Happy to take this approach too.
I'll let others chime in, I don't have the numbers right now with me. Thanks for all the suggestions! |
Hopping in here -- this would be useful for Bitkey's firmware, which does transaction signing over NFC using We're considering using our chip's hardware accelerated ECDSA, but we prefer
Not all exchanges support sending to bech32m, so it's a bit difficult to eschew ECDSA completely. On our chip, ECDSA takes ~60ms and Schnorr takes ~55ms. We can't use large precomputation tables due to limited RAM. I do think the optimization being proposed here is mostly useful for resource constrained embedded systems, but it could be very useful in those settings. |
I think it's reasonable to expose such an API for both ECDSA and Schnorr. I would suggest doing it in much the same way that we expose custom nonce functions -- provide a second API that takes a callback and auxiliary data, whose role it is to produce a nonce keypair and inverted nonce. This would be symmetric with our existing ability to override the RFC6979, which is similarly niche and dangerous. (Though the "typical" use of overriding the nonce would be to use a faster PRNG, which is perfectly safe, while the "typical" use of overriding the nonce computation would be to use precomputed nonce values, which is fraught.) I would not suggest exposing an easy-to-use or first-class API for this. It's much more difficult to obtain a monotonic counter (or strong RNG) and to ensure your software is never forked or reset, than it is to use an awkward C API. So if we were to provide a friendly API, we'd just be making the "easy" part easier and potentially encourage people to use it without carefully considering whether they can do so safely. |
@jmecom Sorry for the slight tangent here, but:
Since #1058 landed in 0.5.0, the supported table sizes (for the key generation/signing code) are 2 KiB, 22 KiB, and 86 KiB. How much is acceptable to you? It's easy to add more configurations if there is a need for them. |
I just pulled in 0.5.1 and experimented with this. I didn't see a huge impact on the table size with a window size of two:
Window size of five took about 50ms @ 22kb. So, I don't think more fine-grained options are really necessary for our usecase at least. |
@jmecom By "window size", you mean the |
Ah, woops. Due to the way I'm profiling, there is a little variation across each run, so I mistakenly thought that lowered the duration, but it was just an anomaly. Ignore that, but the rest is accurate. |
Here goes an sketch of the proposed changes. Let me know what you think:
Some points for discussion:
I hope the API above follows this advice. |
The above proposal sounds reasonable to me (other than the name -- I agree "precomputednonce" or "detachednonce" sound better). But it does open a question about whether we would want an "off by default" module. Currently we have two categories of modules:
Originally I think we had imagined that users would be able to pick and choose what modules they need, but that's not really compatible with shared library distribution; and we also hoped that users would be able to write their own custom modules, but that doesn't really work with C tooling (though I suppose it could be made to work with nix stdenv and the Anyway, this module would presumably be "experimental", but unlike the other experimental modules, I don't think we would ever be willing to promote it to a "real" module. |
Thanks! I like "precomputednonce" better.
That sounds reasonable. Sounds like it would be better to mark this module as |
@real-or-random @sipa : I'd love to get some feedback before writing code. Do you think the proposal in #1610 (comment) is reasonable? Should we discuss a specific point? Would you be OK if I put up a PR adding this API? From #1610 (comment) it looks like this proposal is reasonable to @apoelstra. |
We'd love to split the ECDSA signing operation into two steps:
The advantage here is that the offline step can be precomputed arbitrarily early and do "most of the computational work". This is very useful for signers that have a lot of idle time before the signing request comes in, but little time to compute and return the signature. For example, hardware wallets.
In ECDSA, the offline step can essentially compute nonce generation + scalar multiplication + nonce modular inversion. These values ("precomputed material") are input to the online part. The online part just finishes the ECDSA computation. The online part is extremely fast (we're seeing around 5000x faster than the offline phase in a non-libsecp256k1 proof-of-concept).
The precomputed material is secret, can only be used once and needs to be wiped after usage.
API-wise this would probably mean adding two functions:
secp256k1_ecdsa_sign_split_phase_precompute()
secp256k1_ecdsa_sign_split_phase_online()
The composition of the two functions compute
secp256k1_ecdsa_sign()
. The input/output behavior does not change.Is there any appetite for this? Happy to write the code for this.
The text was updated successfully, but these errors were encountered: