Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding OpenSSL CLA requirement #113

Open
planetf1 opened this issue Nov 6, 2024 · 13 comments
Open

Adding OpenSSL CLA requirement #113

planetf1 opened this issue Nov 6, 2024 · 13 comments

Comments

@planetf1
Copy link
Contributor

planetf1 commented Nov 6, 2024

One of the potential projects consuming code from liboqs, and pq-code-package is OpenSSL.

In order for any code to be considered by OpenSSL, all contributors (current, future, and historical) are required to have signed the OpenSSL CCA. This is very similar to the Apache CLA. This will apply both at the individual contributor and organization contributor level.

Note that this in no way means it's certain OpenSSL will actually use our code. They are still evaluating the approach they wish to make. This is merely a prereq for consideration.

This discussion has been ongoing at the PQCA level both in terms of general approach ,as well as tooling to support.

As a TSC we need to consider if we are ok with this approach, whether we have any particular concerns, and provide feedback to the PQCA.

Suggest discussion at the 20241106 TSC meeting

@planetf1
Copy link
Contributor Author

planetf1 commented Nov 7, 2024

OpenSSL CLA

@mkannwischer
Copy link
Contributor

It would be great to have mlkem-native in OpenSSL.

Two points:

  1. @potsrevennil and me are happy to sign the CLA for mlkem-native and I don't see any issues with our employer signing as well. @hanno-becker and @rod-chapman, how do you think about this?

  2. Is see that openssl recently merged ML-KEM-768 from Boringssl in ML-KEM768 openssl/openssl#25848 into a feature branch. It would be good to understand how much they are looking for more parameter sets, more optimized code, or more assurance. Maybe @baentsch or @dstebila do have some more information on this that they could share with us, please?

@baentsch
Copy link

Thanks for tagging me @mkannwischer . Sure I can give some background: These discussions started before the establishment of PQCA (see openssl/project#431 for "history" and rationale). When they then didn't really progress, as a long-time OpenSSL contributor, I more recently chose to re-prioritize my non research-community-targeting FOSS contributions to that project where we currently proceed implementing all standards in a way completely separate from PQCA for various reasons, (missing) CLAs being one. fwiw, SLH-DSA is the most mature, but work is proceeding on the other algs as well, incl. ML-KEM, see e.g., openssl/openssl#26006.

That said, as I am an independent contributor myself, I personally love to work with any other independent contributor on the same terms of any given FOSS project. So if indeed in this case all authors (and where applicable, their employers) of an ML-KEM implementation are also willing to contribute under the same terms, let's discuss concrete ways forward, ideally in an open forum like openssl GH. While I by now also appreciate the boringSSL APIs, there's (as always) room for improvement, so you may consider responding to openssl/openssl#25879 for example as and when you'd be willing/able/permitted to contribute and make OpenSSL better. Another approach may be for you to open further issues at OpenSSL explicitly targeting functional and non-functional benefits of alternative code bases. That way, the whole OpenSSL community can chime in as to merit and priority of such features. In this case, please be sure to use the tag "ML-KEM" in the issue header so they're easier to find in the "sea of issues" :-) Tagging me also would help. Or even more simply, just open a PR -- in the case of ML-KEM, be sure to target the feature branch of that name for now. Thanks in advance!

@hanno-becker
Copy link
Contributor

hanno-becker commented Nov 25, 2024

The CLA would not be an issue from my employer's end either.

@mkannwischer
Copy link
Contributor

Thanks for tagging me @mkannwischer . Sure I can give some background: These discussions started before the establishment of PQCA (see openssl/project#431 for "history" and rationale). When they then didn't really progress, as a long-time OpenSSL contributor, I more recently chose to re-prioritize my non research-community-targeting FOSS contributions to that project where we currently proceed implementing all standards in a way completely separate from PQCA for various reasons, (missing) CLAs being one. fwiw, SLH-DSA is the most mature, but work is proceeding on the other algs as well, incl. ML-KEM, see e.g., openssl/openssl#26006.

That said, as I am an independent contributor myself, I personally love to work with any other independent contributor on the same terms of any given FOSS project. So if indeed in this case all authors (and where applicable, their employers) of an ML-KEM implementation are also willing to contribute under the same terms, let's discuss concrete ways forward, ideally in an open forum like openssl GH. While I by now also appreciate the boringSSL APIs, there's (as always) room for improvement, so you may consider responding to openssl/openssl#25879 for example as and when you'd be willing/able/permitted to contribute and make OpenSSL better. Another approach may be for you to open further issues at OpenSSL explicitly targeting functional and non-functional benefits of alternative code bases. That way, the whole OpenSSL community can chime in as to merit and priority of such features. In this case, please be sure to use the tag "ML-KEM" in the issue header so they're easier to find in the "sea of issues" :-) Tagging me also would help. Or even more simply, just open a PR -- in the case of ML-KEM, be sure to target the feature branch of that name for now. Thanks in advance!

Thank you very much for all that context @baentsch! It would be great to contribute.
I've posted a comment to openssl/openssl#26006. Let's move the discussion there.

@paulidale
Copy link

In order for any code to be considered by OpenSSL, all contributors (current, future, and historical) are required to have signed the OpenSSL CCA. This is very similar to the Apache CLA. This will apply both at the individual contributor and organization contributor level.

Some clarifications/notes:

  1. The project uses the Apache version 2 CLA (the only change is the project name from memory).
  2. Only contributors & their employers at the time of integration need to sign CLAs (i.e. future contributors don't need to unless their changes are pulled across).

@hartm
Copy link

hartm commented Nov 28, 2024

I've gone over this with our lawyers over the past couple of months. We can definitely make this work from the Linux Foundation perspective, provided that all of the contributors are willing to sign the CLA (and have their employers sign it if applicable). Note that this requirement would include all past contributors, too.

There are still a couple of things that need to be sorted out for this to work.

  1. Do you all want everyone in the entire PQCP to sign the CLA, or only have the requirement for certain repos? This is really up to you to decide. We can make it work either way.

  2. We need to figure out how to handle enforcement. Unfortunately, EasyCLA does not have the kind of cross-org support we need for this. So, we will probably have to do some kind of manual enforcement. The PQCP maintainers will need to only approve contributions from folks who have signed the CLA. We can make this a list and configure github to only accept contributors from this list, but it will still require communication with the OpenSSL team to ensure that folks have actually signed the CLA. Then, if they want to pull in code from PQCP, the OpenSSL folks can just check the contributors' list against their CLA database to be sure. Hopefully this will be not too much work for everyone involved.

So, the next steps should be as follows:

  1. Go ahead and start signing the OpenSSL CLA. It may take time in some cases to get companies to sign, so I'd encourage you all to do this soon. I'd also suggest starting to try to track down folks you know have contributed; it may be hard to reach some people (but hopefully not).

  2. Figure out a policy for the PQCP and CLA signing. This could be requirements for certain repos or the whole org. The PQCP TSC should pass (and clearly document) a rule on this.

  3. Coordinate with the OpenSSL folks on enforcement. The more we can automate, the better, but there will likely be some things that still have to be manually checked.

Does all of this make sense to everybody? Please let me know if there are any questions. Thanks for your time!

@planetf1
Copy link
Contributor Author

planetf1 commented Dec 1, 2024

Thanks @hartm .

Just my thoughts (I think we should have finally agreement via a tsc vote)

  1. Simpler in some ways if applies to whole org, however I'd be ok if we focussed just on any repos where it's required - as we can scope the contributors tightly. It then would also not prevent us taking on a new project where the cla wasn't appropriate if we wished to
  2. automated is obviously preferable. At a min we should update the PR template/project contribution guide

We have a TSC this week - suggest we use some time in the meeting to talk about the practicalities.

@planetf1
Copy link
Contributor Author

We discussed this at the 20250116 meeting. We'll continue today. For now the focus is on mlkem-native. Any CLAs will need to include contributors to the original reference implementations upon which mlkem-native was based.

@mkannwischer
Copy link
Contributor

We discussed this at the 20250116 meeting. We'll continue today. For now the focus is on mlkem-native. Any CLAs will need to include contributors to the original reference implementations upon which mlkem-native was based.

I have a small update on this: I have been in touch with the Kyber team regarding CLAs for the pqcrystals implementations and initial feedback was quite positive. As a first step, we are trying to get signatures from everyone listed in https://github.com/pq-crystals/kyber/blob/main/AUTHORS. I don't know how long that will take.

@baentsch
Copy link

Thank you very much for all that context @baentsch! It would be great to contribute.

As I see you revive(d) this thread and as quite some time has passed since the above, allow me to give an update/heads-up only as an "interested bystander" (definitely not to be understood as any "projects' official opinion" but just my personal thoughts):

IMO, the OpenSSL ML-KEM implementation has matured massively by now and I personally think an equally massive functional and/or non-functional delta would have to be present(ed) to update it via PR.

In order to provide a rough estimation of work effort (required on your side as well as the reviewers'), I'd like to share some code sizes/relations that I consider pretty representative/tell-tale: The size of the diff between feature/ml-kem and master is 28371 (LoC) of which less than 10% (2510) are attributed to the actual crypto/ml_kem code, 3885 to the necessary provider integration, 16380 to tests and the rest to "ancillary" changes required to add proper support for ML-KEM (sizes determined by running git diff master <subtree> | wc on the feature/ml-kem branch).

Why do I mention this to you, @hanno-becker @mkannwischer ? IMO, what "mlkem-native" currently has in terms of functionality equates (roughly) to the 2510 LoC above. What you would absolutely have to add to raise a PR trying to propose your code to OpenSSL is the equivalent of the 3885 LoC above, i.e., much more than what you already have (assuming all tests pass immediately). Quite a bit of that is boilerplate, but doing it requires lots of internal OpenSSL know how that requires quite an effort to acquire (and I won't claim to have it).

Complicating things more, I do not think the "mlkem-native" APIs are currently suitable to support the OpenSSL breadth of options e.g. outlined here. Worse, IMO the integration of your code to liboqs does not help this integration effort in any way as the APIs there also currently are not "wide" enough (extended vs seed key representation). In addition, the strict separation of key representation code between liboqs and oqsprovider (that would allow operations in OpenSSL) plus the lacking quality and functionality in the latter doesn't help deliver the "missing 3885 LoC" -- and that is a pretty well-founded expert assessment for a change.

This is not to discourage you from pursuing an integration of your code to OpenSSL, but only meant as a warning that you have way more than half the way to go in terms of actual work (in my personal opinion, you have walked maybe 10% of the journey) -- and this is completely independent of the CLA issue you'd need to sort in addition as per this issue as well as the effort that would be required to review your contribution (pretty much the same again that already went into the current code in feature/ml-kem).

Again, all of this is just my personal opinion meant to "shine some light on the path ahead" as someone who has traveled some parts of it in the past 4 years. As usual, I may be off, but at least no-one can claim I didn't share what I think I know. Have a good weekend.

@hanno-becker
Copy link
Contributor

@baentsch, thanks for your thoughts -- your experience working on OpenSSL is very valuable here. Actually, allow us to get straight to it: Would you like to work with us (me and @mkannwischer) on the integration of mlkem-native into OpenSSL, and the (re-)shaping is needed on mlkem-native and libOQS to make that happen? I think we'd stand a much better chance if we had your hands-on support here.

@baentsch
Copy link

baentsch commented Feb 2, 2025

Simple answer: No.

In case you have interest and time for more background, read on.

and the (re-)shaping is needed on mlkem-native and libOQS to make that happen

I think --and I keep repeating saying this since a long time-- that an integration of mlkem-native to liboqs does not bring any advantage to the goal of an

integration of mlkem-native into OpenSSL

For more detailed arguments, see e.g. here and here.

FWIW&FYI, my personal motivation in all of this is to help improve the PQC software ecosystem both in an efficient manner (as I'm doing this unlike everyone else in PQCA completely voluntary and unpaid) and one beneficial to end users (not corporate "FOSS free riders" or "PQC marketing"): I currently cannot see how an integration of mlkem-native with OpenSSL (let alone via liboqs) could align with either of these two personal goals. Moreover, there's many additional external issues that has me think of a fight against windmills...

If you'd like to discuss directly (as well as exchange more "background thoughts" too long to write down), please feel free to send me an email to set up a direct conversation, @hanno-becker @mkannwischer .

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: TSC Discussion
Development

No branches or pull requests

6 participants