Allow third-party pull requests to use the CI system #54

FooBarWidget · 2020-08-11T09:21:21Z

The CI system is currently not usable by third-party pull requests. The main problem is that GitHub Actions does not allow CI runs for third-party pull requests to access this repository's secrets. Thus, third-party CI runs don't have access to any of our infrastructure.

This is intentional behavior on GitHub's part: otherwise, anyone could submit a malicious pull request that steals or abuses the secrets.

Currently, all this means is that test publication to Bintray fails. But when #53 is implemented (which makes the CI system publish artifacts to a Google Cloud Storage bucket), things become worse: nearly all CI jobs will become defunct for third-party pull requests.

I can think of three ways to solve this.

Alternative 1: Bring-Your-Own-Infra

Teach contributors how to set up their own infrastructure, and how to hook their pull requests up to it.

I don't like this solution because it's more work for contributors, and because it can pollute pull request code with their own infrastructure identifiers.

Alternative 2: Manual approvals

1. We modify the CI system to do nothing when it's run from a third-party pull request.
2. When a third-party pull request comes in, a bot posts a comment saying that a human will manually review the PR code to check whether it modifies the CI to do anything malicious.
3. If nothing malicious is found, the human signals this by posting a comment containing a bot command.
4. The bot pulls the pull request into a temporary branch on the official repository. GitHub then kicks off a CI run for that temporary branch. The bot posts a comment saying that a CI run has started, with a link to the CI run page.
5. When the CI run is finished, the bot posts a comment reporting the CI results.
6. If the original creator of the pull request updates the pull request, the human reviews the changes again for maliciousness, and we go back to step 3.
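To make the first step above concrete, here is a minimal sketch of how the CI could decide whether it is running for a third-party pull request. It assumes the job has the GitHub `pull_request` webhook payload available as a dictionary. The function name `is_third_party_pr` is hypothetical, and treating a deleted fork as untrusted is my assumption, not something decided in this issue.

```python
def is_third_party_pr(event: dict) -> bool:
    """Return True if the event is a pull request whose head lives in another repo."""
    pr = event.get("pull_request")
    if pr is None:
        # Not a pull request event (for example, a push to a branch).
        return False
    head_repo = pr["head"].get("repo")
    if head_repo is None:
        # Assumption: the fork was deleted, so we treat the PR as untrusted.
        return True
    # A PR is third-party when its head repo differs from the base repo.
    return head_repo["full_name"] != pr["base"]["repo"]["full_name"]
```

With a check like this, the CI can exit early for third-party pull requests and leave the rest of the flow to the bot.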

Alternative 3: Public infrastructure for third-party pull requests

Introduce new infrastructure which is publicly accessible, and which lives in parallel to the existing infrastructure. This new infrastructure exists for the sole reason of allowing third-party CI runs. Thus, a few things are important:

Clear usage limits must be set.
Data must be periodically wiped.
Access should be logged.

Here's how it works:

The CI no longer accesses Bintray directly. And when #53 is implemented, it won't access Google Cloud Storage directly either. Instead, we introduce an API server. The CI accesses all infrastructure resources through this API server instead.

When the CI is run from the official repository, we authenticate with the API server through an official password. The API server then accesses the main infrastructure.

When the CI is run from a third-party pull request, no authentication with the API server occurs. The API server then accesses the public infrastructure.
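The routing rule described above could be sketched roughly like this. All names here (`select_infrastructure`, `MAIN_INFRA`, `PUBLIC_INFRA`) are hypothetical. Rejecting a wrong password, instead of silently falling back to the public infrastructure, is my assumption, since the description only covers the authenticated and unauthenticated cases.

```python
import hmac
from typing import Optional

MAIN_INFRA = "main"      # hypothetical label for the existing infrastructure
PUBLIC_INFRA = "public"  # hypothetical label for the parallel, public infrastructure


def select_infrastructure(supplied_password: Optional[str], official_password: str) -> str:
    """Pick the infrastructure that an API request should be routed to."""
    if supplied_password is None:
        # No authentication: the request comes from a third-party CI run.
        return PUBLIC_INFRA
    # Constant-time comparison, to avoid leaking the password through timing.
    if hmac.compare_digest(supplied_password.encode("utf-8"),
                           official_password.encode("utf-8")):
        return MAIN_INFRA
    # Assumption: a wrong password is an error, not a fallback to public.
    raise PermissionError("invalid API server password")
```

Failing loudly on a wrong password means a misconfigured official CI run is noticed instead of quietly publishing to the public infrastructure.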

Problems with this approach:

Because it's publicly accessible, everybody could mess with everybody else's data. Since we don't store any important, non-recoverable data on this infrastructure anyway, I think this issue is acceptable. The only risk is that anyone could potentially perform a Denial-of-Service on this infrastructure, either by constantly deleting everything, or by filling it up to its limits.
We need a second Bintray account. That's going to be an issue. Or we should move away from Bintray.

Conclusion

I don't like alternative 1. Alternative 3 sounds like a lot of work. So I think alternative 2 is the best.

FooBarWidget · 2020-11-05T13:02:19Z

Here's a specification of alternative 2's behavior.

Diagram source: reviewbot.drawio.zip (made with Diagrams.net)

The main thing I added compared to the textual description above is the concept of a review ID, which avoids race conditions. Each time code is pushed, we generate a unique ID. When a developer approves, s/he has to specify that same ID.

Imagine this:

Someone opens a PR.
A developer reviews and approves. But right before approval is submitted, the submitter pushes more code.

Without a review ID, the bot would think that the developer also approved the second push, even though the second push needs to be reviewed.

The existence of such a review ID means that the bot needs to store state.
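As a rough illustration of the review ID idea, the sketch below keeps the state in memory. The class and method names are hypothetical, and a real bot would need persistent storage. Generating the ID with `uuid4` is also just an assumption, since any unique value per push would do.

```python
import uuid


class ReviewState:
    """Tracks the review ID of the latest push for each pull request."""

    def __init__(self) -> None:
        # Maps pull request number -> review ID of the most recent push.
        self._current_review_id = {}

    def on_push(self, pr_number: int) -> str:
        """Record a new push and return the review ID the bot should announce."""
        review_id = uuid.uuid4().hex
        self._current_review_id[pr_number] = review_id
        return review_id

    def approve(self, pr_number: int, review_id: str) -> bool:
        """Accept an approval only if it names the latest push's review ID."""
        return self._current_review_id.get(pr_number) == review_id
```

In the scenario above, the developer's approval carries the ID of the first push. Because the second push replaced that ID, `approve` returns `False` and the second push still has to be reviewed.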

FooBarWidget added the ci/cd (Issue related to the CI/CD system) label Aug 11, 2020

FooBarWidget added this to the Epic 4: Improve contributor friendliness milestone Aug 11, 2020

FooBarWidget changed the title ~~CI/CD system should publish new build environment images~~ Third-party pull requests should be able to use the CI system Aug 11, 2020

FooBarWidget changed the title ~~Third-party pull requests should be able to use the CI system~~ Allow third-party pull requests to use the CI system Aug 11, 2020

FooBarWidget added the help wanted (Extra attention is needed) label Aug 11, 2020
