Concurrent git commands on the same repo fail #211
Comments
- `git init` commands on the same repo fail
- Use it in the appropriate places in the tests
This issue has been automatically marked as stale because it has not had any recent activity. It will be closed if no further activity occurs. Thank you for your contributions 🙏.
We saw this again today when testing #545
OK I've caught this today with a new acceptance test. You can reproduce it with:
Here's the error we see:
It's interesting that it's using …
Relates-to #211

It's debatable whether this deserves to be surfaced in an acceptance test, but I'm not sure yet which layer is the right place to solve this, so it makes sense to put it in our full-stack tests in any case. And it's probably useful to document this capability.
Moved this into a branch: 211-support-concurrent-git-commands so we don't have too many WIP scenarios in the main branch.
I've resolved this for now by returning a 503 error with a 60-second retry interval.

We could also do more proactive exploratory testing on it. I can imagine there could be other errors coming from git beyond the ones it currently handles if we try doing several things at the same time, such as making a commit during a fetch, or fetching during a push.
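For context, here is a rough sketch of that stopgap, assuming an Express-style endpoint; the route path, the `repoIsBusy` check and the handler are all illustrative rather than the project's actual API:

```ts
import express from 'express'

const app = express()

// Hypothetical stand-in for however the app detects that another
// git command is currently running against this repo.
async function repoIsBusy(repoId: string): Promise<boolean> {
  return false // placeholder
}

app.post('/repos/:repoId/fetches', async (req, res) => {
  if (await repoIsBusy(req.params.repoId)) {
    // Ask the client to retry later instead of letting a concurrent
    // git command fail with an opaque error.
    res.set('Retry-After', '60')
    res.status(503).send('Repo is busy, please retry later')
    return
  }
  // ...run the fetch as normal...
  res.sendStatus(202)
})
```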
I've had a think about the options for resolving this "properly". Largely, it comes down to choosing which layer should handle the problem. Is it something for the local-clones git adapter to handle, or should the domain be aware that a repo could be busy? Or should the application itself be responsible for queuing commands for each repo, so that only one can run at a time?

One thing I'm pretty sure of is that we should avoid trying to solve this by bending the bullmq infrastructure to use one queue per repo. We tried this before (see ADR 0011) and it didn't work out well.

My hunch is that the problem could be solved by domain events, once we've implemented #242 so that any node can see all the events.
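To make the "application queues commands per repo" option concrete, here is a rough in-process sketch. It is illustrative only, and it would not help across multiple worker containers, which is exactly the limitation discussed later in this thread:

```ts
// Chains async git commands per repo so only one runs at a time.
class RepoCommandQueue {
  private tails = new Map<string, Promise<unknown>>()

  run<T>(repoId: string, command: () => Promise<T>): Promise<T> {
    const previous = this.tails.get(repoId) ?? Promise.resolve()
    // Start the command only once the previous one has settled.
    const next = previous.then(command, command)
    // Keep the chain alive even if this command rejects.
    this.tails.set(repoId, next.catch(() => undefined))
    return next
  }
}

// Usage: concurrent callers for the same repo are serialised.
const queue = new RepoCommandQueue()
queue.run('repo-123', () => fetchFromRemote('repo-123'))
queue.run('repo-123', () => commitChanges('repo-123')) // waits for the fetch

// Hypothetical stand-ins for real git adapter calls.
async function fetchFromRemote(repoId: string): Promise<void> {}
async function commitChanges(repoId: string): Promise<void> {}
```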
Ref: #211
Co-authored-by: Romain Gérard <[email protected]>
I wonder whether this is another instance of this same problem:
We're doing #572 first, which should cut down a lot of the problems by only doing fetches when absolutely necessary.
I have been thinking about this today, inspired by the GitHub Desktop blog post shared by @romaingweb on Slack.

If we want to get an exclusive lock on a Repo before working with it, I think the right place to do that would be in the … We can change this interface to offer either an …
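A purely hypothetical sketch of the kind of shape this seems to describe, an access layer that hands out either exclusive or shared access to a repo; every name here is made up, not the project's real interface:

```ts
// Hypothetical shape only.
interface RepoAccess {
  // Waits until no one else holds the repo, then runs the work exclusively.
  exclusive<T>(repoId: string, work: (repo: LocalRepo) => Promise<T>): Promise<T>
  // Read-only work that can overlap with other shared holders.
  shared<T>(repoId: string, work: (repo: LocalRepo) => Promise<T>): Promise<T>
}

// Placeholder for whatever the local-clones adapter exposes today.
interface LocalRepo {
  fetch(): Promise<void>
  commit(message: string): Promise<void>
}
```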
This issue has been automatically marked as stale because it has not had any recent activity. It will be closed if no further activity occurs. Thank you for your contributions 🙏.
We could use something like https://www.npmjs.com/package/redis-semaphore to hold a mutex on the repo while working on it.
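For illustration, this is roughly how redis-semaphore's `Mutex` could wrap a git operation. It assumes an ioredis client; the key naming, timeout values and `doGitWork` callback are assumptions, not anything from the project:

```ts
import Redis from 'ioredis'
import { Mutex } from 'redis-semaphore'

const redis = new Redis()

async function withRepoLock(repoId: string, doGitWork: () => Promise<void>) {
  // One mutex key per repo, shared across worker containers via Redis.
  const mutex = new Mutex(redis, `repo-lock:${repoId}`, {
    lockTimeout: 60_000, // auto-expire if a worker dies mid-command
    acquireTimeout: 30_000, // give up waiting after 30 seconds
  })
  await mutex.acquire()
  try {
    await doGitWork()
  } finally {
    await mutex.release()
  }
}
```

Because the lock lives in Redis rather than in a single process, this approach would still serialise access once there is more than one worker container.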
The concurrency problems no longer happen, since only one container is used for the workers. The problem will need to be revisited when more containers are needed. An ADR has been written in 382a2d0.
Co-authored-by: Julien Biezemans <[email protected]>
Co-authored-by: cbohiptest <[email protected]>
Related to #210, if you run a manual fetch immediately after a connect, you'll see a failure because the background fetch triggered by the domain rule seems to be clashing with the manual fetch.
The actual error I see is this:
So it looks like, at the moment, it's just caused by the config commands we run in `handleInit`, which we're blindly running each time you open a repo folder. We can probably avoid this, but there might be other problems with the other git commands once we get past this. Needs more investigation.
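A minimal sketch of one way to avoid the blind re-run, skipping the config step when it has already been done once. It assumes the config values are idempotent; `ensureRepoConfigured` and the marker file are illustrative, not the project's actual code, and on their own they reduce how often the config commands run rather than fully serialising simultaneous opens:

```ts
import fs from 'fs'
import path from 'path'

// Hypothetical guard: run the `git config` setup only the first time we
// open a clone, instead of re-running it on every open.
async function ensureRepoConfigured(
  repoPath: string,
  configure: () => Promise<void>,
): Promise<void> {
  const marker = path.join(repoPath, '.git', 'app-config-done')
  if (fs.existsSync(marker)) return
  await configure()
  fs.writeFileSync(marker, new Date().toISOString())
}
```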