[rbe] Local registry: build config diff ids from layers #15
+27
−8
The local registry receives the image config as input, which contains the hashes of all the layers in the image.
We were simply returning it as-is, but it turns out the layer hashes in the config do not always match the actual layers.
This is caused by an odd interaction in Databricks internal rules, where a docker layer can be the parent of a docker base, as in `docker_layer -> docker_base -> docker_layer`. Essentially, if the docker layer changes, the docker_base must be rebuilt, but nothing enforces that today (see https://github.com/databricks-eng/universe/blob/bf55a02cb20c5bb72992a4bb401212bea69f9397/bazel/rules/docker.bzl#L2051).
When that happens, we get errors like the ones seen on:
https://runbot-ci.cloud.databricks.com/build/Unit-Compile-Pr/run-logs/45377642
To work around this, we make the loader_tool behave like the original incremental loader: we overwrite the hashes in the config with the values computed from the actual layers:
rules_docker/container/incremental_load.sh.tpl
Line 75 in da6dc62
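
For illustration only, here is a minimal Python sketch of the idea, not the code in this PR. It assumes the config follows the standard image config layout, where `rootfs.diff_ids` lists the `sha256:` digest of each uncompressed layer tar in order. The function names `layer_diff_id` and `rewrite_diff_ids` are hypothetical.

```python
import gzip
import hashlib
import json
from typing import List


def layer_diff_id(layer_bytes: bytes) -> str:
    """Return the diff ID of one layer: sha256 over the uncompressed tar."""
    # Assumption: a layer is either a plain tar or a gzip-compressed tar.
    # The gzip magic number is 0x1f 0x8b.
    if layer_bytes[:2] == b"\x1f\x8b":
        layer_bytes = gzip.decompress(layer_bytes)
    return "sha256:" + hashlib.sha256(layer_bytes).hexdigest()


def rewrite_diff_ids(config_json: str, layers: List[bytes]) -> str:
    """Replace rootfs.diff_ids with values computed from the given layers."""
    config = json.loads(config_json)
    rootfs = config.setdefault("rootfs", {"type": "layers"})
    # Ignore whatever the config claimed and trust the actual layer contents.
    rootfs["diff_ids"] = [layer_diff_id(layer) for layer in layers]
    return json.dumps(config, sort_keys=True)
```

With this approach a stale config can no longer disagree with the layers being served, because the diff IDs are always derived from the layer bytes themselves. The cost is hashing every layer each time the config is requested, so a real implementation would likely cache the digests.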