Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve caching in GitHub Actions workflow #7990

Open
2 tasks done
hlg opened this issue Feb 10, 2025 · 3 comments
Open
2 tasks done

Improve caching in GitHub Actions workflow #7990

hlg opened this issue Feb 10, 2025 · 3 comments
Labels
documentation Issue concerns the documentation needs help Issue needs help by other contributors

Comments

@hlg
Copy link

hlg commented Feb 10, 2025

Description

I found that the Github action given to publish the site fails with error message Warning: Path Validation Error: Path(s) specified in the action for caching do(es) not exist, hence no cache is being saved. in the post run hook for actions/cache@v4. Currently the path specified to be cached is .cache. However, apparently the home directory is not the place the action is executed and thus the relative path fails to resolve. Further, it might be enough to cache the pip subdirectory. Hence, the path should be changed to include the home folder ~/.cache/pip.

I have tested the suggested change with the BIMserver doc and it can also be confirmed by looking at the examples in the Github Actions and the Cache action documentation.

Related links

Proposed change

see PR #7991

Before submitting

@hlg hlg added the documentation Issue concerns the documentation label Feb 10, 2025
@kamilkrzyskow
Copy link
Collaborator

kamilkrzyskow commented Feb 10, 2025

Hi @hlg,
the goal of this cache is not to cache pip installation, but to store the .cache directory of the plugins. In your case you probably don't run any plugins that generate caches like the Social plugin, so no .cache directory got generated. So your warning was likely a false positive.

You're right that pip should be cached, but I don't know if this is something to be added in the docs 🤔, as caching dependencies is a bit more complex, and most people only install material without plugins, so the installation doesn't take more than a few seconds. [EDIT: Forgot to mention the last part. Creating a cache in this case (of using only material) would add unnecessary compute time, as installing without cache would be faster, and there is also uv, which improves performance even more.]

The "fix" to the CI cache creation is on my radar since quite some time, I know that the update week / day is not optimal, because it should always follow the actual hash of the files. But there can be the case where there are no files, so the cache hash is empty and this created multiple issues in the past, hence this current implementation, which is an improvement from the previous one.
In the past, I've recommended to use separate save and restore actions to circumvent the week / day cycle, however this is not a simple solution for the docs, as in, easy to copy incorrectly 😅

But since then I've dabbled with cache creation when debugging another issue, which brought another idea:

Someone might want to implement it faster than me, so this is the idea from the top of my head:

  1. Revert cache-key creation to only look at .cache directory to get a hash-part with hashFiles('.cache')
  2. This creates an issue where the hash-part of the cache-key is empty because of the lack of files, and the restore-keys prefix becomes a primary-key, which will be always resolved first no matter what more recent cache is created
  3. So the solution is to setup such a cache-key prefix or sufix, so that an empty hash won't lead to generating a primary-level key.
  4. This would allow to always update the cache when needed, but wouldn't break when in case of lack of .cache directory.

Current restore-keys is mkdocs-material-, so my mistake from the past was doing it like this mkdocs-material-${{ hashFiles('.cache') }}, which combined with a lack of .cache resulted in the exact copy of the restore key mkdocs-material- creating the infinite restoration loop 🤦 🤦 🤦

So, the restore-keys should be mkdocs-material without the - at the end, or just mkdocs- without material-, and the generated key should contain the "bridge" part - or material- to prevent empty file hash from creating a primary-key

[EDIT2: The strategy above changes both the prefix and suffix, but mkdocs-material- could stay as the restore-key, and the generated key would have to be something like mkdocs-material-${{ hashFiles('.cache') }}-cache, and I believe this would also work ✌️ ]

A simple fix, right? Well, I need to setup a repository and test it, and haven't found the executive function to do this in the past few months, so somebody else could try it out ✌️

@squidfunk
Copy link
Owner

Thanks for reporting, and thanks for the analysis, @kamilkrzyskow! I'm undersand that this is still an area where we can improve. I'll change the title on this issue to zoom out, so we can focus on improving the situation. I'm currently short on time, so I'd be happy to collaborate on PRs. However, #7991 is not mergable due to the reasons stated by @kamilkrzyskow.

@squidfunk squidfunk changed the title Github action: fix pip cache path Improve caching in GitHub Actions workflow Feb 11, 2025
@squidfunk squidfunk added the needs help Issue needs help by other contributors label Feb 11, 2025
@hlg
Copy link
Author

hlg commented Feb 11, 2025

Indeed, I was not aware of plugins and caching in Material for MkDocs. Thanks for clarification. Maybe a note (or link) about the cache in question would be helpful as addition to the documentation. Otherwise - up to you to close this issue or keep it open for more general cache improvement.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation Issue concerns the documentation needs help Issue needs help by other contributors
Projects
None yet
Development

No branches or pull requests

3 participants