Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(cone-mode): sparse checkout cone mode support #1497

Open
wants to merge 32 commits into
base: master
Choose a base branch
from

Conversation

lmmx
Copy link
Contributor

@lmmx lmmx commented Feb 17, 2025

Following up on my previous submission #1495 (from #405) which laid the groundwork for cone mode I wanted to complete the task and now have got the test cases all working.

It's once again pretty late by the time I'm finishing up, so I'll review with fresh eyes but I've at least ran all the linters this time :-) Tests pass and are named normally.

@lmmx lmmx requested a review from jelmer as a code owner February 17, 2025 02:50
@lmmx
Copy link
Contributor Author

lmmx commented Feb 17, 2025

Ooh OK docstring parse errors, I'll return tomorrow 🙏

Copy link
Owner

@jelmer jelmer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, great to see cone support!

Left a bunch of comments inline.

@@ -0,0 +1,462 @@
# sparse_patterns.py -- Sparse checkout pattern handling.
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add tests for each of the functions/classes in this file.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep, this one is nice and simple to test as the file is just a readable list of patterns, will do tests next

@lmmx
Copy link
Contributor Author

lmmx commented Feb 18, 2025

Super, thanks will address ! 😄

@lmmx
Copy link
Contributor Author

lmmx commented Feb 19, 2025

Whew, okay all of those comments were tidying/refactoring tasks that are all done now. Also I'd somehow managed to misread the meaning of "un-exclude", what the docs mean there is that there is always the following base paths:

/*
!/*/

i.e. "include /*" (include all top level files) and "exclude /*/ (exclude all top level dirs)

In other words it says "don't include any dirs until told to do so"

So "un-exclude" is just a confusing way to say "include directories when told to do so".

To be super clear, as far as I know you never put excludes on except in that one line as a special case. This is why cone mode is simpler, because you are only dealing with the one type of pattern.

That should make it simple to test :-)

@lmmx
Copy link
Contributor Author

lmmx commented Feb 19, 2025

existing tests are now all passing again

@lmmx
Copy link
Contributor Author

lmmx commented Feb 20, 2025

Tests are almost all passing, will revisit and see if there are logic bugs or test bugs 🐛 🔨

@jelmer
Copy link
Owner

jelmer commented Feb 20, 2025

🎉 Looks great overall - I'll take a closer look when I have some more time to do a full review.

@lmmx
Copy link
Contributor Author

lmmx commented Feb 20, 2025

Whew okay all tests passing now, so that gives me confidence this is a functioning implementation of cone mode 😄

The one last tricky bit was ensuring that “patterns suffixed with a slash (meaning they are dirs) must only match directories but can match files in subpaths of the directory pattern”. My initial implementation had just said “let’s just treat these as only the directory and not deal with the subpaths for now”.

I’ll be able to go over this again and refactor the logic here now that the functionality is pinned down by tests.

@lmmx
Copy link
Contributor Author

lmmx commented Feb 20, 2025

I’d also point out (and have just edited the docstrings accordingly) that this is (still!) not the full implementation of git’s sparse checkout with all features, namely it is only the recursive cone mode, not the parent cone mode. I.e. when you say “add directory foo to the cone” you get all subdirs of foo, not just the top one. The recursive mode would be more work and I think it’s a perfectly fine MVP to introduce just this, and ensure we’re happy with this before any further round. PRs typically go wrong when you change course midway and this is looking stable here.

The distinction is mentioned in this blog post: https://github.blog/open-source/git/bring-your-monorepo-down-to-size-with-sparse-checkout/

IMG_2628

elif not matched:
# Not anchored: we can do a simple wildcard match or a substring match.
# For simplicity, let's use Python's fnmatch:
matched = fnmatch(path_str, pattern) or fnmatch(path_str, f"*/{pattern}")
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason this can't use dulwich.ignore.Pattern instead, since fnmatch doesn't exactly match?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ahhh have I reinvented a wheel 🤦‍♂️ I will try to edit it to use that then, my bad!

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Otherwise LGTM, thanks for adding tests and refactoring.

(I appreciate this isn't complete for sparse checkout support, but it's a big step towards fully supporting it so that's great!)

@jelmer
Copy link
Owner

jelmer commented Feb 22, 2025

Please also take a look at the failing tests - you probably want to compare sets rather than sequences?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants