Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bring pattern matching more closely in line with Git #17

Open
bk2204 opened this issue Apr 29, 2020 · 6 comments
Open

Bring pattern matching more closely in line with Git #17

bk2204 opened this issue Apr 29, 2020 · 6 comments

Comments

@bk2204
Copy link
Member

bk2204 commented Apr 29, 2020

In general, our pattern matching in this library doesn't match Git's behavior. We should overhaul our pattern matching such that it does indeed match, ensure that we have options for both gitignore and gitattributes, and add sufficient tests.

One approach that may be interesting is to convert to a regex-based implementation, although a less dramatic overhaul preserving the current code is also a possibility.

@chrisd8088
Copy link
Member

Indeed -- see git-lfs/git-lfs#4051, I think, for a relevant issue.

@ttaylorr
Copy link
Contributor

In general, our pattern matching in this library doesn't match Git's behavior.

I'm not sure that I agree. Special care was taken to port the tests from upstream's t0370 (https://github.com/git/git/blob/master/t/t3070-wildmatch.sh). There are a handful of cases I think I remember deciding not to support early on, but our tests in this library come from there.

One approach that may be interesting is to convert to a regex-based implementation

This is basically what the code was before we extracted it out and rewrote it here. It was a nightmare to maintain, and was far more brittle and non-compliant than the code is today.

@bk2204
Copy link
Member Author

bk2204 commented Apr 29, 2020

We definitely have some odd behavior around wildcards. I don't have an example testcase off the top of my head (I thought I did in my stash, but maybe that's in macOS), but there are things which don't match an expected pattern when you have multiple single wildcards in the path segment and the match is potentially ambiguous. I'll have to check on the particulars, but there are a few edge cases we don't handle properly that have bitten folks.

I think we also have some incompatibilities between our gitattributes and gitignore parsing, where we don't exactly match either behavior.

And we also have #14 as well, which is another fun wildcard issue.

It may be that these are relatively easy to fix, but I definitely think we need more testcases here. I'll try to come up with some that demonstrate the problem. I should have filed issues about those mismatches earlier instead of letting them accumulate in my head (or my stash). I had intended to come back and fix them, but never got a chance to.

Ideally, I'd like to fix these for 2.12 and then try to drop the special-casing that we have in Git LFS.

@ttaylorr
Copy link
Contributor

We definitely have some odd behavior around wildcards.

Some of these are probably unintentional bugs that I wrote when trying to match all of these test-cases, but the remainder are compatibility patches to retain old behavior from the old filepathfilter package. I always intended to drop that special casing when we got to LFS 3.0, but it may be prudent to do it sooner than that.

I think we also have some incompatibilities between our gitattributes and gitignore parsing, where we don't exactly match either behavior.

That's definitely a deficiency in git-lfs/wildmatch. I would love to see some patches to better handle the differences between these two cases.

Ideally, I'd like to fix these for 2.12 and then try to drop the special-casing that we have in Git LFS.

I think that's a smart direction to go in. I just want to caution you against the original title of this issue, which is to "overhaul" wildmatch entirely. I was tempted by this in early 2018 when I wrote this library, and it ended up being a much larger undertaking than I had originally planned. If I can save you some weeks of development effort when it would be easier to add patches on top of what is already written and documented here, I would be glad to do so.

@bk2204 bk2204 changed the title Overhaul pattern matching Bring pattern matching more closely in line with Git Apr 29, 2020
@bk2204
Copy link
Member Author

bk2204 commented Apr 29, 2020

I'm happy to take advice on what not to do. My intent with "overhaul" was basically "bring in line with Git", and if we can do that by making small changes, I agree that's the smart choice. There's no point in major changes if we like what we have and can make small ones.

I think bringing the compatibility into line with Git can be qualified as a bug fix (at least considering the number of issues we've seen about the incompatibiltiy), and so need not be a major version bump. Of course, if folks feel differently, I'm open to hearing that.

@ttaylorr
Copy link
Contributor

ttaylorr commented Apr 29, 2020 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants