-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
More robust reading of filenames #20
Conversation
Thanks a lot, I'm curious about error conditions:
Should these error out? Or is such a filename legal? I'd be happy to have them as part of the test suite (and I'm fine to consider them illegal and error out). I'm asking since you clearly have more of the diff file format in your head than me. Thanks again. |
Thanks for the edge cases. Coming up with these is sometimes tricky :-)
|
Thanks for your additional commits, what I had in mind is something with a starting quote:
And that fails at the moment with:
EDIT: I'm happy to fail in this case, but the parser dropping parts of the data and applying the wrong diff is not super-nice. |
True, there's an additional error case when the file starts with quotes. I've updated the tests so the destination filename starts with a quote. I have a feeling that it might not be possible to read the filenames with spaces and quotes with just the lexer reading a string, given busybox and git diff do the opposite things when they encounter quotes and spaces (git diff quotes spaces and escapes quotes, busybox always uses literal quotes). Might need to properly tokenize and postprocess the outcome. |
Dear @Leonidas-from-XIV,
Thanks. I added some comments on what I'd expect the outcome to be.
I have no file with a If you strive for supporting both git diff and busybox, please also feel free to do so :) My intuition is only to "better fail than use incomplete data" -- i.e. in the |
Generally I totally agree but I didn't want to add an unrelated change to the behavior to this PR which is just about the parsing of filenames.
|
402ea35
to
ed42c1c
Compare
Making it pass There is still edge case: if the file is called |
Thanks a lot for this work! I have several ideas on how to improve this PR, i'll take over this work if that's ok with you, and i hope to be done on Monday or Tuesday night if all goes well |
I think this is ready to review. Don't mind the commit history, i'll squash and clean things before merging. I strongly believe this supports everything GNU diff can throw at it and every other diff. Some OS have diffs that can output files that cannot be parsed by their own patch utility (e.g. if a filename contains a tab or a newline on any diff utility other than GNU or git it's pretty much game over) and for that we can't do anything. There is still the ambiguity between Once this PR is merged i'm hoping to:
What do you think @hannesm? |
Dear Kate (and Marek), I'm very happy of the work you're doing on this package. Incredibly happy. I briefly looked at the API change of this PR, and am mostly fine with it. The minor thing I find slightly annoying is the "exception Error" - I'd appreciate something that is more self-explanatory (i.e. 'exception Filename_parse_error of string' -- to include the "offending" data -- so if someone hits it, we easily get that information.
Awesome!
Nice!
I agree that deciding on a single error mechanism is good, and
\o/
yes, that'd be fantastic :D
👯 👍🏾
I'm happy. I've limited time for reviewing at the moment. If you're stuck / undecided on some decision, feel free to reach out to me. Otherwise, feel free to push stuff forward - I think you have full permissions to this repository. This PR is great, adds tests and fixes issues that exist for real. |
how does the last commit look? |
eeb0c9a
to
fc586be
Compare
Co-authored-by: Kate <[email protected]>
Co-authored-by: Marek Kubica <[email protected]>
I have a version cleanly rebased at https://github.com/kit-ty-kate/patch/commits/filename-spaces/ |
looks good, thanks! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for finishing and refactoring the PR. I have some small suggestions but it looks good overall and I think this gets as close to parsing a filename as we can get without getting extra info about the format to parse.
@@ -2,8 +2,9 @@ | |||
(name patch) | |||
(synopsis "Patch purely in OCaml") | |||
(public_name patch) | |||
(modules patch) | |||
(wrapped false)) | |||
(modules patch patch_lib fname)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My suggestion would be dropping this and just moving the executable into a different folder, thus new modules would be picked up automatically.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this can be done later
(** [parse s] parses [s] and returns a filename or [None] if the filename | ||
is equivalent to [/dev/null]. | ||
|
||
Returns [Error msg] in case of error. *) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure if this is a clear API, when I get Ok None
I am not sure I would think of /dev/null
.
I know I started with this idea originally, but I think returning Ok "/dev/null"
is better and then other tools on top can map "/dev/null" to whatever logic they find reasonable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This module is not accessible outside of the patch library so i think we should rather make the internal API consistent with the rest of the code which expects an option (until someone proposes something better, although i'm personally fine with the current state of things)
fc586be
to
7cde69d
Compare
Thanks! |
This adds a lexer based on ocamllex to read the filenames of the patches.
Fixes #17.