Named back-references validated too early #17
Comments
Why does Perl's behavior make sense?
2014-02-21 3:26 GMT+01:00 nbtrap:
I can think of two reasons:
By the way, Perl itself doesn't always handle things like the second example above correctly, but the developers have confirmed that it's a bug.
Crap. I didn't know CL-PPCRE tries to match named backreferences against every register with a given name. This complicates things. Here are the options as I see them:
My favorite is
I have no opinion regarding this. Maybe Edi has one.
-Hans
2014-02-22 15:12 GMT+01:00 nbtrap:
Are you in a position to ask him?
I Cc'd him in my previous comment.
Named registers came in late and were modeled after AllegroCL's named registers, not after Perl's. I don't think *ALLOW-NAMED-REGISTERS* is unnecessary. Changing its value obviously changes the semantics of some regular expressions.

While CL-PPCRE is based on Perl's syntax, I wouldn't go out of my way to copy each of its features. There are several regex variants out there which are close to but not identical to what Perl does, and people seem to be able to cope with that.

I generally think that being backwards compatible is a good thing. Just because you haven't used a feature doesn't mean nobody else has. In my private code repository I count almost 100 commercial Lisp projects I've done in the last decade (and that doesn't include code with its own repository). Much of this code is still in use, and sometimes there are requirements to extend it or to fix something. I'd rather have code that, say, "uses CL-PPCRE" and not "uses CL-PPCRE, but can only use it up to version x.y.z".

Having said that, the most important feature for me in case of any changes would be clear documentation.
To be clear, I wasn't saying named registers are unnecessary. What I am saying is that it's unnecessary to have to bind a variable to use them. Deprecating the

Rather, the feature that I was talking about changing was this idea that named back-references can refer to multiple registers, whereas numbered registers have only one referent. The whole point of named registers, in my mind, is to help more clearly disambiguate registers when the regex contains so many as might otherwise confuse the person reading the code. That being the case, I don't think named registers should have different semantics from regular registers.

I agree with what you say about not having to be like Perl in every respect. In implementing subpattern references, I discovered that Perl does not have clearly defined semantics for that feature. I'm trying to convince them to adopt the definition I've implemented (they seem rather open to it), but if they reject it, I'll still probably stick with what I already have.
Deprecating *ALLOW-NAMED-REGISTERS* would break backwards compatibility in two ways:
As to your second point: With the current behavior, you can for example use a regex like "(?[1-3])..(?[4-6])\" to match "12341" as well as "12344". I don't think I've ever used that, but my experience from 15 years of open source development tells me that just because you and I can't come up with a use case for this feature doesn't mean nobody else ever came up with one... :)
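The behavior described in that comment can be emulated outside CL-PPCRE. The following is only a rough Python sketch under two assumptions: that the two groups in the regex above share one register name, and that the trailing backslash begins a named back-reference to that name. Python's `re` rejects duplicate group names, so the two registers become numbered groups 1 and 2, and "try every register with that name" becomes an alternation over both. The names `EITHER_REGISTER` and `matches` are made up for illustration.

```python
import re

# Emulation only, not CL-PPCRE code: the two same-named registers are
# written as numbered groups 1 and 2, and the named back-reference is
# modeled as "whatever group 1 captured, or whatever group 2 captured".
EITHER_REGISTER = re.compile(r"([1-3])..([4-6])(?:\1|\2)")


def matches(s):
    """Return True if the whole string fits the emulated pattern."""
    return EITHER_REGISTER.fullmatch(s) is not None
```

Under this emulation both "12341" and "12344" are accepted, while a final digit that repeats neither register, as in "12345", is rejected.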
That said, I don't want to get carried away with the
I have updated my previous comment. I forgot a couple of backslashes for GitHub's markdown. Otherwise, I've said what I wanted to say on this issue. The discussion is getting a bit too ridiculous for my taste.
The issue referred to there is one of the subjects of edicl#17.
CL-PPCRE validates register names too early in the regex compilation phase. The converter should wait until it has seen all register names before asserting that named back-references refer to existing registers.
Compare this to Perl:
I plan on fixing this in my subpattern references branch, which is nearing completion.
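The deferred check proposed in this issue can be illustrated with a small two-pass sketch. This is Python, not CL-PPCRE's actual Lisp converter, and it assumes the `(?<name>...)` and `\k<name>` syntax for named registers and back-references. The helper `undefined_backrefs` and both regex constants are hypothetical, and the scan ignores escaping and character classes, so it only conveys the ordering of the two steps.

```python
import re

# Hypothetical helpers for illustration; a real converter would work on a
# parse tree rather than scanning the pattern text.
NAMED_GROUP = re.compile(r"\(\?<([A-Za-z_]\w*)>")
NAMED_BACKREF = re.compile(r"\\k<([A-Za-z_]\w*)>")


def undefined_backrefs(pattern):
    """Return back-reference names that no register in `pattern` defines."""
    # Pass 1: collect every register name, wherever it appears.
    defined = set(NAMED_GROUP.findall(pattern))
    # Pass 2: only now check each back-reference against the full set,
    # so a reference that precedes its register is not rejected.
    return [name for name in NAMED_BACKREF.findall(pattern)
            if name not in defined]
```

With this ordering, a pattern such as `\k<reg>(?<reg>a)` reports no undefined names even though the reference comes first, while `(?<a>x)\k<b>` still reports `b`.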