-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
match beginning and end of line correctly #3575
base: master
Are you sure you want to change the base?
Conversation
6a8e479
to
c0ba33e
Compare
This "padded regex" approach feels hacky. Maybe it is reliable (I haven't analyzed it carefully yet), but how about a straightforward approach instead: search not just within the given range ( |
That doesn't work because |
Ok, fair point. |
926e411
to
2e3bc77
Compare
I've force-pushed a new version:
I've also rebased the PR so that one can test it together with the changes to the search functionality done in previous PRs. |
Thanks, LGTM (apart from my above comment about comments, and the fact that it's better to squash such fix-up changes into one commit, but we can squash them later). |
Side note: looks like we can deduplicate the code in |
I've combined Moreover, the code at the beginning of One might think of making EDIT: We might also want the first argument to |
When |
8bd0cac
to
3e16d5b
Compare
LGTM, but I've just noticed that it still doesn't fix the issue in the case of So I guess we also need a similar fix in (Unless we treat this as the intended behavior, but then it's inconsistent with |
I was thinking of modifying |
@JoeKar what do you think about this PR? Maybe any ideas of an alternative approach?
I'm fine either way. Although it seems reasonable to address it in this PR, since it's rather a part of the same issue, and the way to fix it is probably just to adopt the same approach as in |
Indeed it is hacky, but it seems to fit the needs via the help of regular expressions on its own. Otherwise we've to solve it with additional code dropping (parts of) results. |
Well, yes, we could compile the padded regexps in Not even mentioning the fact that we compile the same regex every time when calling |
Hmm... Throughout the entire loop, That should also address your concern about compiling twice, since even when we call |
Several questions and comments about the extension to
|
Sounds good. 👍 |
As a first step, I've implemented @dmaluka's suggestion to remove |
Maybe... Hard to say without seeing an actual implementation. IIUC it is not directly related to the issue addressed by this PR, it would be just an extra improvement?
I wouldn't worry too much about that. We already changed
Yep, and that seems inevitable. And I imagine we wouldn't need to do that per line, we can just call
I'm not sure. A file may be large, the number of matches may be huge, do we really want to store and return them all and then process them, if we can instead just call
Maybe. Again, hard to discuss without an actual implementation. |
LGTM. |
Here is a modification of EDIT: Calls of the form |
e91bc2e
to
7943952
Compare
I've force-pushed a slightly modified version. It works the same, but the code is cleaner now. |
So please squash the last 2 commits into one (to let reviewers review the relevant version, not waste time on the outdated one). |
7943952
to
3822986
Compare
|
||
from := Loc{0, i}.Clamp(start, end) | ||
to := Loc{charCount, i}.Clamp(start, end) | ||
matches := b.findAll(search, from, to) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still, what is the advantage of using findAll()
to remember all matches in some temporary storage to handle them afterwards, instead of just using findDown()
and handling each match right away?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One could do that. I wrote the code with the idea in mind to have some public FindAll
function at a later point. That code would be based on the current findAll
. One could change findAll
to some findAllFunc
that takes a callback function as argument (simialr to ReplaceAllFunc
). This way the problem of creating temporary lists would disappear. (That would also be a good idea for a public function.) Currently findAll
is only called for the first and last lines, so that performance doesn't really matter. That would change of course if we used this function for all lines.
result = search.Expand(result, replace, in, submatches) | ||
l := b.LineBytes(i) | ||
charCount = util.CharacterCount(l) | ||
if i == start.Y || i == end.Y { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't we just do this findDown()
based logic (instead of the old search.ReplaceAllFunc()
based logic) for all lines, not just for the first and last line?
And considering my other comment, I imagine we don't even need to deal with individual lines here, we can simply call findDown()
in a loop for the entire range? Let findDown()
itself care about individual lines.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we can do that. We just have to make sure that we know how the length of the last line changes.
Regarding performance: Currently we create one delta for each "middle" line, even if there is no replacement in that line. If we use the other code, we would have one delta for each replacement. I think that in most cases that would be more efficient. However, if there are many changes in each line, then this would become very inefficient. Also note that without remembering the padded regexps, we would compile the front-padded regexps once for each replacement, so we should probably store it somewhere.
When searching with
buf:FindNext
, the start of the search region currently always matches^
and the end of the region$
, no matter if they are the indeed the start or end of a line. Similar problems exist with other empty-string pattern like\b
. The goal of this PR is to fix this. This is done by modifying the search pattern for the first and last lines (prepending and/or appending.
.Code common to
findDown
andfindUp
has been moved to separate functions. One could tweak the specific way it's done, but maybe it's better to first get some feedback if you are willing to go in this direction at all.