-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Optimize outputs accumulating for SegmentTermsEnum and IntersectTermsEnum #12699
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Net/net I love this idea -- it's basically the same idea as StringBuilder
(append small strings to list and concatenate only in the end), better than the O(N^2) "just keep concatenating a single bigger and bigger string" that FST does today.
I love that this change erases a longstanding TODO
also!
I wonder if we could make this appendable output more widely accessible (in oal.fst.util
) for other use cases that need Outputs<BytesRef>
, but let's not do that here.
I left some small comments ... I think this is close. It would be nice to recover the PKLookup
perf hit. Thanks @gf2121!
lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/IntersectTermsEnumFrame.java
Show resolved
Hide resolved
lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/SegmentTermsEnum.java
Outdated
Show resolved
Hide resolved
lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/SegmentTermsEnum.java
Show resolved
Hide resolved
lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/SegmentTermsEnum.java
Outdated
Show resolved
Hide resolved
lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/SegmentTermsEnum.java
Show resolved
Hide resolved
lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/SegmentTermsEnum.java
Outdated
Show resolved
Hide resolved
lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/FieldReader.java
Outdated
Show resolved
Hide resolved
I'll try to review the latest PR soon -- thanks @gf2121. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @gf2121 -- I left mostly smallish comments, but some head scratching -- hard to remember how these two crazy iterators work.
} | ||
|
||
void pop() { | ||
this.num--; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe assert num > 0
above this line?
Also you don't need the this.
prefix when accessing the members in these methods.
|
||
void load(SegmentTermsEnum.OutputAccumulator outputAccumulator) throws IOException { | ||
outputAccumulator.prepareRead(); | ||
final long code = ite.fr.readVLongOutput(outputAccumulator); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove this silly final
?
load(code); | ||
} | ||
|
||
void load(Long boxCode) throws IOException { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm rename to blockCode
? Not sure why it's a box :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I called it box because we are using a big Long
... blockCode is better :)
} | ||
|
||
void load(Long boxCode) throws IOException { | ||
if (boxCode != null) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe add comment that this means this block is the first one in a possible sequence of floor blocks corresponding to a single seek point from the FST terms index?
@@ -373,9 +380,6 @@ public boolean seekExact(BytesRef target) throws IOException { | |||
|
|||
int cmp = 0; | |||
|
|||
// TODO: reverse vLong byte order for better FST |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yay, stale TODO
garbage collection!
int index; | ||
|
||
void push(BytesRef output) { | ||
if (output != Lucene90BlockTreeTermsReader.NO_OUTPUT) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be better to just check if (output.length > 0) {
? It'd handle cases (not sure this happens) where caller is passing an empty BytesRef
that is not NO_OUTPUT
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This object comparing is what we did before. I believe it is right because we have strict contract for Outputs
operations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While we have strict contracts, I can see that BytesSequenceOutputs#add(BytesRef, BytesRef)
has assertions that the length is > 0.
Here in OutputAccumulator
readByte()
makes a big assumption that a BytesRef
has at least length 1. If it had a length of 0, we would read past the ref end and read bytes sitting in a byte[]
that we shouldn't.
IMO OutputAccumulator
needs to be way more cautious than BytesSequenceOutputs#add(BytesRef, BytesRef)
because OutputAccumulator
isn't making copies and is relying on the underlying byte arrays not changing.
} | ||
|
||
void setFloorData(ByteArrayDataInput floorData) { | ||
assert outputIndex == num - 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we also assert index == 0
in this assert? I.e. we are at the start of the last BytesRef
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the index can be non-zero here. The last arc could contains suffix of the MSBVLong output.
this.current = outputs[0]; | ||
} | ||
|
||
void setFloorData(ByteArrayDataInput floorData) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add a comment that, while this reads/decodes bytes into floorData
, it does not alter the read position?
floorDataReader.reset(floorData, 0, numBytes); | ||
public void setFloorData(SegmentTermsEnum.OutputAccumulator outputAccumulator) { | ||
outputAccumulator.setFloorData(floorDataReader); | ||
rewindPos = floorDataReader.getPosition(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we move this above line 107? It smells a bit putting this after -- it requires knowing setFloorData
doesn't change the read position.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm i'm a bit confused. I think setFloorData
changes the position of floorDataReader
because it calls
floorDataReader#reset
, so we can not move this line above?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Woops you're right! Sorry :)
@@ -247,7 +243,7 @@ void rewind() { | |||
nextEnt = -1; | |||
hasTerms = hasTermsOrig; | |||
if (isFloor) { | |||
floorDataReader.rewind(); | |||
floorDataReader.setPosition(rewindPos); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if this is still needed? setFloorData
didn't change the position? Or might other operations before the reset()
have altered the read position? Hard to think about.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we need this. The floorDataReader is a ByteArrayDataInput
and its position will get changed when we read from it.
Thanks for review @mikemccand !
Agree that this is head scratching... I make a chart to try to explain this better. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great @gf2121 -- you want to merge and backport to 9.x?
Thanks for review and great suggestions @mikemccand !
Yes. I'll merge and backport this this. |
Hi @mikemccand @jpountz ! The newest nightly benchmark shows we get back some speed for PKLookUp, but still not as good as the performance before the merge of #12631. This result is not quite same as what i saw locally, but reasonable for me as we do more Since we are preparing for the release of 9.9.0. I'm not sure if it is an acceptable trade-off now. Should we backport this or revert #12631? I'd appreciate your suggestions :) |
…ectTermsEnum (apache#12699)" This reverts commit 9574cbd.
Thanks @mikemccand for feedback! I'll backport this to 9.9.0 then. |
+1 to not reverting #12631. The fact that our terms index used to not share prefixes properly was a bug, I'm glad it's fixed even though it may make performance slightly slower because more nodes have non-empty outputs to contribute. FWIW it looks like your change did not only help PKLookup, but possibly also |
…ectTermsEnum (apache#12699)" This reverts commit d92efa3.
…ectTermsEnum (apache#12699)" This reverts commit d92efa3.
This commit updates Lucene to a patched version of 9.9.0 without apache/lucene#12699. See apache/lucene#12895.
This commit updates Lucene to a patched version of 9.9.0 without apache/lucene#12699. See apache/lucene#12895.
…ectTermsEnum (apache#12699)" This reverts commit 9574cbd.
Description
Previous talk is too long to track so i opened a new PR to make a summery here. More details are available in #12661.
After merging of #12631, we found a regression for term-dictionary-related tasks like PKLookup, Fuzzy ... After some digging I suspect that the regression is caused by more
Outputs#add
andOutputs#read
on reading side, as the 'MSB VLong output format' making FST sharing more output prefix. The difference of calling times ofOutputs#add
andOutputs#read
is shown below:Solution
This patch tries to do two optimizations to get back the speed:
fp
encoded before floor data.Benchmark
Comparing to after merging of #12631
BaseLine ( After merging of #12631 )
Candidate
Comparing to before merging of #12631
BaseLine ( Before merging of #12631 )
Candidate
(Not sure why seeing tiny speed up for Orxxx tasks. I suspect it is a noise)