-
Notifications
You must be signed in to change notification settings - Fork 2.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Avoid get block info when positioned read for first retry #18417
Avoid get block info when positioned read for first retry #18417
Conversation
@jiacheliu3 @dbw9580 Would you like to take a look at this PR? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks!
BlockInfo blockInfo = lastException == null | ||
? mStatus.getBlockInfo(blockId) : mBlockStore.getInfo(blockId); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dbw9580 do you think it's possible that we can use mStatus.getBlockInfo(blockId)
even if lastException != null
?
@@ -302,11 +302,15 @@ private int positionedReadInternal(long pos, byte[] b, int off, int len) throws | |||
try { | |||
// Positioned read may be called multiple times for the same block. Caching the in-stream | |||
// allows us to avoid the block store rpc to open a new stream for each call. | |||
BlockInfo blockInfo = lastException == null |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a threshold of outdate time
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updateStream check outdate time
This looks like flaky test.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mostly LGTM. Can you add a new unit test to cover this new configuration and its behavior?
Co-authored-by: Bowen Ding <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
good to merge after all checks are green |
the new UT doesn't work?
|
@jiacheliu3 Glad to say the ci checks are all passed, Please talk a look whether this could be merged. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM with one minor comment. Good to merge once added. Thanks a lot for the work!
@@ -6307,6 +6307,11 @@ public String toString() { | |||
.setConsistencyCheckLevel(ConsistencyCheckLevel.WARN) | |||
.setScope(Scope.CLIENT) | |||
.build(); | |||
public static final PropertyKey USER_FILE_IN_STREAM_STATUS_EXPIRATION_TIME = | |||
durationBuilder(Name.USER_FILE_IN_STREAM_STATUS_EXPIRATION_TIME) | |||
.setDefaultValue("5min") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
whenever a property key is added, remember to add description:
setDescription("Specifies how long the file metadata can be cached and reused during the FileInStream. Once the specified expiration time has elapsed, the file metadata will be reloaded from the Alluxio master. The cache reduces the number of metadata requests to the Master. The default is 5 minutes.")
alluxio-bot, merge this please |
What changes are proposed in this pull request?
Reduce talk to master to improve the performance for small file scenario.
I guess this is introduced by ed53d54
Why are the changes needed?
The block info can be get from file status, so we should not ask it from master every time.
Does this PR introduce any user facing changes?
No