Add regx pattern file filter for distributed load #18311
332ad1a to 1b8a977
Mostly LGTM. We need to add this filter information to the journal entry so we can recover from a crash. Also, can we add a test for this? We can try to follow loadcommandTest.
@@ -60,11 +62,23 @@ public Job<?> create() {
    .ofNullable(AuthenticatedClientUser.getOrNull())
    .map(User::getName);

Predicate<UfsStatus> predicate = Predicates.alwaysTrue();
This logic should also appear in JournaledLoadJobFactory so we can recover from the journal entry.
Updated. I put this logic into JournaledLoadJobFactory.
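For readers following the thread, here is a minimal sketch of the idea under discussion: default to an accept-all predicate and narrow it only when a regex is supplied. This is not the PR's actual code. The class name `FileFilterSketch`, the method `fromRegex`, and the choice to match the whole file name as a plain `String` (rather than a `UfsStatus`) are all assumptions made for illustration.

```java
import java.util.Optional;
import java.util.function.Predicate;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Alluxio.
public class FileFilterSketch {

  /**
   * Builds a file-name filter from an optional regex.
   * With no regex, every name is accepted (the "always true" default).
   * Assumption: the pattern must match the entire name.
   */
  public static Predicate<String> fromRegex(Optional<String> regex) {
    if (!regex.isPresent()) {
      return name -> true;
    }
    // Compile once so the pattern is not re-parsed for every file.
    Pattern pattern = Pattern.compile(regex.get());
    return name -> pattern.matcher(name).matches();
  }

  public static void main(String[] args) {
    Predicate<String> filter = fromRegex(Optional.of("^hello.*"));
    System.out.println(filter.test("hello-1.txt"));
    System.out.println(filter.test("world.txt"));
  }
}
```

Keeping the construction in one helper means the factory that creates a fresh job and the factory that restores a job from the journal could share the same filter logic, which is the concern raised in the review.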
 UnderFileSystem ufs = mFs.getUfsManager().getOrAdd(new AlluxioURI(path),
     UnderFileSystemConfiguration.defaults(Configuration.global()));
 Iterable<UfsStatus> iterable = new UfsStatusIterable(ufs, path,
     Optional.ofNullable(AuthenticatedClientUser.getOrNull()).map(User::getName),
-    Predicates.alwaysTrue());
+    predicate);
 return new DoraLoadJob(path, user, UUID.randomUUID().toString(), bandwidth, partialListing,
We also want to put this filter information into DoraLoadJob so we can turn this job into a journal entry.
Updated.
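One way to read the request above: a compiled predicate cannot be written to a journal, but the regex string that produced it can. The sketch below stores the string on the job, writes it into a journal-like entry, and rebuilds the predicate on recovery. Every name here (`JournaledFilterSketch`, `Entry`, `Job`, `toJournalEntry`, `fromJournalEntry`) is hypothetical; the real DoraLoadJob and its journal format are not shown in this thread.

```java
import java.util.Optional;
import java.util.function.Predicate;
import java.util.regex.Pattern;

// Hypothetical stand-ins for a load job and its journal entry.
public class JournaledFilterSketch {

  /** Journal-like record: only plain, serializable fields. */
  public static final class Entry {
    public final String path;
    public final String fileFilterRegx; // null when no filter was set

    public Entry(String path, String fileFilterRegx) {
      this.path = path;
      this.fileFilterRegx = fileFilterRegx;
    }
  }

  /** Job that keeps the regex string next to the predicate derived from it. */
  public static final class Job {
    private final String path;
    private final Optional<String> regx;
    private final Predicate<String> filter;

    public Job(String path, Optional<String> regx) {
      this.path = path;
      this.regx = regx;
      if (regx.isPresent()) {
        Pattern pattern = Pattern.compile(regx.get());
        this.filter = name -> pattern.matcher(name).matches();
      } else {
        this.filter = name -> true;
      }
    }

    public boolean accepts(String name) {
      return filter.test(name);
    }

    /** Writes the regex string, not the predicate, into the entry. */
    public Entry toJournalEntry() {
      return new Entry(path, regx.orElse(null));
    }

    /** Rebuilds the job, and therefore the predicate, after a restart. */
    public static Job fromJournalEntry(Entry entry) {
      return new Job(entry.path, Optional.ofNullable(entry.fileFilterRegx));
    }
  }
}
```

A round trip such as `Job.fromJournalEntry(job.toJournalEntry())` should yield a job that accepts and rejects the same names as the original.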
1409f44 to 598f723
Updated. Tests are added in |
# Conflicts: # dora/core/server/master/src/main/java/alluxio/master/job/JournalLoadJobFactory.java
LGTM, one minor test verification needed
    Thread.sleep(1000);
  }
  assertTrue(mOutput.toString().contains("Inodes Processed: 1"));
}
Verify that only B is loaded, not A and C.
Updated.
4e3efab to ad2ac8b
LGTM
alluxio-bot, merge this please
Add regx pattern file filter for distributed load.

**Example:** The following request loads the files under the `/test-load` directory whose names start with "hello":

`curl -X GET http://localhost:28080/v1/load?path=s3a://jiamingmai-test/test-load&opType=submit&verbose=true&fileFilterRegx=^hello.*`

pr-link: Alluxio#18311
change-id: cid-4ec2bfe58bfba413f6d2925f5b3937bd6f5c2eb1
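The request above contains `&`, `^`, and `*`, which most shells and some HTTP clients treat specially, so a caller may prefer to build the URL programmatically with percent-encoded values. The sketch below does that with the standard library only. The class and method names are hypothetical, the parameter names and endpoint are copied from the example, and it assumes Java 10+ (for `URLEncoder.encode(String, Charset)`) and that the server decodes query parameters in the usual way.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper for assembling the load request URL.
public class LoadRequestUrlSketch {

  /** Joins percent-encoded key=value pairs onto the base URL. */
  public static String buildLoadUrl(String base, Map<String, String> params) {
    String query = params.entrySet().stream()
        .map(e -> encode(e.getKey()) + "=" + encode(e.getValue()))
        .collect(Collectors.joining("&"));
    return base + "?" + query;
  }

  private static String encode(String value) {
    return URLEncoder.encode(value, StandardCharsets.UTF_8);
  }

  public static void main(String[] args) {
    // LinkedHashMap keeps the parameters in insertion order.
    Map<String, String> params = new LinkedHashMap<>();
    params.put("path", "s3a://jiamingmai-test/test-load");
    params.put("opType", "submit");
    params.put("verbose", "true");
    params.put("fileFilterRegx", "^hello.*");
    System.out.println(buildLoadUrl("http://localhost:28080/v1/load", params));
  }
}
```

When calling curl directly instead, wrapping the whole URL in single quotes keeps the shell from interpreting `&` and `*`.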