Make alluxio client return block location count configurable #18448

Open
wants to merge 4 commits into
base: master-2.x
Changes from 2 commits
@@ -288,15 +288,27 @@ public List<BlockLocationInfo> getBlockLocations(URIStatus status)
if (locations.isEmpty() && mFsContext.getPathConf(new AlluxioURI(status.getPath()))
.getBoolean(PropertyKey.USER_UFS_BLOCK_LOCATION_ALL_FALLBACK_ENABLED)) {
// Case 2: Fallback to add all workers to locations so some apps (Impala) won't panic.
Contributor

If this case occurs, could you return the default value defined here directly? That would be friendlier to Impala's file handle cache and data cache: Impala uses consistent hashing to schedule data scan fragments, so returning random block locations lowers the cache hit rate.
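
To illustrate the concern, here is a minimal sketch, not Alluxio code. `FallbackOrderSketch`, `stableFallback`, and `shuffledFallback` are hypothetical names, and workers are modeled as plain host strings. It contrasts a stable fallback (sort, then truncate), which returns the same locations on every call, with a shuffled one, whose result can change between calls and so gives a hash-based scheduler nothing stable to key on.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical illustration only; none of these names exist in Alluxio.
final class FallbackOrderSketch {
  /** Stable fallback: sort the hosts, then keep at most {@code limit} of them. */
  static List<String> stableFallback(List<String> hosts, int limit) {
    List<String> sorted = new ArrayList<>(hosts);
    Collections.sort(sorted);
    return new ArrayList<>(sorted.subList(0, Math.min(sorted.size(), limit)));
  }

  /** Shuffled fallback: which hosts are kept depends on the random source. */
  static List<String> shuffledFallback(List<String> hosts, int limit, Random random) {
    List<String> shuffled = new ArrayList<>(hosts);
    Collections.shuffle(shuffled, random);
    return new ArrayList<>(shuffled.subList(0, Math.min(shuffled.size(), limit)));
  }

  public static void main(String[] args) {
    List<String> hosts = Arrays.asList("worker-c", "worker-a", "worker-b", "worker-d");
    // Same answer on every call: [worker-a, worker-b]
    System.out.println(stableFallback(hosts, 2));
    System.out.println(stableFallback(hosts, 2));
    // May differ from call to call.
    System.out.println(shuffledFallback(hosts, 2, new Random()));
    System.out.println(shuffledFallback(hosts, 2, new Random()));
  }
}
```

Sorting is only one way to get a stable order; any deterministic choice would serve the same purpose.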

Contributor Author

Thanks for the review. I don't quite understand the change you are proposing. Is this what you mean?

locations.addAll(getHostWorkerMap().values());
Collections.shuffle(locations);
PropertyKey locKey = PropertyKey.USER_UFS_BLOCK_LOCATION_RETURN_LIMIT;
Contributor

Suggested change:
- PropertyKey locKey = PropertyKey.USER_UFS_BLOCK_LOCATION_RETURN_LIMIT;
+ PropertyKey locKey = PropertyKey.USER_UFS_BLOCK_LOCATION_FALLBACK_RETURN_LIMIT;

Contributor

Let's rename this property key so that the name better reflects what it does in the code.

Contributor Author

done

int count = mFsContext.getClusterConf().getInt(locKey);
Contributor

Use the path conf (`mFsContext.getPathConf(...)`, as in the check above) rather than the cluster conf.
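
As a rough picture of why the choice matters, the sketch below is a toy model of per-path overrides layered on cluster-wide values. It is not how Alluxio resolves path configuration: `PathConfSketch` and its methods are hypothetical, and the prefix match is a deliberately naive `startsWith`. The property name is the one that appears in this revision of the diff.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only; not Alluxio's configuration classes.
final class PathConfSketch {
  private final Map<String, String> clusterConf = new HashMap<>();
  private final Map<String, Map<String, String>> pathConf = new HashMap<>();

  void setCluster(String key, String value) {
    clusterConf.put(key, value);
  }

  void setForPath(String prefix, String key, String value) {
    pathConf.computeIfAbsent(prefix, p -> new HashMap<>()).put(key, value);
  }

  /** Returns the value set on the longest matching path prefix, else the cluster value. */
  String get(String path, String key) {
    String bestPrefix = null;
    for (Map.Entry<String, Map<String, String>> entry : pathConf.entrySet()) {
      String prefix = entry.getKey();
      // Naive match: a real implementation would compare whole path components.
      if (path.startsWith(prefix) && entry.getValue().containsKey(key)
          && (bestPrefix == null || prefix.length() > bestPrefix.length())) {
        bestPrefix = prefix;
      }
    }
    return bestPrefix != null ? pathConf.get(bestPrefix).get(key) : clusterConf.get(key);
  }

  public static void main(String[] args) {
    String key = "alluxio.user.block.location.return.limit";
    PathConfSketch conf = new PathConfSketch();
    conf.setCluster(key, "10");
    conf.setForPath("/warehouse", key, "3");
    System.out.println(conf.get("/warehouse/t1/part-0", key)); // 3 (path override)
    System.out.println(conf.get("/tmp/scratch", key));         // 10 (cluster value)
  }
}
```

Reading only the cluster-wide value would ignore any such per-path override for the file being read.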

Contributor Author

done

if (count < 0) {
jiacheliu3 marked this conversation as resolved.
throw new IllegalArgumentException("Property " + locKey.getName()
+ " should not be set to a negative number");
}
jiacheliu3 marked this conversation as resolved.
List<WorkerNetAddress> addresses = getShuffleWorkerAddressList();
locations.addAll(addresses.subList(0, Math.min(addresses.size(), count)));
Contributor

This line does extra copying; you can just write a simple for loop:

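// Clamp the bound: count defaults to Integer.MAX_VALUE and can exceed the list size.
for (int i = 0; i < Math.min(addresses.size(), count); i++) {
  locations.add(addresses.get(i));
}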
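
The bounded copy discussed here, together with the negative-value check from the diff, can be sketched in isolation as follows. This is an illustration under assumptions, not the PR's code: `LimitedFallbackSketch` and `takeUpTo` are hypothetical names, and worker addresses are plain strings.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration; not part of the PR.
final class LimitedFallbackSketch {
  /** Returns the first {@code limit} addresses, or all of them if there are fewer. */
  static List<String> takeUpTo(List<String> addresses, int limit) {
    if (limit < 0) {
      throw new IllegalArgumentException("limit should not be negative, got " + limit);
    }
    // Clamp so that a limit of Integer.MAX_VALUE simply means "all addresses".
    int n = Math.min(addresses.size(), limit);
    List<String> result = new ArrayList<>(n);
    for (int i = 0; i < n; i++) {
      result.add(addresses.get(i));
    }
    return result;
  }

  public static void main(String[] args) {
    List<String> addresses = Arrays.asList("w1", "w2", "w3");
    System.out.println(takeUpTo(addresses, 2));                 // [w1, w2]
    System.out.println(takeUpTo(addresses, Integer.MAX_VALUE)); // [w1, w2, w3]
  }
}
```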

Contributor Author

done.

}
}
blockLocations.add(new BlockLocationInfo(fileBlockInfo, locations));
}
return blockLocations;
}

private List<WorkerNetAddress> getShuffleWorkerAddressList() throws IOException {
List<BlockWorkerInfo> workers = mFsContext.getCachedWorkers();
Collections.shuffle(workers);
return workers.stream().map(BlockWorkerInfo::getNetAddress).collect(toList());
Contributor

Why not keep using getHostWorkerMap()? The new code may not preserve the existing behavior.
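
One way the two approaches could differ is sketched below. It assumes the map is keyed by host name, which is a guess because the collector arguments are cut off in this diff. Under that assumption, workers that share a host collapse to one map entry, while a plain list keeps every worker. `HostMapSketch`, `Worker`, and `byHost` are hypothetical names, not Alluxio types.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical illustration; the real collector in getHostWorkerMap() is not shown in the diff.
final class HostMapSketch {
  static final class Worker {
    final String host;
    final int port;

    Worker(String host, int port) {
      this.host = host;
      this.port = port;
    }
  }

  /** Keys workers by host; if several workers share a host, the first one is kept. */
  static Map<String, Worker> byHost(List<Worker> workers) {
    return workers.stream()
        .collect(Collectors.toMap(w -> w.host, w -> w, (first, second) -> first));
  }

  public static void main(String[] args) {
    List<Worker> workers = Arrays.asList(
        new Worker("host-a", 29999),
        new Worker("host-a", 30000),
        new Worker("host-b", 29999));
    System.out.println(workers.size());         // 3 workers in the list
    System.out.println(byHost(workers).size()); // 2 entries in the host map
  }
}
```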

Contributor Author

I will remove the method and use getHostWorkerMap().

}

private Map<String, WorkerNetAddress> getHostWorkerMap() throws IOException {
List<BlockWorkerInfo> workers = mFsContext.getCachedWorkers();
return workers.stream().collect(
11 changes: 11 additions & 0 deletions core/common/src/main/java/alluxio/conf/PropertyKey.java
@@ -6843,6 +6843,15 @@ public String toString() {
.setConsistencyCheckLevel(ConsistencyCheckLevel.WARN)
.setScope(Scope.CLIENT)
.build();
public static final PropertyKey USER_UFS_BLOCK_LOCATION_RETURN_LIMIT =
intBuilder(Name.USER_UFS_BLOCK_LOCATION_RETURN_LIMIT)
.setDefaultValue(Integer.MAX_VALUE)
.setDescription("The maximum number of workers returned as block locations when the "
+ "UFS block locations are empty or not co-located with any Alluxio worker. The value "
+ "must be greater than or equal to 0; " + Integer.MAX_VALUE + " returns all workers.")
.setConsistencyCheckLevel(ConsistencyCheckLevel.WARN)
.setScope(Scope.CLIENT)
.build();
public static final PropertyKey USER_UFS_BLOCK_READ_LOCATION_POLICY =
classBuilder(Name.USER_UFS_BLOCK_READ_LOCATION_POLICY)
.setDefaultValue("alluxio.client.block.policy.LocalFirstPolicy")
@@ -9082,6 +9091,8 @@ public static final class Name {
public static final String USER_RPC_RETRY_MAX_SLEEP_MS = "alluxio.user.rpc.retry.max.sleep";
public static final String USER_UFS_BLOCK_LOCATION_ALL_FALLBACK_ENABLED =
"alluxio.user.ufs.block.location.all.fallback.enabled";
public static final String USER_UFS_BLOCK_LOCATION_RETURN_LIMIT =
"alluxio.user.block.location.return.limit";
public static final String USER_UFS_BLOCK_READ_LOCATION_POLICY =
"alluxio.user.ufs.block.read.location.policy";
public static final String USER_UFS_BLOCK_READ_LOCATION_POLICY_DETERMINISTIC_HASH_SHARDS =