-
Notifications
You must be signed in to change notification settings - Fork 755
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SYCL] Embed bfloat16 devicelib into executable if necessary #16729
base: sycl
Are you sure you want to change the base?
Conversation
Signed-off-by: jinge90 <[email protected]>
Signed-off-by: jinge90 <[email protected]>
Signed-off-by: jinge90 <[email protected]>
Signed-off-by: jinge90 <[email protected]>
Signed-off-by: jinge90 <[email protected]>
Signed-off-by: jinge90 <[email protected]>
Signed-off-by: jinge90 <[email protected]>
Signed-off-by: jinge90 <[email protected]>
Signed-off-by: jinge90 <[email protected]>
Signed-off-by: jinge90 <[email protected]>
Hi, @asudarsa and @AlexeySachkov |
I'm looking at this in the context of SYCL-RTC. Overall this change LGTM and I think we can adapt our equivalent of However, is it a problem if multiple of these special images are present at runtime? This would be the case if the application compiles multiple kernel bundles at runtime using bf16, or even if the application contains normal kernels using bf16 and also runtime-compiles a single bf16-using bundle. |
Hi, @jopperm
When sycl runtime begins to launch a user kernel, it will resolves all imported symbols from the belonging user's device image, if the the imported symbols include bf16 devicelib functions, sycl runtime will link the user's device image with corresponding devicelib image. Thanks very much. |
Thanks, got it, after post-link, a single |
Hi, @jopperm |
// | ||
// Run clang-linker-wrapper test | ||
// | ||
// RUN: clang-linker-wrapper -sycl-post-link-options="SYCL_POST_LINK_OPTIONS" -llvm-spirv-options="LLVM_SPIRV_OPTIONS" "--host-triple=x86_64-pc-windows-msvc" "--linker-path=/usr/bin/ld" "--" HOST_LINKER_FLAGS "-dynamic-linker" HOST_DYN_LIB "-o" "a.out" HOST_LIB_PATH HOST_STAT_LIB %t5.o -sycl-device-libraries=libsycl-crt.new.o -sycl-device-library-location=%S/Inputs/SYCL/lib --dry-run 2>&1 | FileCheck -check-prefix=CHK-CMDS-DEVICE-LIB-DIR %s |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need this extra file? Can we not just use a dummy file? Please see
// Generate .o file as SYCL device library file. |
If you do not care about what's inside the file, then
touch %t1.devicelib.o
will suffice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi, @asudarsa
We need to pass "-sycl-device-library-location" to clang-linker-wrapper tool and clang-linker-wrapper will look for specified devicelib files in this specified "-sycl-device-library-location", so we put an empty devicelib file in Inputs/SYCL/lib
Thanks very much.
// | ||
// Run clang-linker-wrapper test | ||
// | ||
// RUN: clang-linker-wrapper -sycl-post-link-options="SYCL_POST_LINK_OPTIONS" -llvm-spirv-options="LLVM_SPIRV_OPTIONS" "--host-triple=x86_64-pc-windows-msvc" "--linker-path=/usr/bin/ld" "--" HOST_LINKER_FLAGS "-dynamic-linker" HOST_DYN_LIB "-o" "a.out" HOST_LIB_PATH HOST_STAT_LIB %t5.o -sycl-device-libraries=libsycl-crt.new.o -sycl-device-library-location=%S/Inputs/SYCL/lib --dry-run 2>&1 | FileCheck -check-prefix=CHK-CMDS-DEVICE-LIB-DIR %s |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, a bit confused. Why name this '-crt' when you are interested only in bfloat16 device lib file?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi, @asudarsa
Since the device library list can't be empty, I just randomly put libsycl-crt in the list to avoid compiler warning.
Thanks very much.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some minor nits. Overall looks good. Adding @maksimsab to list of reviewers to get feedback on sycl-post-link related changes.
Thanks
Signed-off-by: jinge90 <[email protected]>
Signed-off-by: jinge90 <[email protected]>
Hi, @jopperm |
Not entirely, your explanation is clear, thank you, but I was asking something else. Let me try to rephrase. Assume you have libary libX containing kernels using bf16. Your application also contains kernels using bf16 and links to libX. At runtime, the library's and the app's |
Hi, @jopperm |
Let's say it's a static lib. The situation I'm interested in is |
Hi, @jopperm |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please, add tests corresponding to sycl-post-link changes.
@@ -3,7 +3,7 @@ | |||
// RUN: %clangxx -### --target=x86_64-pc-windows-msvc -fsycl \ | |||
// RUN: -Xdevice-post-link -O0 %s 2>&1 \ | |||
// RUN: | FileCheck -check-prefix OPTIONS_POSTLINK_JIT_OLD %s |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add --sysroot=%S/Inputs/SYCL
to the command line as there is no guarantee the base lib
will be created with the compiler build (for example with the sanitizer based builds).
@@ -3,7 +3,7 @@ | |||
// RUN: %clangxx --target=x86_64-unknown-linux-gnu -fsycl -### \ | |||
// RUN: --no-offload-new-driver -Xdevice-post-link -O0 %s 2>&1 \ | |||
// RUN: | FileCheck -check-prefix OPTIONS_POSTLINK_JIT_OLD %s |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add --sysroot=%S/Inputs/SYCL
here as well
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall LGTM, but I agree with @maksimsab's point about adding tests. Some unit-tests for SYCL RT should also be added, you can look at https://github.com/intel/llvm/blob/sycl/sycl/unittests/program_manager/DynamicLinking/DynamicLinking.cpp tests for reference
// DeviceLib module doesn't include any entry point,it can be constructed | ||
// using ctor without any entry point related parameter. | ||
for (auto Fn : SYCLDeviceLibs) { | ||
if (Name == std::string(Fn)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if (Name == std::string(Fn)) | |
if (Name == StringRef(Fn)) |
There is no need to allocate heap memory and make a string copy for a sake of a simple comparison
// using ctor without any entry point related parameter. | ||
for (auto Fn : SYCLDeviceLibs) { | ||
if (Name == std::string(Fn)) | ||
IsSYCLDeviceLib = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should add break;
, there is no need to scan other function names once we've found what we were looking for
@@ -1798,14 +1837,28 @@ ProgramManager::kernelImplicitLocalArgPos(const std::string &KernelName) const { | |||
return {}; | |||
} | |||
|
|||
static bool skipEmptyImage(sycl_device_binary RawImg) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
static bool skipEmptyImage(sycl_device_binary RawImg) { | |
static bool shouldSkipEmptyImage(sycl_device_binary RawImg) { |
Currently, sycl bfloat16 conversion functions are implemented in 2 devicelib spvs(fallback version and native version).
The native version targets for any platform which supports "cl_intel_bfloat16_conversions" extension and fallback version is used for all other platforms.
SYCL runtime will select the bfloat16 spvs during execution time by checking bfloat16 extension.
The design requires us to ship 2 spv files together with sycl runtime which some users may dislike.
The PR uses sycl dynamic library mechanism to re-implement this behavior. These 2 bfloat16 lib files are regarded as dynamic library and embedded to final executable, so we don't need to ship any bfloat16 spv libs.
The PR consists following changes:
Fallback and native version of bfloat16 devicelib files have exactly same exported functions, we add a new metadata("SYCL_DEVICELIB_BF16_TYPE") to indicate the version in them. SYCL runtime will check cl_intel_bfloat16_conversions extension and this metadata to decide which version will be linked.