Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SYCL] Device code generation for "free functions", a new way to define kernels #13207

Merged
merged 66 commits into from
May 16, 2024

Conversation

rdeodhar
Copy link
Contributor

@rdeodhar rdeodhar commented Mar 29, 2024

This PR is intended for review of the device-side code generation of "free functions".

The free function spec is here: https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/proposed/sycl_ext_oneapi_free_function_kernels.asciidoc

This PR does not implement the full specification. It does however allow free function markup and execution using a manual method.

Later PRs will add the remaining parts of the spec. This is the current status:

  • Free functions are supported at file scope only.
  • The SYCL_EXTERNAL markup is needed for free functions in addition to the SYCL_EXT_ONEAPI_FUNCTION_PROPERTY property.
  • The traits described in the specification section "New traits for kernel functions" are not yet implemented.
  • The functions described in the specification section "New kernel bundle member functions" are not yet implemented. As a result, a free-function kernel can only be launched by finding the kernel_id that matches the function's name, and this requires knowing how the compiler mangles the function's name when creating the kernel. This is a temporary solution until the "New kernel bundle member functions" are implemented.
  • The compiler does not yet diagnose an error if the application violates any of the restrictions listed in the specification under the section "Defining a free function kernel".
  • Device code generation is supported for scalars and USM pointers only. It is not supported for complex kernel argument types requiring decomposition like accessor, local_accessor, or stream.
  • The implementation has not been tested to handle the case when a kernel argument is optimized away. The switch -fno-sycl-dead-args-optimization could be used to disable this optimization, if needed
  • The kernel information descriptor info::kernel::num_args cannot yet be used to query the number of arguments in a free function kernel.

@rdeodhar rdeodhar requested review from a team as code owners March 29, 2024 15:08
@rdeodhar rdeodhar requested a review from maarquitos14 March 29, 2024 15:08
Copy link
Contributor

github-actions bot commented Mar 29, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

Copy link
Contributor

@gmlueck gmlueck left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like great incremental progress!

It would be nice to list the remaining work in the PR description, so that we can track what still needs to be done. For example, I think the following is still TODO:

  • The traits described in the specification section "New traits for kernel functions" are not yet implemented.

  • The functions described in the specification section "New kernel bundle member functions" are not yet implemented. As a result, a free-function kernel can only be launched by finding the kernel_id that matches the function's name, and this requires knowing how the compiler mangles the function's name when creating the kernel. This is a temporary solution until the "New kernel bundle member functions" are implemented.

  • The compiler does not yet diagnose an error if the application violates any of the restrictions listed in the specification under the section "Defining a free function kernel".

I'm not sure about the following. If these are implemented, can we add a test? If not, let's add them to the list of TODO work in the PR:

  • Complex kernel argument types requiring decomposition like accessor, local_accessor, or stream cannot yet be passed to a free function kernel.

  • The implementation does not yet handle the case when a kernel argument is optimized away.

  • The kernel information descriptor info::kernel::num_args cannot yet be used to query the number of arguments in a free function kernel.

Copy link
Contributor

@elizabethandrews elizabethandrews left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just glanced through the code and noticed a bunch of unrelated (possibly unintentional) formatting changes in SemaSYCL. Can you please fix that?

Copy link
Contributor

@elizabethandrews elizabethandrews left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add frontend tests corresponding to your changes as well.

@rdeodhar
Copy link
Contributor Author

rdeodhar commented Apr 1, 2024

Regarding adding frontend tests, those use "fake" headers instead of the real SYCL headers. The recognition of free function kernels is based on function properties. Is there a way to fake the property setting on functions, too? Otherwise how to trigger free function code generation?

@rdeodhar
Copy link
Contributor Author

rdeodhar commented Apr 1, 2024

The SemaSYCL formatting-only changes were not made intentionally. However, the formatting in my version is actually more "correct". Anyway, I'll put it back and see if it is accepted.

Copy link
Contributor

@gmlueck gmlueck left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't looked closely at the code, but my comments have been addressed. Obviously, there is more work to be done on this feature, but this PR is OK with me as a first step.

Copy link
Contributor

@elizabethandrews elizabethandrews left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the PR and applying all requested changes! FE changes LGTM for incremental progress on the free function extension. As discussed follow-up patches will be required for full support.

@rdeodhar rdeodhar temporarily deployed to WindowsCILock May 15, 2024 18:33 — with GitHub Actions Inactive
@rdeodhar rdeodhar temporarily deployed to WindowsCILock May 15, 2024 19:27 — with GitHub Actions Inactive
@rdeodhar rdeodhar requested a review from AaronBallman May 15, 2024 20:42
Copy link
Contributor

@tahonermann tahonermann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Accepting primarily based on the strength of Elizabeth's review and promises made to address outstanding issues in a following PR.

@rdeodhar
Copy link
Contributor Author

@intel/llvm-gatekeepers, at least one member of the teams that own files in this PR have given their approval. Could you merge it in, please?

@againull againull merged commit 0d5addc into intel:sycl May 16, 2024
14 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants