Skip to content

Commit

Permalink
[SYCL][Graph] Enable host-task update in graphs
Browse files Browse the repository at this point in the history
- Update spec wording to allow updating host-task function in graphs
- Support host-tasks in dynamic command-groups
- Support host-tasks in whole graph update
- Add E2E tests for both scenarios
- Fix passing incorrect accessors to graph update command after update
  • Loading branch information
Bensuo committed Jan 31, 2025
1 parent 5823125 commit fd83dfe
Show file tree
Hide file tree
Showing 12 changed files with 892 additions and 104 deletions.
92 changes: 65 additions & 27 deletions sycl/doc/extensions/experimental/sycl_ext_oneapi_graph.asciidoc
Original file line number Diff line number Diff line change
Expand Up @@ -551,7 +551,7 @@ Parameters:

|===

==== Dynamic Command Groups
==== Dynamic Command Groups [[dynamic-command-groups]]

[source,c++]
----
Expand All @@ -570,12 +570,13 @@ public:
Dynamic command-groups can be added as nodes to a graph. They provide a
mechanism that allows updating the command-group function of a node after the
graph is finalized. There is always one command-group function in the dynamic
command-group that is set as active, this is the kernel which will execute for
the node when the graph is finalized into an executable state `command_graph`,
and all the other command-group functions in `cgfList` will be ignored. The
executable `command_graph` node can then be updated to a different kernel in
`cgfList`, by selecting a new active index on the dynamic command-group object
and calling the `update(node& node)` method on the executable `command_graph`.
command-group that is set as active, this is the command-group which will
execute for the node when the graph is finalized into an executable state
`command_graph`, and all the other command-group functions in `cgfList` will be
ignored. The executable `command_graph` node can then be updated to a different
kernel in `cgfList`, by selecting a new active index on the dynamic
command-group object and calling the `update(node& node)` method on the
executable `command_graph`.

The `dynamic_command_group` class provides the {crs}[common reference semantics].

Expand All @@ -584,9 +585,13 @@ about updating command-groups.

===== Limitations

Dynamic command-groups can only contain kernel operations. Trying to construct
a dynamic command-group with functions that contain other operations will
result in an error.
Dynamic command-groups can only contain the following operations:

* Kernel operations
* <<host-tasks, Host-tasks>>

Trying to construct a dynamic command-group with functions that contain other
operations will result in an error.

All the command-group functions in a dynamic command-group must have identical dependencies.
It is not allowed for a dynamic command-group to have command-group functions that would
Expand Down Expand Up @@ -625,10 +630,13 @@ Exceptions:
property for more information.

* Throws with error code `invalid` if the `dynamic_command_group` is created with
command-group functions that are not kernel executions.
command-group functions that are not kernel executions or host-tasks.

* Throws with error code `invalid` if `cgfList` is empty.

* Throws with error code `invalid` if the types of all command-groups in
`cgfList` do not match.

|
[source,c++]
----
Expand Down Expand Up @@ -829,32 +837,54 @@ possible.

===== Supported Features

The only types of nodes that are currently able to be updated in a graph are
kernel execution nodes.
The only types of nodes that are currently able to be updated in a graph are:

There are two different API's that can be used to update a graph:
* Kernel executions
* <<host-tasks, Host-tasks>>

There are two different APIs that can be used to update a graph:

* <<individual-node-update, Individual Node Update>> which allows updating
individual nodes of a command-graph.
* <<whole-graph-update, Whole Graph Update>> which allows updating the
entirety of the graph simultaneously by using another graph as a
reference.

The aspects of a kernel execution node that can be changed during update are
different depending on the API used to perform the update:
The following table illustrates the aspects of each supported node type that can be changed
depending on the API used to perform the update.

Table {counter: tableNumber}. Graph update capabilites for supported node types.
[cols="1,2a,2a"]
|===
|Node Type|<<individual-node-update, Individual Node Update>>|<<whole-graph-update, Whole Graph Update>>

|`node_type::kernel`
|

* Kernel function
* Kernel Parameters
* ND-range

|
* Kernel Parameters
* ND-range

* For the <<individual-node-update, Individual Node Update>> API it's possible to update
the kernel function, the parameters to the kernel, and the ND-range.
* For the <<whole-graph-update, Whole Graph Update>> API, only the parameters of the kernel
and the ND-range can be updated.
|`node_type::host_task`
|
* Host-task function
|
* Host-task function

|===

===== Individual Node Update [[individual-node-update]]

Individual nodes of an executable graph can be updated directly. Depending on the attribute
of the node that requires updating, different API's should be used:
Individual nodes of an executable graph can be updated directly. Depending on the attribute or `node_type` of the node that requires updating, different API's should be used:

====== Parameter Updates

_Supported Node Types: Kernel_

Parameters to individual nodes in a graph in the `executable` state can be
updated between graph executions using dynamic parameters. A `dynamic_parameter`
object is created with a modifiable state graph and an initial value for the
Expand Down Expand Up @@ -884,6 +914,8 @@ will maintain the graphs data dependencies.

====== Execution Range Updates

_Supported Node Types: Kernel_

Another configuration that can be updated is the execution range of the
kernel, this can be set through `node::update_nd_range()` or
`node::update_range()` but does not require any prior registration.
Expand All @@ -897,10 +929,13 @@ code may be defined as operating in a different dimension.

====== Command Group Updates

The command-groups of a kernel node can be updated using dynamic command-groups.
Dynamic command-groups allow replacing the command-group function of a kernel
node with a different one. This effectively allows updating the kernel function
and/or the kernel execution range.
_Supported Node Types: Kernel, Host-task_

The command-groups of a kernel node can be updated using
<<dynamic-command-groups, Dynamic Command-Groups>>. Dynamic command-groups allow
replacing the command-group function of a kernel node with a different one. This
effectively allows updating the kernel function and/or the kernel execution
range.

Command-group updates are performed by creating an instance of the
`dynamic_command_group` class. A dynamic command-group is created with a modifiable
Expand Down Expand Up @@ -1972,7 +2007,7 @@ Any code like this should be moved to a separate host-task and added to the
graph via the recording or explicit APIs in order to be compatible with this
extension.

=== Host Tasks
=== Host Tasks [[host-tasks]]

:host-task: https://registry.khronos.org/SYCL/specs/sycl-2020/html/sycl-2020.html#subsec:interfaces.hosttasks

Expand All @@ -1992,6 +2027,9 @@ auto node = graph.add([&](sycl::handler& cgh){
});
----

Host-tasks can be updated using <<executable-graph-update, Executable Graph Update>>.


=== Queue Behavior In Recording Mode

When a queue is placed in recording mode via a call to `command_graph::begin_recording`,
Expand Down
126 changes: 84 additions & 42 deletions sycl/source/detail/graph_impl.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -580,9 +580,9 @@ graph_impl::add(std::shared_ptr<dynamic_command_group_impl> &DynCGImpl,
std::vector<std::shared_ptr<detail::node_impl>> &Deps) {
// Set of Dependent nodes based on CG event and accessor dependencies.
std::set<std::shared_ptr<node_impl>> DynCGDeps =
getCGEdges(DynCGImpl->MKernels[0]);
getCGEdges(DynCGImpl->MCommandGroups[0]);
for (unsigned i = 1; i < DynCGImpl->getNumCGs(); i++) {
auto &CG = DynCGImpl->MKernels[i];
auto &CG = DynCGImpl->MCommandGroups[i];
auto CGEdges = getCGEdges(CG);
if (CGEdges != DynCGDeps) {
throw sycl::exception(make_error_code(sycl::errc::invalid),
Expand All @@ -592,14 +592,16 @@ graph_impl::add(std::shared_ptr<dynamic_command_group_impl> &DynCGImpl,
}

// Track and mark the memory objects being used by the graph.
for (auto &CG : DynCGImpl->MKernels) {
for (auto &CG : DynCGImpl->MCommandGroups) {
markCGMemObjs(CG);
}

// Get active dynamic command-group CG and use to create a node object
const auto &ActiveKernel = DynCGImpl->getActiveKernel();
const auto &ActiveKernel = DynCGImpl->getActiveCG();
node_type NodeType =
ext::oneapi::experimental::detail::getNodeTypeFromCG(DynCGImpl->MCGType);
std::shared_ptr<detail::node_impl> NodeImpl =
add(node_type::kernel, ActiveKernel, Deps);
add(NodeType, ActiveKernel, Deps);

// Add an event associated with this explicit node for mixed usage
addEventForNode(std::make_shared<sycl::detail::event_impl>(), NodeImpl);
Expand Down Expand Up @@ -1398,11 +1400,11 @@ void exec_graph_impl::update(
"Node passed to update() is not part of the graph.");
}

if (!(Node->isEmpty() || Node->MCGType == sycl::detail::CGType::Kernel ||
Node->MCGType == sycl::detail::CGType::Barrier)) {
throw sycl::exception(errc::invalid,
"Unsupported node type for update. Only kernel, "
"barrier and empty nodes are supported.");
if (!Node->isUpdatable()) {
throw sycl::exception(
errc::invalid,
"Unsupported node type for update. Only kernel, host_task, "
"barrier and empty nodes are supported.");
}

if (const auto &CG = Node->MCommandGroup;
Expand Down Expand Up @@ -1443,23 +1445,46 @@ void exec_graph_impl::update(
}
}

// Rebuild cached requirements for this graph with updated nodes
// Rebuild cached requirements and accessor storage for this graph with
// updated nodes
MRequirements.clear();
MAccessors.clear();
for (auto &Node : MNodeStorage) {
if (!Node->MCommandGroup)
continue;
MRequirements.insert(MRequirements.end(),
Node->MCommandGroup->getRequirements().begin(),
Node->MCommandGroup->getRequirements().end());
MAccessors.insert(MAccessors.end(),
Node->MCommandGroup->getAccStorage().begin(),
Node->MCommandGroup->getAccStorage().end());
}
}

void exec_graph_impl::updateImpl(std::shared_ptr<node_impl> Node) {
// Kernel node update is the only command type supported in UR for update.
// Updating any other types of nodes, e.g. empty & barrier nodes is a no-op.
if (Node->MCGType != sycl::detail::CGType::Kernel) {
// Updating empty or barrier nodes is a no-op
if (Node->isEmpty() || Node->MNodeType == node_type::ext_oneapi_barrier) {
return;
}

// Query the ID cache to find the equivalent exec node for the node passed to
// this function.
// TODO: Handle subgraphs or any other cases where multiple nodes may be
// associated with a single key, once those node types are supported for
// update.
auto ExecNode = MIDCache.find(Node->MID);
assert(ExecNode != MIDCache.end() && "Node ID was not found in ID cache");

// Update ExecNode with new values from Node, in case we ever need to
// rebuild the command buffers
ExecNode->second->updateFromOtherNode(Node);

// Host task update only requires updating the node itself, so can return
// early
if (Node->MNodeType == node_type::host_task) {
return;
}

auto ContextImpl = sycl::detail::getSyclObjImpl(MContext);
const sycl::detail::AdapterPtr &Adapter = ContextImpl->getAdapter();
auto DeviceImpl = sycl::detail::getSyclObjImpl(MGraphImpl->getDevice());
Expand Down Expand Up @@ -1612,18 +1637,6 @@ void exec_graph_impl::updateImpl(std::shared_ptr<node_impl> Node) {
UpdateDesc.pNewLocalWorkSize = LocalSize;
UpdateDesc.newWorkDim = NDRDesc.Dims;

// Query the ID cache to find the equivalent exec node for the node passed to
// this function.
// TODO: Handle subgraphs or any other cases where multiple nodes may be
// associated with a single key, once those node types are supported for
// update.
auto ExecNode = MIDCache.find(Node->MID);
assert(ExecNode != MIDCache.end() && "Node ID was not found in ID cache");

// Update ExecNode with new values from Node, in case we ever need to
// rebuild the command buffers
ExecNode->second->updateFromOtherNode(Node);

ur_exp_command_buffer_command_handle_t Command =
MCommandMap[ExecNode->second];
ur_result_t Res = Adapter->call_nocheck<
Expand Down Expand Up @@ -1927,7 +1940,7 @@ void dynamic_parameter_impl::updateValue(const void *NewValue, size_t Size) {
for (auto &DynCGInfo : MDynCGs) {
auto DynCG = DynCGInfo.DynCG.lock();
if (DynCG) {
auto &CG = DynCG->MKernels[DynCGInfo.CGIndex];
auto &CG = DynCG->MCommandGroups[DynCGInfo.CGIndex];
dynamic_parameter_impl::updateCGArgValue(CG, DynCGInfo.ArgIndex, NewValue,
Size);
}
Expand All @@ -1950,7 +1963,7 @@ void dynamic_parameter_impl::updateAccessor(
for (auto &DynCGInfo : MDynCGs) {
auto DynCG = DynCGInfo.DynCG.lock();
if (DynCG) {
auto &CG = DynCG->MKernels[DynCGInfo.CGIndex];
auto &CG = DynCG->MCommandGroups[DynCGInfo.CGIndex];
dynamic_parameter_impl::updateCGAccessor(CG, DynCGInfo.ArgIndex, Acc);
}
}
Expand Down Expand Up @@ -2037,38 +2050,67 @@ void dynamic_command_group_impl::finalizeCGFList(
sycl::handler Handler{MGraph};
CGF(Handler);

if (Handler.getType() != sycl::detail::CGType::Kernel) {
if (Handler.getType() != sycl::detail::CGType::Kernel &&
Handler.getType() != sycl::detail::CGType::CodeplayHostTask) {
throw sycl::exception(
make_error_code(errc::invalid),
"The only type of command-groups that can be used in "
"dynamic command-groups is kernels.");
"The only types of command-groups that can be used in "
"dynamic command-groups are kernels and host-tasks.");
}

// We need to store the first CG's type so we can check they are all the
// same
if (CGFIndex == 0) {
MCGType = Handler.getType();
} else if (MCGType != Handler.getType()) {
throw sycl::exception(make_error_code(errc::invalid),
"Command-groups in a dynamic command-group must "
"all be the same type.");
}

Handler.finalize();

// Take unique_ptr<detail::CG> object from handler and convert to
// shared_ptr<detail::CGExecKernel> to store
// shared_ptr<detail::CG> to store
sycl::detail::CG *RawCGPtr = Handler.impl->MGraphNodeCG.release();
auto RawCGExecPtr = static_cast<sycl::detail::CGExecKernel *>(RawCGPtr);
MKernels.push_back(
std::shared_ptr<sycl::detail::CGExecKernel>(RawCGExecPtr));
MCommandGroups.push_back(std::shared_ptr<sycl::detail::CG>(RawCGPtr));

// Track dynamic_parameter usage in command-list
// Track dynamic_parameter usage in command-group
auto &DynamicParams = Handler.impl->MDynamicParameters;

if (DynamicParams.size() > 0 &&
Handler.getType() == sycl::detail::CGType::CodeplayHostTask) {
throw sycl::exception(make_error_code(errc::invalid),
"Cannot use dynamic parameters in a host_task");
}
for (auto &[DynamicParam, ArgIndex] : DynamicParams) {
DynamicParam->registerDynCG(shared_from_this(), CGFIndex, ArgIndex);
}
}

// For each CGExecKernel store the list of alternative kernels, not
// Host tasks don't need to store alternative kernels
if (MCGType == sycl::detail::CGType::CodeplayHostTask) {
return;
}

// For each Kernel CG store the list of alternative kernels, not
// including itself.
using CGExecKernelSP = std::shared_ptr<sycl::detail::CGExecKernel>;
using CGExecKernelWP = std::weak_ptr<sycl::detail::CGExecKernel>;
for (auto KernelCG : MKernels) {
for (std::shared_ptr<sycl::detail::CG> CommandGroup : MCommandGroups) {
CGExecKernelSP KernelCG =
std::dynamic_pointer_cast<sycl::detail::CGExecKernel>(CommandGroup);
std::vector<CGExecKernelWP> Alternatives;
std::copy_if(
MKernels.begin(), MKernels.end(), std::back_inserter(Alternatives),
[&KernelCG](const CGExecKernelSP &K) { return K != KernelCG; });

// Add all other command groups except for the current one to the list of
// alternatives
for (auto &OtherCG : MCommandGroups) {
CGExecKernelSP OtherKernelCG =
std::dynamic_pointer_cast<sycl::detail::CGExecKernel>(OtherCG);
if (KernelCG != OtherKernelCG) {
Alternatives.push_back(OtherKernelCG);
}
}

KernelCG->MAlternativeKernels = std::move(Alternatives);
}
Expand All @@ -2084,7 +2126,7 @@ void dynamic_command_group_impl::setActiveIndex(size_t Index) {
// Update nodes using the dynamic command-group to use the new active CG
for (auto &Node : MNodes) {
if (auto NodeSP = Node.lock()) {
NodeSP->MCommandGroup = getActiveKernel();
NodeSP->MCommandGroup = getActiveCG();
}
}
}
Expand Down
Loading

0 comments on commit fd83dfe

Please sign in to comment.