This is a test plan for the APIs described in SYCL 2020 section 4.15.3. Atomic references
All of the interface tests described are performed using each of the
following types as the ReferencedType
.
In Regular mode:
-
int
-
float
In full conformance mode:
-
int
-
unsigned int
-
long
-
unsigned long
-
float
In addition, if the device has aspect::atomic64
, the following types are tested:
-
long long
-
unsigned long long
-
double
Additionaly run same tests for ReferencedType*
.
Note: using T as ReferencedType
or ReferencedType*
.
Note: all tests run with different values of DefaultOrder
and DefaultScope
template parameters, for unsupported
values tests are skiped.
To check if the special values of DefaultOrder
parameters are supported should use
info::device::atomic_memory_order_capabilities
, memory_order::relaxed
should be supported on all devices.
To check if the special values for DefaultScope
parameter are supported should use
info::device::atomic_memory_scope_capabilities
.
All non static tests run on following cases:
-
single_task
with referenced value is accessor element - should be skipped foraddress_space::local_space
-
parallel_for
with nd_range parameter and referenced value local_accessor element - should be skipped foraddress_space::global_space
For atomic_ref
with the following DefaultOrder
template parameter values:
-
memory_order::relaxed
-
memory_order::acq_rel
-
memory_order::seq_cst
And the following values as DefaultScope
template parameter:
-
memory_scope::work_item
-
memory_scope::sub_group
-
memory_scope::work_group
-
memory_scope::device
-
memory_scope::system
And the following values as Space
template parameter:
-
access::address_space::global_space
-
access::address_space::local_space
-
access::address_space::generic_space
Check the following:
-
default_read_order
,default_write_order
anddefault_read_modify_write_order
members are const -
required_alignment
memeber is const and is equal to or grater thanalignof(ReferencedType)
-
is_always_lock_free
member is const bool -
default_scope
member is const and is equal toDefaultScope
template parameter was used when object was created
If T
is ReferencedType
check that value_type
is same as ReferencedType
.
If T
is ReferencedType*
check that value_type
is same as ReferencedType*
.
If T
is Integral
or Floating
check that atomic_ref::difference_type
is same
as atomic_ref::value_type
.
If T
is ReferencedType*
check that atomic_ref::difference_type
is same
as ptrdiff_t
.
For different cases of DefaultOrder
template parameter check the following:
-
memory_order::relaxed
-
default_read_order
ismemory_order::relaxed
-
default_write_order
ismemory_order::relaxed
-
default_read_modify_write_order
ismemory_order::relaxed
-
-
memory_order::acq_rel
-
default_read_order
ismemory_order::acquire
-
default_write_order
ismemory_order::release
-
default_read_modify_write_order
ismemory_order::acq_rel
-
-
memory_order::seq_cst
-
default_read_order
ismemory_order::seq_cst
-
default_write_order
ismemory_order::seq_cst
-
default_read_modify_write_order
ismemory_order::seq_cst
-
Create an atomic_ref
object using atomic_ref(T&)
and another object using copy constructor in device code using
the following values as DefaultOrder
template parameter:
-
memory_order::relaxed
-
memory_order::acq_rel
-
memory_order::seq_cst
The following values as DefaultScope
template parameter:
-
memory_scope::work_item
-
memory_scope::sub_group
-
memory_scope::work_group
-
memory_scope::device
-
memory_scope::system
And the following values as Space
template parameter:
-
access::address_space::global_space
-
access::address_space::local_space
-
access::address_space::generic_space
After calling constructors check for:
-
atomic_ref(T&)
-
If the stored value match with the value passed as
T&
usingT load( memory_order order = default_read_order memory_scope scope = default_scope) const
. Check returned value type ofload
member function.
-
-
atomic_ref(const atomic_ref&) noexcept
-
If the members of the newly created object match that of
const atomic_ref&
and the referenced value is valid usingT load( memory_order order = default_read_order memory_scope scope = default_scope) const
.
-
To test member functions create an atomic_ref
object using atomic_ref(T&)
in device code using the following values
as DefaultOrder
template parameter:
-
memory_order::relaxed
-
memory_order::acq_rel
-
memory_order::seq_cst
And the following values as DefaultScope
template parameter:
-
memory_scope::work_item
-
memory_scope::sub_group
-
memory_scope::work_group
-
memory_scope::device
-
memory_scope::system
And the following values as Space
template parameter:
-
access::address_space::global_space
-
access::address_space::local_space
-
access::address_space::generic_space
Note: for the member functions tests below that take memory_order
and memory_scope
as arguments, the following
values of these parameters are used only if the FULL CONFORMANCE
mode is enabled:
-
For
memory_order
:-
memory_order::relaxed
-
memory_order::acquire
-
memory_order::release
-
memory_order::acq_rel
-
memory_order::seq_cst
-
-
For
memory_scope
:-
memory_scope::work_item
-
memory_scope::sub_group
-
memory_scope::work_group
-
memory_scope::device
-
memory_scope::system
-
It should be noted that memory_order::acquire
is used only for read operations, memory_order::release
for write
operations and memory_order::acq_rel
for read-modify-write operations.
If the FULL CONFORMANCE
mode is disabled only default values for functions arguments is used which are determined
by atomic_ref
template parameters that which object was created with.
bool is_lock_free()
Check if the function exists and returns true if atomic_ref::is_always_lock_free
is equal to true.
Check returned value type.
void store(T operand,
memory_order order = default_write_order,
memory_scope scope = default_scope)
For the following values as order
parameter:
-
memory_order::relaxed
-
memory_order::release
-
memory_order::seq_cst
For the following values as scope
parameter:
-
memory_scope::work_item
-
memory_scope::sub_group
-
memory_scope::work_group
-
memory_scope::device
-
memory_scope::system
Check if the function stores operand
to the object referenced by this atomic_ref
.
T operator=(T desired) const noexcept
Same as store(desired)
: Check if the function stores desired
to the object referenced by this atomic_ref
.
Check if returned value is desired
and check returned value type.
operator T() const
Check if the function loads the value of the object referenced by this atomic_ref
.
T exchange(T operand,
memory_order order = default_read_modify_write_order,
memory_scope scope = default_scope) const
For the following values as order
parameter:
-
memory_order::relaxed
-
memory_order::acq_rel
-
memory_order::seq_cst
For the following values as scope
parameter:
-
memory_scope::work_item
-
memory_scope::sub_group
-
memory_scope::work_group
-
memory_scope::device
-
memory_scope::system
Check if replaces the value of the object referenced by this atomic_ref
with value operand
and returns the original
value of the referenced object. Check returned value type.
bool compare_exchange_weak(T &expected, T desired,
memory_order success,
memory_order failure,
memory_scope scope = default_scope) const
For the following values as success
parameter:
-
memory_order::relaxed
-
memory_order::acq_rel
-
memory_order::seq_cst
For the following values as failure
parameter:
-
memory_order::relaxed
-
memory_order::acquire
-
memory_order::seq_cst
For the following values as scope
parameter:
-
memory_scope::work_item
-
memory_scope::sub_group
-
memory_scope::work_group
-
memory_scope::device
-
memory_scope::system
For equal values: it attempts to replaces the value of the referenced object with the value of desired
. This may not
be checked since it is non-deterministic.
For unequal values: check if it assigns the original value of the referenced object to expected
.
Also check if it returns true
when the comparison operation and replacement operation were successful.
Check returned value type.
bool compare_exchange_weak(T &expected, T desired,
memory_order order = default_read_modify_write_order,
memory_scope scope = default_scope) const
Same as above: Equivalent to compare_exchange_weak(expected, desired, order, order, scope)
using following values
as order
parameter:
-
memory_order::relaxed
-
memory_order::acq_rel
-
memory_order::seq_cst
bool compare_exchange_strong(T &expected, T desired,
memory_order success,
memory_order failure,
memory_scope scope = default_scope) const
For the following values as success
parameter:
-
memory_order::relaxed
-
memory_order::acq_rel
-
memory_order::seq_cst
For the following values as failure
parameter:
-
memory_order::relaxed
-
memory_order::acquire
-
memory_order::seq_cst
For the following values as scope
parameter:
-
memory_scope::work_item
-
memory_scope::sub_group
-
memory_scope::work_group
-
memory_scope::device
-
memory_scope::system
For equal values: check if it replaces the value of the referenced object with the value of desired
.
For unequal values: check if it assigns the original value of the referenced object to expected
.
Also check if it returns true
when the comparison operation was successful.
Check returned value type.
bool compare_exchange_strong(T &expected, T desired,
memory_order order =
default_read_modify_write_order) const
Same as above: Equivalent to compare_exchange_strong(expected, desired, order, order, scope)
using following values
as order
parameter:
-
memory_order::relaxed
-
memory_order::acq_rel
-
memory_order::seq_cst
For the following values as order
parameter:
-
memory_order::relaxed
-
memory_order::acq_rel
-
memory_order::seq_cst
For the following values as scope
parameter:
-
memory_scope::work_item
-
memory_scope::sub_group
-
memory_scope::work_group
-
memory_scope::device
-
memory_scope::system
Test following functions:
T fetch_add(difference_type operand,
memory_order order = default_read_modify_write_order,
memory_scope scope = default_scope) const
Check if it adds operand
to the value of the object referenced by this atomic_ref
and assigns the result to the
value of the referenced object.
If T
is Floating
check that new value of the referenced object is equal to (previous value + operand) +- epsilon
.
Check if it returns the original value of the referenced object. Check returned value type.
T operator+=(difference_type operand) const
Same as above: Equivalent to fetch_add(operand) + operand
.
T fetch_sub(difference_type operand,
memory_order order = default_read_modify_write_order,
memory_scope scope = default_scope) const
Check if it subtracts operand
from the value of the object referenced by this atomic_ref
and assigns the result to
the value of the referenced object.
If T
is Floating
check that new value of the referenced object is equal to (previous value + operand) +- epsilon
.
Check if it returns the original value of the referenced object. Check returned value type.
T operator-=(difference_type operand) const
Same as above: Equivalent to fetch_sub(operand) - operand
.
For the following values as order
parameter:
-
memory_order::relaxed
-
memory_order::acq_rel
-
memory_order::seq_cst
For the following values as scope
parameter:
-
memory_scope::work_item
-
memory_scope::sub_group
-
memory_scope::work_group
-
memory_scope::device
-
memory_scope::system
Test following functions if T
is Integral
, skip otherwise:
T fetch_and(T operand,
memory_order order = default_read_modify_write_order,
memory_scope scope = default_scope) const
Check if it performs a bitwise AND between operand
and the value of the object referenced by this atomic_ref
, and
assigns the result to the value of the referenced object. Check if it returns the original value of the
referenced object.
Check returned value type.
T operator&=(T operand) const
Same as above: Equivalent to fetch_and(operand) & operand
.
T fetch_or(T operand,
memory_order order = default_read_modify_write_order,
memory_scope scope = default_scope) const
Check if it performs a bitwise OR between operand
and the value of the object referenced by this atomic_ref
, and
assigns the result to the value of the referenced object. Check if it returns the original value of
the referenced object.
Check returned value type.
T operator|=(T operand) const
Same as above: Equivalent to fetch_or(operand) | operand
.
T fetch_xor(T operand,
memory_order order = default_read_modify_write_order,
memory_scope scope = default_scope) const
Check if it performs a bitwise XOR between operand
and the value of the object referenced by this atomic_ref
, and
assigns the result to the value of the referenced object. Check if it returns the original value of
the referenced object.
Check returned value type.
T operator^=(T operand) const
Same as above: Equivalent to fetch_xor(operand) ^ operand
.
Test following functions if T
is Integral
or ReferencedType*
, skip otherwise:
T operator++(int) const
Same as fetch_add
: Equivalent to fetch_add(1)
.
T operator++() const
Same as fetch_add
: Equivalent to fetch_add(1) + 1
.
T operator--(int) const
Same as fetch_sub
: Equivalent to fetch_sub(1)
.
T operator--() const
Same as fetch_sub
: Equivalent to fetch_sub(1) - 1.
Additional member functions available on an object of type atomic_ref<T>
for integral and floating T
For the following values as order
parameter:
-
memory_order::relaxed
-
memory_order::acq_rel
-
memory_order::seq_cst
For the following values as scope
parameter:
-
memory_scope::work_item
-
memory_scope::sub_group
-
memory_scope::work_group
-
memory_scope::device
-
memory_scope::system
Test following functions if T
is Integral
or Floating
, skip otherwise:
T fetch_min(T operand,
memory_order order = default_read_modify_write_order,
memory_scope scope = default_scope) const
Check if it computes the minimum of operand
and the value of the object referenced by this atomic_ref
,
and assigns the result to the value of the referenced object.
If T
is Floating
check that new value of the referenced object is equal to (minimum value + operand) +- epsilon
.
Check if it returns the original value of the referenced object. Check returned value type.
T fetch_max(T operand,
memory_order order = default_read_modify_write_order,
memory_scope scope = default_scope) const
Check if it computes the maximum of operand
and the value of the object referenced by this atomic_ref
,
and assigns the result to the value of the referenced object.
If T
is Floating
check that new value of the referenced object is equal to (maximum value + operand) +- epsilon
.
Check if it returns the original value of the referenced object. Check returned value type.
This test uses a simple-form parallel_for
to atomically update a single scalar value from all work-items:
TYPE val{};
sycl::buffer buf{&val, {1}};
q.submit([&](sycl::handler &cgh) {
sycl::accessor acc{buf, cgh};
cgh.parallel_for({SIZE}, [=](auto i) {
sycl::atomic_ref<TYPE, ORDER, SCOPE, SPACE> a_r{acc[0]};
a_r.fetch_add(2);
});
});
Verify that val == SIZE*2
after the loop completes.
(The result must be accurate within some small error range when TYPE
is a floating point type.)
This test is repeated for all combinations of supported values listed below:
These TYPE
types:
-
int
-
float
-
long long
(ifsizeof(long long) == 8
then only if the device hasaspect::atomic64
) -
double
(only if the device hasaspect::fp64
and ifsizeof(double) == 8
only if the device hasaspect::atomic64
)
These SCOPE
values which are supported by the device (according to info::device::atomic_memory_scope_capabilities
):
-
memory_scope::device
-
memory_scope::system
These ORDER
values which are supported by the device (according to info::device::atomic_memory_order_capabilities
):
-
memory_order::relaxed
-
memory_order::acq_rel
-
memory_order::seq_cst
These SPACE
values:
-
access::address_space::global_space
-
access::address_space::generic_space
This test uses a parallel_for
with nd_range
parameter to atomically update array, each element from all work-items in one work-group:
std::array<TYPE, GROUP_RANGE> vals;
vals.fill(0);
sycl::buffer buf{vals.data(), {GROUP_RANGE}};
q.submit([&](sycl::handler &cgh) {
sycl::accessor acc{buf, cgh};
sycl::local_accessor<TYPE> lacc{{1}, cgh}
cgh.parallel_for(sycl::nd_range(GROUP_RANGE * LOCAL_SIZE, LOCAL_SIZE),
[=](auto item) {
sycl::atomic_ref<TYPE, ORDER, SCOPE, SPACE> a_r{lacc[0]};
if (a_r.fetch_sub(2) == -(LOCAL_SIZE*2)) {
acc[item.group_id()] = a_r.load();
}
});
});
Verify that val[i] == -LOCAL_SIZE*2
after the loop completes.
(The result must be accurate within some small error range when TYPE
is a floating point type.)
This test is repeated for all combinations of supported values listed below:
These TYPE
types:
-
int
-
float
-
long long
(ifsizeof(long long) == 8
then only if the device hasaspect::atomic64
) -
double
(only if the device hasaspect::fp64
and ifsizeof(double) == 8
only if the device hasaspect::atomic64
)
These SCOPE
values which are supported by the device (according to info::device::atomic_memory_scope_capabilities
):
-
memory_scope::device
-
memory_scope::system
-
memory_scope::work_group
These ORDER
values which are supported by the device (according to info::device::atomic_memory_order_capabilities
):
-
memory_order::relaxed
-
memory_order::acq_rel
-
memory_order::seq_cst
These SPACE
values:
-
access::address_space::local_space
-
access::address_space::generic_space
This test uses a parallel_for
with nd_range
parameter to atomically update shared scalars in two threads within local group.
const TYPE val chaged_val = 42;
sycl::nd_range nd_range(GLOBAL_RANGE, 2);
std::array<bool, GLOBAL_RANGE / 2> res;
res.fill(false);
sycl::buffer buf{res.data(), {GLOBAL_RANGE/2}};
q.submit([&](sycl::handler &cgh) {
sycl::accessor res_acc{buf, cgh};
sycl::local_accessor<TYPE, 0> x{cgh};
sycl::local_accessor<TYPE, 0> y{cgh};
sycl::local_accessor<TYPE, 0> A{cgh};
sycl::local_accessor<TYPE, 0> B{cgh};
cgh.parallel_for(nd_range, sycl::range(2), [=](auto item) {
sycl::atomic_ref<TYPE, ORDER, SCOPE, SPACE> refA{A};
sycl::atomic_ref<TYPE, ORDER, SCOPE, SPACE> refB{B};
refA.store(0);
refB.store(0);
sycl::group_barrier(item.get_group());
if (item.get_local_id() == 0) {
x = refA.load();
refB.store(1);
} else {
y = refB.load();
refA.store(1);
}
sycl::group_barrier(item.get_group());
if (item.get_local_id() == 0)
res_acc[item.get_group__id()] = !(x == 1 && y == 1);
});
});
Verify that all elements of res is true
after the loop completes.
This test is repeated for all combinations of supported values listed below:
These TYPE
types:
-
int
-
float
-
long long
(ifsizeof(long long) == 8
then only if the device hasaspect::atomic64
) -
double
(only if the device hasaspect::fp64
and ifsizeof(double) == 8
only if the device hasaspect::atomic64
)
These SCOPE
values which are supported by the device (according to info::device::atomic_memory_scope_capabilities
):
-
memory_scope::device
-
memory_scope::system
-
memory_scope::work_group
These ORDER
values which are supported by the device (according to info::device::atomic_memory_order_capabilities
):
-
memory_order::acq_rel
-
memory_order::seq_cst
These SPACE
values:
-
access::address_space::local_space
-
access::address_space::generic_space
This test uses a parallel_for
with nd_range
parameter to atomically read scalar in one thread,
and then aquire all changes from it in another thread within local group.
size_t local_range = q.get_info<sycl::info::device::max_work_group_size>();
sycl::nd_range nd_range(GLOBAL_RANGE, local_range);
std::array<bool, GLOBAL_RANGE> res;
res.fill(false);
sycl::buffer buf{res.data(), {GLOBAL_RANGE}};
q.submit([&](sycl::handler &cgh) {
sycl::accessor res_acc{buf, cgh};
sycl::local_accessor<TYPE, 0> local_acc{cgh};
sycl::local_accessor<bool> arr_acc{{local_range}, cgh};
cgh.parallel_for(nd_range, [=](auto item) {
arr_acc[item.get_local_id()] = false;
sycl::group_barrier(item.get_group());
sycl::atomic_ref<TYPE, memory_order::relaxed, SCOPE, SPACE> a_r{local_acc};
arr_acc[item.get_local_id()] = true;
a_r.store(item.get_local_id(), ORDER1);
res_acc[item.get_global_id()] = (arr_acc[a_r.load(ORDER2)] == true);
});
});
Verify that all elements of res is true
after the loop completes.
This test is repeated for all combinations of supported values listed below:
These TYPE
types:
-
int
-
float
-
long long
(ifsizeof(long long) == 8
then only if the device hasaspect::atomic64
) -
double
(only if the device hasaspect::fp64
and ifsizeof(double) == 8
only if the device hasaspect::atomic64
)
These SCOPE
values which are supported by the device (according to info::device::atomic_memory_scope_capabilities
):
-
memory_scope::device
-
memory_scope::system
-
memory_scope::work_group
These ORDER1
and ORDER2
values which are supported by the device (according to info::device::atomic_memory_order_capabilities
):
-
ORDER1
ismemory_order::release
andORDER2
ismemory_order::acquire
-
ORDER1
andORDER2
are bothmemory_order::seq_cst
These SPACE
values:
-
access::address_space::local_space
-
access::address_space::generic_space
This check should use #ifdef __cpp_lib_atomic_ref
so it can be skipped
if std::atomic_ref is not supported.
This check is skipped if device doesn’t support aspect::usm_atomic_shared_allocations
.
This check is skipped if device doesn’t support memory_scope::system
in info::device::atomic_memory_scope_capabilities
.
This test uses a simple-form parallel_for
to atomically update a single scalar value from all work-items and simultainiously update it on host:
TYPE *pval = sycl::malloc_shared<TYPE>(1, q);
*pval = 0;
std::atomic_ref a_host{*pval};
auto event = q.submit([&](sycl::handler &cgh) {
cgh.parallel_for({SIZE}, [=](auto i) {
sycl::atomic_ref<TYPE, ORDER, sycl::memory_scope::system, SPACE> a_dev{*pval};
a_dev++;
});
});
for (int i = 0; i < COUNT; i++)
a_host++;
event.wait();
Verify that *pval == SIZE+COUNT
.
This test is repeated for all combinations of supported values listed below:
These ORDER
values which are supported by the device (according to info::device::atomic_memory_order_capabilities
):
-
memory_order::relaxed
-
memory_order::acq_rel
-
memory_order::seq_cst
These SPACE
values:
-
access::address_space::global_space
-
access::address_space::generic_space
Run the same test as above with malloc_host
instead of malloc_shared
if device supports aspect::usm_atomic_host_allocations
.