Consider the following gtclang solver-like example:
There is a race condition on which value of `b` is read in the second statement. For this reason, a kernel-split between the second and the third statement is required.
For this code to be translated correctly, the three statements must be executed sequentially within each iteration of the k-loop. A k-loop-split between any of these statements would break the translation.
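To make the two constraints concrete, here is a minimal Python sketch. The three-statement stencil in it is hypothetical and chosen only to reproduce the same dependency pattern; it is not the example this issue refers to. All names (`make_fields`, `fused_kernel`, `k_loop_split`) are illustrative assumptions, and a GPU kernel is modelled as a plain loop over thread indices in some order.

```python
# Hypothetical three-statement stencil (an assumption, not the issue's example).
# i is the horizontal index, k the vertical level:
#   s1: a[i][k] = b[i][k-1]
#   s2: c[i][k] = b[i+1][k]
#   s3: b[i][k] = a[i][k] + c[i][k]
NI, NK = 4, 3
DOMAIN = range(NI - 1)  # i + 1 must stay in bounds


def make_fields():
    b = [[10 * i + k for k in range(NK)] for i in range(NI)]
    a = [[0] * NK for _ in range(NI)]
    c = [[0] * NK for _ in range(NI)]
    return a, b, c


def reference():
    """Intended semantics: per level, each statement finishes for all i first."""
    a, b, c = make_fields()
    for k in range(1, NK):
        for i in DOMAIN:
            a[i][k] = b[i][k - 1]
        for i in DOMAIN:
            c[i][k] = b[i + 1][k]
        for i in DOMAIN:
            b[i][k] = a[i][k] + c[i][k]
    return b


def fused_kernel(thread_order):
    """One kernel for s1..s3: each thread i runs all three statements."""
    a, b, c = make_fields()
    for k in range(1, NK):
        for i in thread_order:
            a[i][k] = b[i][k - 1]
            c[i][k] = b[i + 1][k]  # may already see thread i+1's s3 result
            b[i][k] = a[i][k] + c[i][k]
    return b


def k_loop_split():
    """s1 and s2 over all levels first, then s3 over all levels."""
    a, b, c = make_fields()
    for k in range(1, NK):
        for i in DOMAIN:
            a[i][k] = b[i][k - 1]  # reads b before s3 has updated level k-1
            c[i][k] = b[i + 1][k]
    for k in range(1, NK):
        for i in DOMAIN:
            b[i][k] = a[i][k] + c[i][k]
    return b


ascending = list(DOMAIN)
descending = list(reversed(DOMAIN))
print(fused_kernel(ascending) == reference())   # True
print(fused_kernel(descending) == reference())  # False: schedule-dependent result
print(k_loop_split() == reference())            # False: vertical dependency broken
```

In this sketch the fused kernel gives the right answer only for one thread schedule, which is the race, while splitting the k-loop gives a wrong answer regardless of schedule.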
Unfortunately, in structured codes, a multistage (MS) corresponds to both one kernel and one k-loop. An MS-split would therefore inevitably be both a kernel-split and a k-loop-split, which makes this code impossible to translate with the current IIR design.
(Running the above example through the toolchain produces no split, so the race condition persists.)
The problem hasn't appeared in unstructured codes yet, because we don't currently support translation of sequential k-loops, but we need to think carefully about how to handle it before we jump into the implementation.
The correct translation should be to do the k-loop on the host and launch separate kernels within each iteration.
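A rough Python model of that translation, under stated assumptions: the stencil is a hypothetical one with the same dependency pattern (statement 1 reads `b` at `k-1`, statement 2 reads `b` at a horizontal neighbour, statement 3 writes `b`), `launch` is an illustrative stand-in for a kernel launch whose return acts as the barrier between kernels, and none of these names come from the toolchain.

```python
import random

NI, NK = 4, 3  # horizontal size and number of vertical levels (illustrative)


def launch(body, thread_order):
    """Model one kernel launch: every thread runs `body`; returning from
    this function stands in for the synchronisation between kernels."""
    for i in thread_order:
        body(i)


def host_k_loop(thread_order):
    b = [[10 * i + k for k in range(NK)] for i in range(NI)]
    a = [[0] * NK for _ in range(NI)]
    c = [[0] * NK for _ in range(NI)]

    for k in range(1, NK):  # sequential k-loop lives on the host
        def kernel_1(i):
            a[i][k] = b[i][k - 1]   # statement 1
            c[i][k] = b[i + 1][k]   # statement 2: reads b before any write

        def kernel_2(i):
            b[i][k] = a[i][k] + c[i][k]  # statement 3: writes b

        launch(kernel_1, thread_order)  # kernel-split between 2 and 3 ...
        launch(kernel_2, thread_order)  # ... but both stay inside level k
    return b


# The result should not depend on how threads are scheduled within a kernel.
threads = list(range(NI - 1))
results = []
for seed in range(5):
    order = threads[:]
    random.Random(seed).shuffle(order)
    results.append(host_k_loop(order))

print(all(r == results[0] for r in results))  # True
```

Because each level launches two kernels, the kernel-split between the second and third statements no longer implies a k-loop-split, and the vertical dependency of statement 1 on the previous level's statement 3 is preserved.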