Refactor code generation for pointwise operation & PointwiseDynamicFunction #167

Merged: 31 commits, merged on Sep 12, 2024
The file changes shown below are from 10 of the 31 commits.

Commits
eaba92b
Refactor code generation for pointwise operation & PointwiseDynamicFunction
iclementine Aug 15, 2024
c53f765
add config as a parameter for PointwiseDynamicFunction
iclementine Aug 16, 2024
a474aa3
add cases to test that manually instantiated overload does not support…
iclementine Aug 16, 2024
73eb77c
1. add each operand's stride order as tl.constexprs, which is needed …
iclementine Aug 16, 2024
932048e
add pointwise_dynamic into CI
iclementine Aug 16, 2024
24d5245
add a test for flip op with input that is not c-contiguous
iclementine Aug 16, 2024
b73dea9
fix a bug in stride computation from shape
iclementine Aug 19, 2024
27efaa3
1. add test case where operands have different stride order;
iclementine Aug 20, 2024
f842229
1. add test case where operands have different stride order;
iclementine Aug 20, 2024
41b844b
remove redundant code in FunctionSchema
iclementine Aug 20, 2024
8eac7f0
fix a bug in PointwiseDynamicFunction's cache key: string was used as…
iclementine Aug 21, 2024
9972067
change class name OPDesc -> FunctionSchema in some text messages
iclementine Aug 21, 2024
651d7d4
skip stride order computation when task rank is 1; use a slightly fas…
iclementine Aug 22, 2024
67767d8
add nd tile style codegen without block pointer
iclementine Aug 26, 2024
e4a8fde
fix typos
iclementine Aug 26, 2024
65e8400
merge upstream changes
iclementine Aug 26, 2024
55f437b
merge upstream changes
iclementine Aug 26, 2024
1e0de3f
fix conflict in gitignore
iclementine Aug 26, 2024
9ed7425
fix import of some functions to work in triton 3.0.0
iclementine Aug 26, 2024
3ccf20a
add test for tensors that require int64 indexing
iclementine Aug 27, 2024
8085cf9
add test_pointwise_dynamic into CI tests
iclementine Aug 27, 2024
b356e69
add codegen for using 1d tile
iclementine Aug 27, 2024
c047ce5
minor change to 1d tile code generation: move tile_id assignment into…
iclementine Aug 28, 2024
79b5370
merge upstream
iclementine Sep 9, 2024
4a2fb57
add test for shape_utils
iclementine Sep 9, 2024
df41cc7
add test for tensor wrapper
iclementine Sep 10, 2024
eadad1f
Merge branch 'master' of github.com:FlagOpen/FlagGems into nditer
iclementine Sep 10, 2024
5cf34a3
skip some tests when triton version is less than 3; prefer 1d tile wh…
iclementine Sep 10, 2024
d69cbb8
skip tests that require too much memory
iclementine Sep 10, 2024
f6c8052
do not upcast to float64 for test_floor_div
iclementine Sep 10, 2024
d0c0c52
add test_tensor_wrapper into CI steps
iclementine Sep 11, 2024
1 change: 1 addition & 0 deletions .github/workflows/python-test.yaml
@@ -37,6 +37,7 @@ jobs:
 "CUDA_VISIBLE_DEVICES=3 pytest -s tests/test_reduction_ops.py &"
 "CUDA_VISIBLE_DEVICES=4 pytest -s tests/test_special_ops.py &"
 "CUDA_VISIBLE_DEVICES=5 pytest -s tests/test_libentry.py &"
+"CUDA_VISIBLE_DEVICES=5 pytest -s tests/test_pointwise_dynamic.py &"
 )

 declare -a exit_statuses
1 change: 1 addition & 0 deletions .gitignore
@@ -21,3 +21,4 @@ build/
 # Project files, i.e. `.project`, `.actionScriptProperties` and `.flexProperties`
 # should NOT be excluded as they contain compiler settings and other important
 # information for Eclipse / Flash Builder.
+playground/
14 changes: 11 additions & 3 deletions src/flag_gems/ops/flip.py
@@ -4,11 +4,12 @@
 import triton

 from ..utils import pointwise_dynamic
+from ..utils.tensor_wrapper import StridedBuffer


 @pointwise_dynamic(is_tensor=[True], promotion_methods=[(0, "DEFAULT")])
 @triton.jit
-def flip_func(x, **kwargs):
+def copy_func(x):
     return x


@@ -29,10 +30,17 @@ def flip(A: torch.Tensor, dims) -> torch.Tensor:
     n = 0
     offset = 0
     for i in range(len(flip_dims_b)):
-        if flip_dims_b[i] and A.size()[i] > 1 and A.stride()[i] != 0:
+        if flip_dims_b[i] and A.size(i) > 1 and A.stride(i) != 0:
             offset += strides[i] * (A.shape[i] - 1)
             strides[i] = -strides[i]
             n += 1
     if n == 0 or A.numel() <= 1:
         return A.clone()
-    return flip_func(A, out0_offset=offset, out0_strides=strides)
+    out = torch.empty_like(A)
+    # a flipped view of A
+    flipped_A = StridedBuffer(A, strides=strides, offset=offset)
+
+    # TODO: flip op can have a custom task simplification method, but we skip it now and just use A's rank.
+    overload = copy_func.instantiate(A.ndim)
+    overload(flipped_A, out0=out)
+    return out
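
For reference, a small sanity check of the refactored flip on a non-contiguous input; this is not part of the diff, and the import path and test sizes are assumptions.

# Hypothetical check that the StridedBuffer-based flip matches torch.flip on a non-contiguous tensor.
import torch

from flag_gems.ops.flip import flip  # assumed import path, mirroring src/flag_gems/ops/flip.py

x = torch.randn(4, 5, 6, device="cuda").transpose(0, 1)  # non-c-contiguous input
out = flip(x, dims=(0, 2))
torch.testing.assert_close(out, torch.flip(x, dims=(0, 2)))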
3 changes: 3 additions & 0 deletions src/flag_gems/utils/code_utils.py
@@ -102,6 +102,9 @@ def writelines(self, lines):
         for line in lines:
             self.writeline(line)

+    def writemultiline(self, s):
+        self.writelines(s.splitlines())
+
     def indent(self, offset=1):
         @contextlib.contextmanager
         def ctx():
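
The new writemultiline helper splits a multi-line string and feeds each line through the existing writeline path, so indentation handling stays in one place. A standalone toy sketch of the pattern; the class below is illustrative, not the actual writer class in code_utils.py.

# Toy writer reproducing the writemultiline pattern added above; not the library's real class.
class ToyWriter:
    def __init__(self):
        self.lines = []

    def writeline(self, line):
        self.lines.append(line)

    def writelines(self, lines):
        for line in lines:
            self.writeline(line)

    def writemultiline(self, s):
        self.writelines(s.splitlines())


w = ToyWriter()
w.writemultiline("import triton\nimport triton.language as tl")
assert w.lines == ["import triton", "import triton.language as tl"]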