Documentation of TMA: part 2 (#2041)
This PR does the following:
- Adds more text to `doc/dev/tma.md` describing how to create a box.
- Substantially extends `doc/reading/divisibility-of-split.md` with a discussion of
indivisible splits.
- Adds `doc/reading/tma-modeling-in-depth.md`, a deeper and more formal
discussion of predication and correctness of TMA.

Suggested order of reading:
1. https://github.com/NVIDIA/Fuser/blob/tma-step-2/doc/reading/divisibility-of-split.md
2. https://github.com/NVIDIA/Fuser/blob/tma-step-2/doc/dev/tma.md
3. https://github.com/NVIDIA/Fuser/blob/tma-step-2/doc/reading/tma-modeling-in-depth.md
zasdfgbnm authored Apr 15, 2024
1 parent a2524a5 commit edda94b
Showing 15 changed files with 11,625 additions and 159 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -49,4 +49,5 @@ foo.bin

*_generated.*

# Mac OS internal file
.DS_Store
1 change: 1 addition & 0 deletions csrc/non_divisible_split.h
@@ -15,6 +15,7 @@

namespace nvfuser {

//! See doc/reading/divisibility-of-split.md#predication
//! If an IterDomain is split and its inner output domain is
//! eventually split too, the second split must be divisible or the
//! inner domain must be predicated. This class finds Split
24 changes: 22 additions & 2 deletions doc/dev/tma.md
@@ -4,7 +4,7 @@
* SPDX-License-Identifier: BSD-3-Clause
-->

# TMA support in nvFuser
# TMA Support in NVFuser

## Introduction

@@ -72,6 +72,26 @@ Instead, it is a virtual domain that only exists in the user's mind.
Also note that the IterDomain expressions between the global tensor's allocation domain and the TMA domain must be a view.
For example, we cannot merge discontiguous IterDomains ([why?](../reading/divisibility-of-split.md#merging-discontiguous-iterdomains)), and we cannot have indivisible splits either.

### Step 2: create box
### Step 2: Define box

After scheduling a TMA domain, the next step is to define the box.
There are two ways to define a box: partitioning and compositing.

#### Define box by partitioning

Defining a box by partitioning is simple: select an IterDomain in the TMA domain, then
inner-split that IterDomain by the box size of that dimension.

We call this split expression a "*boxing split*", the input of this split a "*partitioned IterDomain*",
the inner output of this split a "*box IterDomain*", and the outer output of this split a "*coordinate IterDomain*".
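
The terminology above can be illustrated with a small standalone sketch. It is not nvFuser code: `BoxingSplit`, `SplitIndex`, `boxingSplit`, and `splitIndex` are hypothetical names introduced only for this example, and the sketch assumes that an inner split by the box size yields an outer extent of `ceilDiv(extent, box_size)` and an inner extent of `box_size`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical illustration types; these are NOT part of nvFuser's API.
struct BoxingSplit {
  int64_t coordinate_extent; // extent of the outer output (coordinate IterDomain)
  int64_t box_extent;        // extent of the inner output (box IterDomain)
};

struct SplitIndex {
  int64_t coordinate; // index into the coordinate IterDomain
  int64_t box;        // index into the box IterDomain
};

inline int64_t ceilDiv(int64_t a, int64_t b) {
  return (a + b - 1) / b;
}

// Inner-split a partitioned IterDomain of the given extent by the box size.
inline BoxingSplit boxingSplit(int64_t partitioned_extent, int64_t box_size) {
  assert(partitioned_extent > 0 && box_size > 0);
  return {ceilDiv(partitioned_extent, box_size), box_size};
}

// Map an index of the partitioned IterDomain to (coordinate, box) indices.
inline SplitIndex splitIndex(int64_t i, int64_t box_size) {
  assert(i >= 0 && box_size > 0);
  return {i / box_size, i % box_size};
}
```

For instance, under these assumptions a partitioned IterDomain of extent 12 with box size 4 gives a coordinate IterDomain of extent 3, and element 9 falls in box coordinate 2 at offset 1 within that box.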

For the case of Figure 1, if both box dimensions are defined by partitioning,
the schedule should look like Figure 3 below:

![Figure 3: Boxing by partitioning](tma/box-by-partitioning.svg)

Note that although the split in the above example is divisible, this does not have to be the case in general.
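
To make the indivisible case concrete, here is a rough standalone sketch of the arithmetic. It is not nvFuser code: `numBoxes` and `inBoundElements` are hypothetical helpers that only count how many elements of each box fall inside the partitioned IterDomain. How the out-of-bound part of the last box is handled is not modeled here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical helper: number of boxes needed to cover `extent` elements.
// This equals the extent of the coordinate IterDomain (ceiling division).
inline int64_t numBoxes(int64_t extent, int64_t box_size) {
  assert(extent > 0 && box_size > 0);
  return (extent + box_size - 1) / box_size;
}

// Hypothetical helper: number of elements of the box at `coordinate`
// that lie inside the partitioned IterDomain of size `extent`.
inline int64_t inBoundElements(
    int64_t extent,
    int64_t box_size,
    int64_t coordinate) {
  assert(coordinate >= 0 && coordinate < numBoxes(extent, box_size));
  const int64_t start = coordinate * box_size;
  // Every box is full except possibly the last one of an indivisible split.
  return std::min(box_size, extent - start);
}
```

With an extent of 10 and a box size of 4, the split is indivisible: three boxes are needed, the first two are full, and the last one covers only 2 in-bound elements.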

#### Define box by compositing

TODO: this documentation is under construction
317 changes: 317 additions & 0 deletions doc/dev/tma/box-by-partitioning.svg