Commit
This is a follow-up to #3795. Instead of naively inserting a single `tcgen05.alloc` at the beginning, we now run a real analysis on the TMem tensors and generate the correct number of `tcgen05.alloc`s based on that analysis.

As noted in `[Tensor Memory Allocation]`, allocating TMem can be a very hard problem, and at this stage it does not make sense to invest time in writing a perfect allocator. So the goal of this PR is to provide a solution that is hackable (so that, when we want to try different allocation strategies in the future, we can easily hack our codebase to do so) and extensible (so that, when we have a better idea of what a good allocation strategy is, most of the code developed in this PR can still be reused instead of being abandoned and rewritten from scratch).

With this goal in mind, this PR adds a way to represent "how we want to allocate TMem" (`struct TMemAlllocationInfo`), a lowering pass that translates this representation into kernel IR, and a naive heuristic that generates a simple `TMemAlllocationInfo`.

Regarding the topic of "allocating TMem", I believe the only thing missing after this PR is the insertion of `tcgen05.dealloc`s, which will come in the next PR. We might want to revisit this topic once we start looking at perf, but until then, I consider the topic of "allocating TMem" done after the next PR.

Note that the allocation size is hard coded to be "whole 32 columns" for now. This is clearly wrong, but I would categorize this task under the topic "the scheduling and indexing of TMem", which is the next thing I will work on after the "allocating TMem" topic is done.

I suggest starting the review of this PR from the code comment in `csrc/device_lower/analysis/tensor_memory.h`.
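To make the three pieces described above concrete (an allocation description, a naive heuristic that fills it in, and a lowering step that consumes it), here is a minimal standalone sketch. It is not the code in this PR: every name in it (`TMemTensorSketch`, `RegionSketch`, `AllocationInfoSketch`, `naiveAllocation`, `lowerAllocations`) is hypothetical, the "one region per tensor" policy is an assumption made only for illustration, and the emitted strings are placeholders rather than real PTX or kernel IR.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a tensor that lives in tensor memory (TMem).
struct TMemTensorSketch {
  std::string name;
};

// In this sketch, one region corresponds to one tcgen05.alloc.
struct RegionSketch {
  int num_columns;                        // columns requested for the region
  std::vector<std::string> tensor_names;  // tensors placed in the region
};

// Illustrative analogue of "how we want to allocate TMem": a list of regions.
struct AllocationInfoSketch {
  std::vector<RegionSketch> regions;
};

// Mirrors the hard-coded "whole 32 columns" size mentioned in the message.
constexpr int kHardCodedColumns = 32;

// Assumed naive policy: give every tensor its own fixed-size region.
AllocationInfoSketch naiveAllocation(
    const std::vector<TMemTensorSketch>& tensors) {
  AllocationInfoSketch info;
  for (const auto& t : tensors) {
    info.regions.push_back(RegionSketch{kHardCodedColumns, {t.name}});
  }
  return info;
}

// Stand-in for the lowering pass: emit one placeholder line per region.
// The text is illustrative only; it is not valid PTX.
std::vector<std::string> lowerAllocations(const AllocationInfoSketch& info) {
  std::vector<std::string> lines;
  for (std::size_t i = 0; i < info.regions.size(); ++i) {
    const RegionSketch& r = info.regions[i];
    assert(r.num_columns > 0);  // an empty region would be a heuristic bug
    lines.push_back(
        "tcgen05.alloc region" + std::to_string(i) + ", " +
        std::to_string(r.num_columns) + " columns");
  }
  return lines;
}
```

The point of the split is that `lowerAllocations` only reads the description, so a different heuristic could replace `naiveAllocation` without touching the lowering step.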
Showing 8 changed files with 354 additions and 53 deletions.