-
Notifications
You must be signed in to change notification settings - Fork 242
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[HOTFIX] Fix .altmacro compile issues #3490
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Edit: Ill re-review once the additional changes are complete.
src/kernels/conv1x1u_bias_activ.s
Outdated
@@ -1052,7 +1052,7 @@ last_wave: | |||
s_lshl_b32 s[\dst], s[\src0], 16 | |||
s_or_b32 s[\dst], s[\tmp], s[\dst] | |||
.endm | |||
.macro bias_f base, bias, k, stmp0, stmp1 | |||
.macro bias_f base, k, stmp0, stmp1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please check line 1069
s_pack_ll_f \stmp, bias+k_off_1, bias+k_off_1, \stmp1
looks like it should be
s_pack_ll_f \stmp0, bias+k_off_1, bias+k_off_1, \stmp1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch! Also added arg_
prefixes to base
and k
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wow nice find.
Wouldn't this have been broken before the .altmacro changes as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Strange a bit, but these changes didn't alter anything in .text section in the output binary
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, based on what I see it was just inactive code branch. In any other case, it would not be possible to change the number of macro arguments without harming the program logic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I meant \stmp change didn't affect the binary, macro arguments did.
The Root Cause
.altmacro
support has been extended in ROCm 6.4 leading to macro arguments are expanded even if they are not preceded by a\
. Previously symbols not preceded by a\
were intended to be interpreted as global symbols rather than macro arguments. This is no longer a good practice.Compiling and Checking the Output Binaries
conv1x1u_bias_activ.s
There are 7 sets of compiling parameters:
Extracting sections:
/opt/rocm/llvm/bin/llvm-objdump --full-contents conv1x1u_bias_activ.o > conv1x1u_bias_activ.txt
xform_data.s
There are 5 sets of compiling parameters:
Extracting sections:
/opt/rocm/llvm/bin/llvm-objdump --full-contents xform_data.o > xform_data.txt
Conv_Winograd_Fury_v2_4_1_fp16_fp16acc_f2x3_c16_stride1.s
There are 2 sets of compiling parameters:
Extracting sections:
/opt/rocm/llvm/bin/llvm-objdump --full-contents Conv_Winograd_Fury_v2_4_1_fp16_fp16acc_f2x3_c16_stride1.o > Conv_Winograd_Fury_v2_4_1_fp16_fp16acc_f2x3_c16_stride1.txt
Conv_Winograd_Fury_v2_4_1_fp16_fp16acc_f2x3_c32_stride1.s
Compiling:
Extracting sections:
/opt/rocm/llvm/bin/llvm-objdump --full-contents Conv_Winograd_Fury_v2_4_1_fp16_fp16acc_f2x3_c32_stride1.o > Conv_Winograd_Fury_v2_4_1_fp16_fp16acc_f2x3_c32_stride1.txt