regarding the Inter-Pooling Mechanism #2

XiaoqiWang · 2023-10-24T04:51:29Z

Hello, I have a question about how dimensions are handled after Inter-Pooling. Specifically, the original batch size B becomes Bf^2 after the pooling operations, meaning each image now occupies f^2 entries in the batch dimension. The paper does not seem to explain in detail how this altered batch dimension is handled in subsequent operations, e.g., how the batch size is converted from Bf^2 back to B while the spatial dimension is downsampled.

StriveZs · 2023-10-24T05:10:51Z

Hi Wang,
This is Zander. Thanks for your insightful comment and your interest in our work.
For the Q, K and V derived from the raw Q, K and V, we adopt two different strategies to generate them.

First, for Q, we adopt a rearrange operation to change the dimensions of Q, e.g., [B, C, H, W] -> [B f^2, C, H/f, W/f], which preserves the information without loss.
Second, for K and V, after downsampling the spatial dimension by convolution (e.g., [B, C, H, W] -> [B, C, H/f, W/f]), we adopt a repeat operation to expand the dimension of K and V, e.g., [B, C, H/f, W/f] -> [B f^2, C, H/f, W/f]. Unlike the rearrangement used for Q, this increases the information dimension by repetition.
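
As a rough illustration of the two shape changes described above, here is a minimal NumPy sketch. It is not the repository's code. The function names `rearrange_q` and `downsample_and_repeat_kv` are hypothetical, the element ordering in the rearrangement is an assumption, and f x f average pooling is used only as a stand-in for the convolutional downsampling.

```python
import numpy as np


def rearrange_q(q, f):
    """[B, C, H, W] -> [B*f^2, C, H/f, W/f], keeping every element (no loss).

    Assumed ordering: each of the f*f pixel offsets inside an f x f cell
    becomes its own batch entry. The actual ordering may differ.
    """
    B, C, H, W = q.shape
    assert H % f == 0 and W % f == 0, "H and W must be divisible by f"
    q = q.reshape(B, C, H // f, f, W // f, f)
    q = q.transpose(0, 3, 5, 1, 2, 4)  # (B, f, f, C, H/f, W/f)
    return q.reshape(B * f * f, C, H // f, W // f)


def downsample_and_repeat_kv(x, f):
    """[B, C, H, W] -> [B, C, H/f, W/f] -> [B*f^2, C, H/f, W/f].

    Average pooling stands in for the convolution (assumption); the repeat
    copies each downsampled map f*f times along the batch axis.
    """
    B, C, H, W = x.shape
    assert H % f == 0 and W % f == 0, "H and W must be divisible by f"
    pooled = x.reshape(B, C, H // f, f, W // f, f).mean(axis=(3, 5))
    return np.repeat(pooled, f * f, axis=0)


B, C, H, W, f = 2, 3, 4, 4, 2
q = np.arange(B * C * H * W, dtype=np.float64).reshape(B, C, H, W)
kv = np.random.rand(B, C, H, W)

q_out = rearrange_q(q, f)
kv_out = downsample_and_repeat_kv(kv, f)
print(q_out.shape)   # (8, 3, 2, 2)
print(kv_out.shape)  # (8, 3, 2, 2)
```

With this ordering, batch entries `b*f*f` through `(b+1)*f*f - 1` of both outputs belong to the original image `b`, so Q and the repeated K/V line up along the batch axis.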

I hope this clears up your confusion. If you have any further questions, please let me know. Thanks!
