regarding the Inter-Pooling Mechanism #2

Open
XiaoqiWang opened this issue Oct 24, 2023 · 1 comment

Comments

@XiaoqiWang

Hello, I have a question about how dimensions are handled after Inter-Pooling. Specifically, the original batch size B becomes B f^2 after the pooling operations, so each image now occupies f^2 entries along the batch dimension. The paper does not seem to explain in detail how this altered batch dimension is handled in subsequent operations, e.g., how the batch size is converted from B f^2 back to B while the spatial dimensions are downsampled.

@StriveZs
Owner

Hi Wang,
This is Zander. Thanks for your insightful comment and your attention to our work.
For the Q, K, and V derived from the raw Q, K, and V, we adopt two different strategies:

  • First, for Q, we apply a rearrange operation to change its dimensions, e.g., [B, C, H, W] -> [B f^2, C, H/f, W/f], which preserves all of the information without loss.
  • Second, for K and V, we first downsample the spatial dimensions by convolution (e.g., [B, C, H, W] -> [B, C, H/f, W/f]) and then apply a repeat operation to expand the batch dimension, e.g., [B, C, H/f, W/f] -> [B f^2, C, H/f, W/f], so that their shape matches Q while the information they carry differs from Q.
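
The two shape changes above can be sketched in NumPy as follows. This is only an illustration under stated assumptions, not the repository's code: the function names `split_into_batch` and `downsample_and_repeat` are hypothetical, the exact ordering in which the f x f offsets are folded into the batch axis is a guess, and average pooling stands in for the strided convolution so that the example needs no learned weights.

```python
import numpy as np


def split_into_batch(x, f):
    """Assumed layout for Q: [B, C, H, W] -> [B*f*f, C, H/f, W/f], no values dropped."""
    B, C, H, W = x.shape
    assert H % f == 0 and W % f == 0, "H and W must be divisible by f"
    # Split each spatial axis into (coarse index, offset inside an f x f cell).
    x = x.reshape(B, C, H // f, f, W // f, f)
    # Move both offset axes next to the batch axis: [B, f, f, C, H/f, W/f].
    x = x.transpose(0, 3, 5, 1, 2, 4)
    # Fold the offsets into the batch axis.
    return x.reshape(B * f * f, C, H // f, W // f)


def downsample_and_repeat(x, f):
    """Sketch for K and V: [B, C, H, W] -> [B, C, H/f, W/f] -> [B*f*f, C, H/f, W/f]."""
    B, C, H, W = x.shape
    assert H % f == 0 and W % f == 0, "H and W must be divisible by f"
    # Stand-in for the strided convolution (assumption): average over f x f cells.
    pooled = x.reshape(B, C, H // f, f, W // f, f).mean(axis=(3, 5))
    # Repeat every sample f*f times along the batch axis so it lines up with Q.
    return np.repeat(pooled, f * f, axis=0)


if __name__ == "__main__":
    B, C, H, W, f = 2, 3, 4, 4, 2
    x = np.arange(B * C * H * W, dtype=np.float64).reshape(B, C, H, W)
    q = split_into_batch(x, f)
    k = downsample_and_repeat(x, f)
    print(q.shape, k.shape)
```

With this layout, the f^2 consecutive batch entries starting at index b * f^2 all come from image b in both tensors, so Q and K/V stay aligned along the batch axis. How the batch dimension is later folded back from B f^2 to B is not covered by this sketch.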

I hope this answer clears up your confusion. If you have any further questions, please let me know. Thanks!
