
source code question #1

Open
lemonc1014 opened this issue Nov 30, 2022 · 3 comments

Comments

@lemonc1014

Hello!
Will the source code be released? If so, is there a specific timeline?

@zehuichen123
Owner

Thanks for your attention to our work. Since the experiments were done on two different BEVFormer codebases (an internal codebase at SenseTime for the single-frame setting and the official codebase for multi-frame), we are going to merge the code into the official BEVFormer repository before releasing it. Please expect the code after the ICCV deadline :)

@sujinjang

Hello! While I'm waiting for the source code release, I have some questions about the modifications to the teacher model. In the paper, you mention that the DGCNN attention is replaced with a vanilla multi-scale attention module and that pre-trained CenterPoint weights are used for initialization. Could you please provide a bit more detail here? For example, did you simply replace `DGCNNAttn` with `MultiheadAttention` in the original voxel config (https://github.com/WangYueFt/detr3d/blob/main/projects/configs/obj_dgcnn/voxel.py#L88), as in the sketch below? Also, which CenterPoint weights did you use?
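To make the first question concrete, this is the kind of edit I have in mind (a minimal sketch; the hyperparameters follow the DETR-style configs in mmdetection and are my guesses, not values confirmed by the paper):

```python
# Hypothetical sketch: swapping the attention type in the obj_dgcnn
# voxel config. The original attn_cfgs entry looks roughly like:
#   dict(type='DGCNNAttn', embed_dims=256, num_heads=8, dropout=0.1)
# and my question is whether it was simply replaced with:
attn_cfg = dict(
    type='MultiheadAttention',  # vanilla attention from the mmcv registry
    embed_dims=256,
    num_heads=8,
    dropout=0.1)
```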

My other question is about the "BEV feature" extracted from the teacher model. The original obj_dgcnn implementation flattens the multi-scale features ([128x128], [64x64], [32x32], [16x16]) into a single [21760x256] feature that is fed to the transformer encoder (`DetrTransformerEncoder`, https://github.com/WangYueFt/detr3d/blob/main/projects/configs/obj_dgcnn/voxel.py#L71), and the encoder outputs a memory of the same [21760x256] shape. How did you extract a bev_feature of [bev_h x bev_w x 256] from this memory? My current guess is sketched below. This seems to be the most important part of the teacher model.
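For reference, here is how I imagine the per-level BEV map could be recovered from the flattened memory (a sketch only; `memory` is a random stand-in here, and picking the finest [128x128] level is my assumption, not something stated in the paper):

```python
import torch

# The encoder memory flattens four BEV scales into one token dimension:
# 128*128 + 64*64 + 32*32 + 16*16 = 21760 tokens of dim 256.
level_shapes = [(128, 128), (64, 64), (32, 32), (16, 16)]
memory = torch.randn(21760, 256)  # stand-in for the encoder output

# Split the token dimension back into per-level chunks,
splits = [h * w for h, w in level_shapes]
levels = torch.split(memory, splits, dim=0)

# then reshape, e.g., the finest level into a [128, 128, 256] BEV feature.
bev_h, bev_w = level_shapes[0]
bev_feature = levels[0].reshape(bev_h, bev_w, 256)
print(bev_feature.shape)  # torch.Size([128, 128, 256])
```

Is this roughly what was done, or is the memory aggregated across levels instead?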

@Li-Whasaka

Has the code been released yet?
