
Official Review #1

Open · micronet-challenge-submissions opened this issue Oct 23, 2019 · 10 comments

@micronet-challenge-submissions (Collaborator) commented Oct 23, 2019

Hello! Thanks so much for your entry! We've successfully evaluated your checkpoint and the quality checks out! And we'd like to say that we greatly appreciate the organization and quality of the code.

One question on your quantization scoring: in your report you say that you count the additions and multiplications separately, but in flops_counter.py it looks like you sum them together and scale both by the reduced-precision factor:

Linear/Conv Counting:
https://github.com/yashbhalgat/QualcommAI-MicroNet-submission-MixNet/blob/master/lsq_quantizer/flops_counter.py#L286

https://github.com/yashbhalgat/QualcommAI-MicroNet-submission-MixNet/blob/master/lsq_quantizer/flops_counter.py#L346

Quantization Scaling:

mod_flops = module.__flops__*max(w_str[quant_idx], a_str[quant_idx])/32.0

Am I understanding this correctly? It looks like you're properly rounding the weights and activations prior to each linear operation during evaluation, but the additions in these kernels should be counted as FP32, unless I'm missing something.
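
Concretely, here is a rough sketch of the two counting conventions for a single linear layer (function names and structure are illustrative only, not the actual flops_counter.py code):

```python
# Illustrative sketch only -- not the actual flops_counter.py implementation.

def linear_op_counts(in_features, out_features):
    """Multiply/add counts for one Linear layer (bias ignored)."""
    mults = in_features * out_features
    adds = (in_features - 1) * out_features  # accumulation adds per output
    return mults, adds

def ops_as_submitted(mults, adds, w_bits, a_bits):
    # What the script appears to do: sum mults and adds, then scale both.
    return (mults + adds) * max(w_bits, a_bits) / 32.0

def ops_expected(mults, adds, w_bits, a_bits):
    # What we'd expect: only the mults get the reduced-precision factor;
    # the accumulations stay FP32.
    return mults * max(w_bits, a_bits) / 32.0 + adds
```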

Trevor

@yashbhalgat (Owner)

Hi Trevor,
Thanks for the comment! I tried my best to keep the code well organized and reproducible. :)

Your understanding of the code is correct. We scale both the addition and multiplication operations inside the kernel by the reduced-precision factor. This is because all the input activations and weights to the kernel are quantized to 8 bits, and hence all the operations in the kernel (both multiplication and addition) are done in 8 bits. The diagram below (taken from here) explains this better:
[diagram: quantized matrix multiplication, with the matrix operations in low precision and the output rescaled by the scaling factor(s)]

All the matrix operations (even the additions) are performed in low precision before the output is multiplied again by the scaling factor(s).

Thanks for all the hard work you are putting into reviewing the submissions. I highly appreciate it.

Yash

@micronet-challenge-submissions (Collaborator, Author)

Hi Yash,

Perhaps I'm missing something. AFAICT the paper doesn't mention doing reduced-precision accumulation. Given the variable bit-width results, they're likely doing "fake quantization" to evaluate the approach, in which case the additions are full width (32 bits, I assume).

If both the weights and activations to the kernel are rounded to 8-bit, then the multiplications can be counted as being performed in 8-bit. However, after the multiplications the results will be full 32-bit floating-point values, and all accumulation will be performed in 32-bit floating point.
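
For concreteness, here's a minimal sketch of fake quantization (assuming a simple symmetric per-tensor scheme for illustration; the submission's LSQ quantizer learns its step sizes, but the accumulation story is the same):

```python
import numpy as np

def fake_quantize(x, bits=8):
    """Round x onto a 2**bits-level grid, but keep float32 values."""
    qmax = 2 ** (bits - 1) - 1
    scale = np.abs(x).max() / qmax
    q = np.clip(np.round(x / scale), -qmax - 1, qmax)
    return (q * scale).astype(np.float32)

w = fake_quantize(np.random.randn(64, 128).astype(np.float32))
a = fake_quantize(np.random.randn(128).astype(np.float32))

# The operands are constrained to 8-bit grids, so the multiplies can be
# counted at 8 bits -- but the matmul (in particular the accumulation of
# the 128 products per output) runs entirely in float32.
y = w @ a
```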

Trevor

@yashbhalgat (Owner)

Hi Trevor,

Your analysis is correct. The accumulation is not happening in reduced precision, so I will have to count the additions as 32-bit. Sorry for the confusion.
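
Something along these lines should do it (a sketch only, using hypothetical per-module __mults__ and __adds__ counters in place of the single combined __flops__ the current code tracks):

```python
# Sketch of the fix (hypothetical counters): scale only the multiplies;
# charge the additions at full 32-bit precision.
quant_factor = max(w_str[quant_idx], a_str[quant_idx]) / 32.0
mod_ops = module.__mults__ * quant_factor + module.__adds__
```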

I assume this will change the scores of all my submissions. After a rough calculation, the score of this submission increases from 0.181 to 0.31. I will update my flops-counting code and the scores of the submissions by midnight tonight. Is that ok?

Thanks,
Yash

@micronet-challenge-submissions (Collaborator, Author)

Hi Yash,

Yes, that sounds good. Thanks for being so responsive!

Trevor

@yashbhalgat (Owner)

Hi Trevor,

I have fixed the bug in the flops-counting script in all my submissions, and I have updated the scores in the reports accordingly. The updated scores are as follows:

| Submission | Track | Score | Links |
| --- | --- | --- | --- |
| QualcommAI-EfficientNet | ImageNet track | 0.3789 | link |
| QualcommAI-MixNet | ImageNet track | 0.2968 | link |
| QualcommAI-nanoWRN | CIFAR100 track | 0.0720 | link |

If you remember, I made a submission much earlier named "QualcommAI-M0" which is currently on the leaderboard. Since the scores of my later submissions are much better, I would like to scrap the submission QualcommAI-M0. (Honestly, this is because I don't have the bandwidth currently to correct the score of that submission.)

Let me know if you have any more questions. :)
Yash

@micronet-challenge-submissions (Collaborator, Author)

Hi Yash,

Thanks so much for the updates! Everything about this submission checks out now.

With regard to your M0 entry, if you get a chance to update it in the next few days, we'd still be happy to review it (it's very little overhead, since we have three others from you as well). Just let us know!

Trevor

@yashbhalgat (Owner)

Hi Trevor,

Sure thing, I will update it by today midnight. You can check back on that submission (link) tomorrow morning. :)

@yashbhalgat (Owner)

Hi Trevor,
I have updated the QualcommAI-M0 (link) submission with the corrected score. The score for that submission is now 0.5488.

Thanks,
Yash
