SIMD back-end for IBM Power and Z #195
Comments
It'd be interesting to hear whether other users are interested in this. Feel free to use the gemmlowp Google group to reach more people.
Thank you @bjacob for the reply. Our interest is mainly driven by the use of gemmlowp for quantized models in TensorFlow Lite. We want fast inference on Power and Z CPUs. It was an interesting exercise to code up matrix multiplication in Power and Z vector intrinsics, and it also helps us evaluate our SIMD instruction sets. We would like to share our work with the larger community. Pushing it back into the mainline gemmlowp source would spare us from maintaining a dedicated TF Lite build. Is it on Google's roadmap to drop gemmlowp and adopt "ruy" for TF Lite? If so, we might want to look at it and consider whether to port our work there. I am curious to hear more from others.
TFLite has already switched to ruy on arm64. There is work underway on arm32 and x86. However, given the complex landscape of inference backends at the moment, it's hard to guess what TFLite will end up using. Over the next few months these things should settle a bit.
This is not really an issue. I'd like to know whether there is interest in incorporating code to support IBM's Power and Z architectures as a back-end. A colleague and I have worked on this in-house, and we have extensions ready that let gemmlowp run optimized on Power and Z, selected by compiler flags when these architectures are detected. In principle this does not touch or disrupt any of the existing code.
Please comment on this issue and advise on how best to proceed.