Fast multiply-add #5

Open
ralphtandetzky opened this issue May 26, 2024 · 0 comments
Labels
enhancement Feature enhancement

Comments

@ralphtandetzky
Owner

Feature description
If the hardware platform supports fused multiply-add (FMA), it should be used.

Describe the solution you'd like
If possible, detect at compile time whether the target hardware supports fused multiply-add, and use it when it is available. Otherwise, fall back to a multiply-then-add implementation, so that targets without hardware FMA do not suffer a considerable slowdown.
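
The compile-time detection described above could be sketched as follows. This is only a sketch under assumptions: it assumes the project is written in Rust, the function name `fast_mul_add` is hypothetical, and the `fma` target feature name applies to x86/x86_64 targets only (other architectures would need their own check).

```rust
/// Computes `a * b + c`.
///
/// Hypothetical helper, not part of the project's API.
/// Uses a fused multiply-add only when the compile-time target is known
/// to support it; otherwise performs a separate multiply and add.
#[inline]
pub fn fast_mul_add(a: f64, b: f64, c: f64) -> f64 {
    // `cfg!` is evaluated at compile time, so only one branch survives
    // optimization. "fma" is the x86/x86_64 feature name (assumption:
    // other targets are treated as not supporting FMA here).
    if cfg!(target_feature = "fma") {
        // Single rounding; maps to a hardware instruction on this target.
        a.mul_add(b, c)
    } else {
        // Two roundings, but avoids a slow software-emulated FMA.
        a * b + c
    }
}
```

With this approach the fused path is only taken when the crate is built with the feature enabled, for example via `-C target-feature=+fma` or `-C target-cpu=native`.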

Describe alternatives you've considered
The problem with the above approach is that we'll get different results on different platforms: a fused multiply-add rounds once, whereas a multiply followed by an add rounds twice. The behavior is therefore implementation-defined, which can be a problem when reproducibility is important. Another approach would be to let the user switch between the alternatives by using compile features.
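
The user-controlled alternative might look like the sketch below. It assumes a Rust crate with a Cargo feature, here hypothetically named `fma`, declared in `Cargo.toml`; neither the feature name nor the function name `mul_add` comes from the project.

```rust
/// Computes `a * b + c` with a single rounding.
/// Compiled only when the (hypothetical) Cargo feature `fma` is enabled.
#[cfg(feature = "fma")]
#[inline]
pub fn mul_add(a: f64, b: f64, c: f64) -> f64 {
    a.mul_add(b, c)
}

/// Computes `a * b + c` with two roundings (multiply, then add).
/// This is the default, so results are the same on every target
/// unless the user opts in to the `fma` feature.
#[cfg(not(feature = "fma"))]
#[inline]
pub fn mul_add(a: f64, b: f64, c: f64) -> f64 {
    a * b + c
}
```

Because the choice is made by the user rather than by the target, a given build configuration produces the same results everywhere, at the cost of a possibly slow software-emulated FMA when the feature is enabled on hardware without FMA support.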

@ralphtandetzky added the enhancement (Feature enhancement) label May 26, 2024
@ralphtandetzky self-assigned this May 26, 2024
@ralphtandetzky removed their assignment Sep 13, 2024