
Arithmetic optimizations #550

Open
jibril-b-coulibaly opened this issue Feb 23, 2025 · 3 comments

@jibril-b-coulibaly

Hello,

I am developing a new module and taking inspiration from the existing code.
I noticed many instances of floating-point operations that I find unusual in terms of readability, code duplication, or performance, e.g.:

  1. square roots using pow(x, 0.5) and cube roots using pow(x, 1. / 3.) instead of sqrt(x) and cbrt(x)

    double static_scaling_factor = std::pow(expected_generalized_mass / M_rr(0, 0), 0.5);

  2. powers using pow(x, N) instead of x*x*...

    double cos_alpha = std::cos(alpha);
    double sin_alpha = std::sin(alpha);
    double a11 = E * A;
    double a22 = E * (Iyy * std::pow(cos_alpha, 2.) + Izz * std::pow(sin_alpha, 2.) + Cz * Cz * A);
    double a33 = E * (Izz * std::pow(cos_alpha, 2.) + Iyy * std::pow(sin_alpha, 2.) + Cy * Cy * A);

  3. divisions

    double GetEquivalentMeanHydrostatic() const { return (this->GetInvariant_I1() / 3.); }

  • For 1, I proposed arithmetic modifications for readability and performance #549

  • For 2, although many modern compilers can eliminate repeated sub-expressions or optimize std::pow for integer exponents, I found GCC to be much slower for an operation like std::pow(x, 3) compared to x*x*x, even with flag -O3 (e.g., in my Ubuntu Release build of Chrono with GCC 13.3.0, and in tests on godbolt that I could share if there is interest). Only tests with flag -ffast-math, which allows non-compliant optimizations, showed a negligible difference.

    • What are the motivations behind the heavy use of pow throughout the code base? Have benchmarks ruled out possible performance hits compared to bare multiplications?
  • For 3, compilers likely won't replace a division with a faster multiplication by the reciprocal for divisors that are not powers of 2 (for the same standard-compliance reasons discussed above).

    • Is there a desire to replace expressions such as x / 3. with multiplications by constants that can be evaluated at compile time, e.g., x * (1. / 3.) (I saw plenty of those throughout the code)? If so, would defining such constant expressions (e.g., CH_1_3, similar to CH_PI in ChConstants.h) be helpful for readability, and would a systematic replacement of x / 3. by x * CH_1_3 be preferred?

I appreciate your perspective and insight.
I would be more than happy to take on these tasks.

Jibril

@rserban
Member

rserban commented Feb 24, 2025

Jibril, a couple of comments:

  1. using sqrt and cbrt where applicable is certainly preferable, both for performance and readability.
  2. yes, those could/should be replaced (especially for squaring). For higher powers, it's a question of readability and preference. I personally tend to stay away from using pow, except in initialization functions that are called once or very seldom. Some of us are more used to using pow and/or don't immediately think about performance (of course, there's also the danger of premature optimization).
  3. this also could be implemented, but:
    • it could quickly get out of hand if we decide to do this for a ton of cases (1/3 is probably common enough to justify it)
    • we have code that can be configured in single or double precision; that may require two constants for each value.

I hope this makes sense. If so, it'd be great if you added some of 2 and 3 to your PR. One issue with this kind of change is that we always have many balls in the air in various feature branches (some of them quite long-lived), so there is always the risk of merge conflicts. But that's something we can deal with.

@rserban
Member

rserban commented Feb 24, 2025

On a different note: you mention working on a new module. It'd be great if you could give us a few more details (we can take this offline if needed). The reasons I ask are:

  • we should make sure this is not something someone is already working on or something we plan on doing.
  • with any new module, there's the question of long-term maintenance and support.

@jibril-b-coulibaly
Author

@rserban thank you for the prompt and informative response.

With regards to the module, I recently joined a team that works on lattice methods for mesoscopic materials modeling (e.g., LDPM for concrete: https://github.com/Concrete-Chrono-Development/chrono-concrete).
Most of the developments are similar to fea capabilities (elements and local constitutive material models) and might in principle be integrated as extensions of fea.
However, because they form part of a whole and might not make sense as standalone elements for general users, they are currently being developed inside an isolated submodule. I believe there is a desire to eventually merge that code into Chrono, at which point we might consider reorganizing our code if you think some capabilities would be worth having in core.

With regards to 3, I see that BULLET (cbtScalar) and FSI (Real) have float/double type definitions conditional on the build configuration; did I miss any others?
In these cases, the code uses explicit conversions from other types (are there Chrono rules against implicit conversions? In the cases below, relying on implicit conversion might improve readability):

cbtScalar lx = cbtScalar(2.) * (halfExtents.x());

m_paramsH->d0 = Real(0.01);

which makes me think adding double constants should cover most cases.

I will go ahead and add some of these changes to #549.

Thank you
