
Arithmetic optimizations #550

Open
jibril-b-coulibaly opened this issue Feb 23, 2025 · 3 comments

@jibril-b-coulibaly

Hello,

I am developing a new module and taking inspiration from the existing code.
I noticed many instances of floating-point operations that I find unusual in terms of readability, code duplication, or performance, e.g.:

  1. square roots using pow(x, 0.5) and cube roots using pow(x, 1. / 3.) instead of sqrt(x) and cbrt(x)

    double static_scaling_factor = std::pow(expected_generalized_mass / M_rr(0, 0), 0.5);

  2. powers using pow(x, N) instead of x*x*...

    double cos_alpha = std::cos(alpha);
    double sin_alpha = std::sin(alpha);
    double a11 = E * A;
    double a22 = E * (Iyy * std::pow(cos_alpha, 2.) + Izz * std::pow(sin_alpha, 2.) + Cz * Cz * A);
    double a33 = E * (Izz * std::pow(cos_alpha, 2.) + Iyy * std::pow(sin_alpha, 2.) + Cy * Cy * A);

  3. divisions

    double GetEquivalentMeanHydrostatic() const { return (this->GetInvariant_I1() / 3.); }

  • For 1, I proposed arithmetic modifications for readability and performance #549

  • For 2, although many modern compilers can eliminate repeated sub-expressions or optimize std::pow for integer exponents, I found GCC to be much slower for an operation like std::pow(x, 3) compared to x*x*x, even with flag -O3 (e.g., in my Ubuntu Release build of Chrono with GCC 13.3.0, and in tests on godbolt that I could share if there is interest). Only tests with flag -ffast-math, which allows non-compliant optimizations, showed a negligible difference.

    • What are the motivations behind the heavy use of pow throughout the code base? Have benchmarks ruled out possible performance hits compared to bare multiplications?
  • For 3, compilers likely won't replace a division with a faster multiplication by the reciprocal for divisors that are not powers of 2 (for the same standard-compliance reasons discussed above).

    • Is there a desire to replace expressions such as x / 3. with multiplications by constants that can be evaluated at compile time, e.g., x * (1. / 3.) (I saw plenty of those throughout the code)? If so, would defining such constant expressions (e.g., CH_1_3, similar to CH_PI in ChConstants.h) be helpful for readability, and would a systematic replacement of x / 3. by x * CH_1_3 be preferred?

I appreciate your perspective and insight.
I would be more than happy to take on these tasks.

Jibril

@rserban
Member

rserban commented Feb 24, 2025

Jibril, a couple of comments:

  1. using sqrt and cbrt where applicable is certainly preferable, both for performance and readability.
  2. yes, those could/should be replaced (especially for squaring). For higher powers, it's a question of readability and preference. I personally tend to stay away from using pow, except in initialization functions that are called once or very seldom. Some of us are more used to using pow and/or don't immediately think about performance (of course, there's also the danger of premature optimization).
  3. this also could be implemented, but:
    • it could quickly get out of hand if we decide to do this for a ton of cases (1/3 is probably common enough to justify it)
    • we have code that can be configured in single or double precision; that may require two constants for each value.

I hope this makes sense. If so, it'd be great if you added some of 2 and 3 to your PR. One issue with this kind of change is that we always have many balls in the air in various feature branches (some of them quite long-lived), so there is always the risk of merge conflicts. But that's something we can deal with.

@rserban
Member

rserban commented Feb 24, 2025

On a different note: you mention working on a new module. It'd be great if you could give us a few more details (we can take this offline if needed). The reasons I ask are:

  • we should make sure this is not something someone is already working on or something we plan on doing.
  • with any new module, there's the question of long-term maintenance and support.

@jibril-b-coulibaly
Author

@rserban thank you for the prompt and informative response.

With regards to the module, I recently joined a team that works on lattice methods for mesoscopic materials modeling (e.g., LDPM for concrete: https://github.com/Concrete-Chrono-Development/chrono-concrete).
Most of the developments are similar to fea capabilities (elements and local constitutive material models) and might in principle be integrated as extensions of fea.
However, because they form part of a whole and might not make sense as standalone elements for general users, they are currently being developed inside an isolated submodule. I believe there is a desire to eventually merge that code into Chrono, at which point we might consider reorganizing our code if you think some capabilities would be worth having in core.

With regards to 3, I see that BULLET (cbtScalar) and FSI (Real) have float/double type definitions conditional on the build configuration; did I miss any others?
In these cases, the code uses explicit conversions from other types (are there Chrono rules against implicit conversions? In the cases below, relying on implicit conversion might improve readability):

cbtScalar lx = cbtScalar(2.) * (halfExtents.x());

m_paramsH->d0 = Real(0.01);

which makes me think adding double constants should cover most cases.

I will go ahead and add some of these changes to #549.

Thank you
