Bug Report: CR and CG Solvers Fail with Float16 Precision #968

farhadrclass · 2025-02-19T17:07:49Z

Description

When solving a symmetric positive definite system using Krylov.jl with Float16 arithmetic, the Conjugate Residual (CR) and Conjugate Gradient (CG) methods exhibit numerical issues. In our tests, the CR solver returns a vector of NaN values after one iteration, while the CG solver completes after five iterations but reports "Inf" in intermediate quantities and a status of "zero curvature detected." This behavior suggests that the solvers may not handle the low precision of Float16 correctly.
When the linesearch is true.

However my main concern is that the behavior should be similar and CR it returns NAN

Steps to Reproduce

Run the following code in a Julia session, if you switch to Float64 it works fine:

# Test CR and CG with Float16
T = Float16 
ϵ = eps(T)^T(1 / 4)
rtol = ϵ 
function symmetric_definite(n::Int = 10; FC = Float64)
  α = FC <: Complex ? FC(im) : one(FC)
  A = spdiagm(-1 => α * ones(FC, n-1), 0 => 4 * ones(FC, n), 1 => conj(α) * ones(FC, n-1))
  b = A * FC[1:n;]
  return A, b
end

# Create a symmetric, positive definite system using Float16
A, b = symmetric_definite(FC = T)

# Define tolerances (ensure that rtol and ϵ are defined appropriately for Float16)
subtol = max(rtol, min(T(0.1), √T(1.3e+01), T(0.9) * one(T)))

# Test the CR solver
(x, stats) = cr(A, b, atol = ϵ, rtol = subtol, verbose = 1, linesearch = true)
println("CR result:")
println(x, stats)

# Test the CG solver
(x, stats_cg) = cg(A, b, atol = ϵ, rtol = subtol, verbose = 1, linesearch = true)
println("CG result:")
println(x, stats_cg)

Observed Output

CR Output:


CR: system of 10 equations in 10 variables
    k       ‖x‖       ‖r‖      quad  timer
    0   0.0e+00   1.1e+02   0.0e+00  0.00s
    1       NaN       NaN       NaN  0.00s
CR result:
Float16[NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN]SimpleStats
 niter: 1
 solved: false
 inconsistent: false
 residuals: []
 Aresiduals: []
 κ₂(A): []
 timer: 1.13ms
 status: solver encountered numerical issues

CG Output:

CG result:
Float16[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]SimpleStats
 niter: 5
 solved: true
 inconsistent: false
 residuals: []
 Aresiduals: []
 κ₂(A): []
 timer: 7.04ms
 status: zero curvature detected

Expected Behavior

For a symmetric positive definite system, both the CR and CG methods should converge to a valid solution (with finite values) within the specified tolerances. Although Float16 has limited precision, one would still expect a meaningful solution rather than immediate NaN or Inf results.

Environment

Krylov.jl Version: (e.g., v0.9.10)
Julia Version: (e.g., 1.10+)
Operating System: Windows
Data Type: Float16

The text was updated successfully, but these errors were encountered:

farhadrclass · 2025-02-19T17:08:41Z

@amontoison and @dpo , one solution I have is to make sure CR behave similar to CG in float16

Also we need to add low floating point to tests

farhadrclass · 2025-02-19T18:11:26Z

The issue is that p and hence pAp would be Inf16 here

Krylov.jl/src/cr.jl

Line 169 in f411451

ρ = kdotr(n, r, Ar)

r= Float16[6.0, 12.0, 18.0, 24.0, 30.0, 36.0, 42.0, 48.0, 54.0, 49.0]
Ar=Float16[36.0, 72.0, 108.0, 144.0, 180.0, 216.0, 252.0, 288.0, 313.0, 250.0]
p = Krylov.kdotr(10, r, Ar)

would be
Inf16

dpo · 2025-02-20T04:41:00Z

And that seems to be the correct answer as dot overflows here.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Bug Report: CR and CG Solvers Fail with Float16 Precision #968

Bug Report: CR and CG Solvers Fail with Float16 Precision #968

farhadrclass commented Feb 19, 2025

farhadrclass commented Feb 19, 2025

farhadrclass commented Feb 19, 2025

dpo commented Feb 20, 2025

Bug Report: CR and CG Solvers Fail with Float16 Precision #968

Bug Report: CR and CG Solvers Fail with Float16 Precision #968

Comments

farhadrclass commented Feb 19, 2025

Description

Steps to Reproduce

Observed Output

Expected Behavior

Environment

farhadrclass commented Feb 19, 2025

farhadrclass commented Feb 19, 2025

dpo commented Feb 20, 2025