Fixing NaN sampling for von Mises Fisher distribution #1918

williamjsdavis · 2024-11-20T05:48:31Z

Summary of contribution

I have made a change to the sampling algorithm of von Mises Fisher distributions to avoid NaN outputs.

The problem

Sampling from certain von Mises Fisher distributions will result in NaN. For example:

using Distributions
using LinearAlgebra: norm

κ = 1.0
μ = [1.0,1e-8]
μ = μ ./ norm(μ)
d = VonMisesFisher(μ, κ)
rand(d)

...returns:

2-element Vector{Float64}:
 NaN
 NaN

The error happens because of operations performed in the function _vmf_householder_vec:

Distributions.jl/src/samplers/vonmisesfisher.jl

Lines 87 to 102 in 3de6038

    
           function _vmf_householder_vec(μ::Vector{Float64}) 
        
               # assuming μ is a unit-vector (which it should be) 
        
               #  can compute v in a single pass over μ 
        
               p = length(μ) 
        
               v = similar(μ) 
        
               v[1] = μ[1] - 1.0 
        
               s = sqrt(-2*v[1]) 
        
               v[1] /= s 
        
               @inbounds for i in 2:p 
        
                   v[i] = μ[i] / s 
        
               end 
        
               return v 
        
           end

The problem arises when μ is a vector exactly in the direction of the first dimension, or very close to the first dimension. (For example, μ = [1.0, 0.0], μ = [1.0, 0.0, 0.0], or μ = [1.0, 0.0, 1e-8], etc...) In these cases, variable s will become zero, leading vector v to become filled with NaNs.

This issue was previously raised in #1423

Proposed solution

I propose a solution, where I have added a small epsilon value to the denominators in _vmf_householder_vec. This approach is common in machine learning algorithms like ADAM, where dividing by zero is avoided during accumulation:

using Distributions
using LinearAlgebra: norm

κ = 1.0
μ = [1.0,1e-8]
μ = μ ./ norm(μ)
d = VonMisesFisher(μ, κ)
rand(d)

...returns:

2-element Vector{Float64}:
 0.8976253988301388
 0.4407591670913201

... which are not NaNs!

Discussion

I have the following discussion topics:

I'm open to feedback regarding whether this PR is going in the right direction. Is a change here desired? Is my proposed solution the right way forward?
If so, what additional contributions would be needed? For example, I'd assume that we'd also like to add some tests to verify my change is not causing unintended changes, and to check for future NaN outputs.

As always, feedback is appreciated

williamjsdavis · 2025-02-05T06:02:10Z

As discussed in #1930, the numerical approach workaround presented in this PR is not ideal.

For posterity sake, the reason why this PR does not pass the tests introduced by @devmotion is because the epsilon value added to the denominator affects the vectors quite far from the singularity point. This causes a lot of cases where the norm of the vector is no longer close to 1.

  Expression: norm(X[:, i]) ≈ 1.0
   Evaluated: 0.9999998814346462 ≈ 1.0

The same result can be illustrated as follows:

Here I am plotting the 2 components of von-Mieses Householder vector as a function of t for for μ=[x(t),y(t)]. The original implementation is shown in solid colors; my attempted implementation is in dashed. The NaN values of the original implementation are shown as transparent vertical lines. As one can see, my version of v2 is quite less than ±1 away from t=0.

The same plot showing @devmotion's implementation looks like this:

devmotion mentioned this pull request Dec 16, 2024

Fix von Mises-Fisher sampler #1930

Open

Add epsilon to VonMisesFisher Householder summation

2efd649

williamjsdavis force-pushed the von-mieses-fisher-fix branch from ce775d2 to 2efd649 Compare February 5, 2025 04:45

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fixing NaN sampling for von Mises Fisher distribution #1918

Fixing NaN sampling for von Mises Fisher distribution #1918

williamjsdavis commented Nov 20, 2024

williamjsdavis commented Feb 5, 2025

	function _vmf_householder_vec(μ::Vector{Float64})
	# assuming μ is a unit-vector (which it should be)
	# can compute v in a single pass over μ

	p = length(μ)
	v = similar(μ)
	v[1] = μ[1] - 1.0
	s = sqrt(-2*v[1])
	v[1] /= s

	@inbounds for i in 2:p
	v[i] = μ[i] / s
	end

	return v
	end

Fixing NaN sampling for von Mises Fisher distribution #1918

Are you sure you want to change the base?

Fixing NaN sampling for von Mises Fisher distribution #1918

Conversation

williamjsdavis commented Nov 20, 2024

Summary of contribution

The problem

Proposed solution

Discussion

williamjsdavis commented Feb 5, 2025