How to characterize the uncertainties of the best solutions? #283

Open
Landau1908 opened this issue Nov 20, 2024 · 5 comments

@Landau1908

Hi,
Does the std returned from `es.result` correspond to the 1-sigma error of the best solution? What is the idea/algorithm for extracting the uncertainties?
Regards

@nikohansen
Contributor

nikohansen commented Nov 20, 2024

I can't say off the top of my head; it is likely to depend on the population size too. If need be, I would run a bunch of experiments on a small set of functions to find (a model for) the relationship between, for example, `es.stds` and the error interval in which to expect the global optimum.

Specifically, even better (EDIT: or maybe not), we probably want to find the increasing function
$\alpha\mapsto P(\| m - x^* \|_{C^{-1}} \le \alpha)$ depending on dimension and population size, where `es.mahalanobis_norm(es.mean - xopt)` computes $\| m - x^* \|_{C^{-1}}$ (encompassing the step-size too). To achieve this, as a first step, I would track $\| m - x^* \|_{C^{-1}}$ over time and look at the graphs. If the distribution is stationary, we can determine the empirical CDF of this distance which is, if I am not mistaken, a consistent estimator of the above function. This, however, doesn't immediately give us an error interval on each parameter. For this, we would be interested in the distribution of `np.abs(es.mean - xopt) / es.stds`.
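
The tracking and empirical-CDF steps described above could be sketched roughly as follows. This is only an illustration under assumptions: `track_distances`, `empirical_cdf` and the `burn_in` parameter are hypothetical names, `es` is assumed to be a `cma.CMAEvolutionStrategy` instance, and `xopt` must be known in advance, so this only applies to test functions.

```python
from bisect import bisect_right


def track_distances(es, objective, xopt, burn_in=0):
    """Record ||m - x*||_{C^{-1}} after each iteration of an ask/tell loop.

    Assumptions: `es` behaves like a cma.CMAEvolutionStrategy, and `xopt`
    is the known optimum as a NumPy-compatible array.
    `burn_in` (hypothetical) skips early iterations, which are unlikely to
    belong to a stationary distribution.
    """
    distances = []
    iteration = 0
    while not es.stop():
        solutions = es.ask()
        es.tell(solutions, [objective(x) for x in solutions])
        iteration += 1
        if iteration > burn_in:
            distances.append(float(es.mahalanobis_norm(es.mean - xopt)))
    return distances


def empirical_cdf(samples):
    """Return the function alpha -> fraction of samples <= alpha."""
    ordered = sorted(samples)
    if not ordered:
        raise ValueError("need at least one sample")

    def cdf(alpha):
        # bisect_right counts how many sorted samples are <= alpha
        return bisect_right(ordered, alpha) / len(ordered)

    return cdf
```

The empirical CDF is only meaningful as an estimate of the function above if the recorded distances look stationary, so plotting them over time first remains the sensible first step.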

EDIT: Some quick and dirty experiments suggest that in the ideal scenario (ellipsoid function) `es.sp.weights.mueff**0.5 * np.abs(es.mean - xopt) / es.stds` is very likely below five, that is, $|m_i - x^*_i| < \sigma_i \frac{5}{ \sqrt{\mu_\mathrm{eff}}}$.
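
As a small illustration of this check, the scaled per-coordinate error can be computed without any library. The helper names `scaled_errors` and `within_observed_bound` are hypothetical, and the threshold of five is the empirical observation quoted above, not a guarantee.

```python
import math


def scaled_errors(mean, xopt, stds, mueff):
    """Return sqrt(mueff) * |m_i - x*_i| / sigma_i for each coordinate."""
    if not (len(mean) == len(xopt) == len(stds)):
        raise ValueError("mean, xopt and stds must have the same length")
    root = math.sqrt(mueff)
    return [root * abs(m - x) / s for m, x, s in zip(mean, xopt, stds)]


def within_observed_bound(mean, xopt, stds, mueff, threshold=5.0):
    """True if every scaled error is below the empirically observed threshold."""
    return all(e < threshold for e in scaled_errors(mean, xopt, stds, mueff))
```

With a `cma.CMAEvolutionStrategy` instance `es`, the inputs would presumably be `es.mean`, `es.stds` and `es.sp.weights.mueff`, as in the expression above.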

@nikohansen
Contributor

nikohansen commented Nov 21, 2024

In the stationary distribution with ellipsoidal level sets, for dimension $n$ and $\mu_\text{eff}$ in the range between 2 and 100, I seem to observe that

$\displaystyle \forall \beta\ge0: P\left(|m_i - x_i^*| > (1+\beta)\alpha\,\sigma_i \right) < 20^{-(\beta + 0.7)} < 20^{-\beta} < 10^{-\beta}$

with $\alpha = \sqrt{\frac{n^{2/3}}{{\mu_\text{eff}}}}$.

Specifically (for $\beta=1, 4$), we get $P\left(|m_i - x_i^*| > 2\alpha\,\sigma_i \right) < 10^{-2}$ and $P\left(|m_i - x_i^*| > 5\alpha\,\sigma_i \right) < 10^{-6}$. However in practice, we can hardly ever have a certainty of $1 - 10^{-6}$ to be in this stationary distribution.
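
To make the arithmetic concrete, here is a minimal sketch that evaluates $\alpha$, the tail bound $20^{-(\beta+0.7)}$, and the resulting half-widths $(1+\beta)\alpha\,\sigma_i$. All function names are hypothetical, and the numbers are only as trustworthy as the empirical observation above, which presumes the stationary distribution with ellipsoidal level sets. `beta_for_tail` simply inverts the bound, for instance to find the $\beta$ matching a 99.7% coverage target.

```python
import math


def alpha(n, mueff):
    """alpha = sqrt(n^(2/3) / mueff) from the observed relation."""
    return math.sqrt(n ** (2.0 / 3.0) / mueff)


def tail_bound(beta):
    """Observed upper bound on P(|m_i - x*_i| > (1 + beta) * alpha * sigma_i)."""
    if beta < 0:
        raise ValueError("the relation is stated for beta >= 0 only")
    return 20.0 ** (-(beta + 0.7))


def beta_for_tail(p):
    """Smallest beta >= 0 whose tail bound does not exceed p (0 < p < 1)."""
    return max(0.0, math.log(1.0 / p) / math.log(20.0) - 0.7)


def half_widths(stds, n, mueff, beta):
    """Per-coordinate half-widths (1 + beta) * alpha * sigma_i."""
    a = alpha(n, mueff)
    return [(1.0 + beta) * a * s for s in stds]
```

For example, `beta_for_tail(0.003)` is roughly 1.24, so under these assumptions an interval of about $m_i \pm 2.24\,\alpha\,\sigma_i$ would correspond to the bound $0.003$ on the error probability.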

@Landau1908
Author

> with $\alpha = \sqrt{\frac{n^{2/3}}{{\mu_\text{eff}}}}$.
>
> Specifically (for $\beta=1, 4$), we get $P\left(|m_i - x_i^*| > 2\alpha\,\sigma_i \right) < 10^{-2}$ and $P\left(|m_i - x_i^*| > 5\alpha\,\sigma_i \right) < 10^{-6}$. However in practice, we can hardly ever have a certainty of $1 - 10^{-6}$ to be in this stationary distribution.

What is the relation between the CMA ellipsoid distribution and the standard Gaussian distribution? For a Gaussian distribution, values lie within $\pm 3\sigma$ with probability $P(-3\sigma < x-\mu < 3\sigma) > 99.7\%$, which gives the standard uncertainty representation $x=\mu \pm 3\sigma$. How can this be done for CMA?

@nikohansen
Contributor

> > with $\alpha = \sqrt{\frac{n^{2/3}}{{\mu_\text{eff}}}}$.
> > Specifically (for $\beta=1, 4$), we get $P\left(|m_i - x_i^*| > 2\alpha\,\sigma_i \right) < 10^{-2}$ and $P\left(|m_i - x_i^*| > 5\alpha\,\sigma_i \right) < 10^{-6}$. However in practice, we can hardly ever have a certainty of $1 - 10^{-6}$ to be in this stationary distribution.
>
> What is the relation between the CMA ellipsoid distribution and the standard Gaussian distribution? For a Gaussian distribution, values lie within $\pm 3\sigma$ with probability $P(-3\sigma < x-\mu < 3\sigma) > 99.7\%$, which gives the standard uncertainty representation $x=\mu \pm 3\sigma$. How can this be done for CMA?

The question seems to assume that the distance to the optimum follows this (or a Gaussian) distribution, which is very unlikely to be the case.

@nikohansen
Contributor

nikohansen commented Dec 5, 2024

These are the data from the final stationary distribution of $|m_i - x^*_i| / \sigma_i$ for the sphere function $x\mapsto ||x||^2$ from which the above claim was derived.
[Figure: final-confidence-interval-error-probability-sphere2]

For the rotated Ellipsoid function, the graphs look similar when the dimension is small enough (the dependency on $n$ may be slightly different). In higher dimensions, termination is triggered before the stationary distribution is reached, and I observed distances up to 100 times larger (I suspect that this factor could be highly sensitive to subtleties in the experimental conditions).
