Can theseus return negative cost? #501

Open
Jeff09 opened this issue Apr 27, 2023 · 13 comments

Comments

@Jeff09

Jeff09 commented Apr 27, 2023

❓ Questions and Help

Hi Theseus team,

I'm currently using Theseus to solve a nonlinear optimization problem. In my case, some cost functions return a negative cost. However, the objective squares each cost by default, which turns the negative gradient direction into a positive one. Is there any way to handle this case?

Thank you so much.

@luisenp
Contributor

luisenp commented Apr 28, 2023

Hi @Jeff09. Most of our optimization methods assume a nonlinear least squares formulation, and won't work with a different cost function aggregation metric. That being said, you can try our differentiable CEM solver, which supports other error metrics. You can change the default sum of squares by passing a different value here.

Let me know if this helps.
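
To make the difference between the two aggregation metrics concrete, here is a minimal sketch in plain Python. It makes no Theseus calls; the residual `r`, the two metric functions, and the search grid are all made up for illustration.

```python
# Hypothetical scalar residual that is negative for x < 3.
def r(x):
    return x - 3.0

def sum_of_squares(x):
    # Default least-squares aggregation: the sign of r(x) is lost.
    return r(x) ** 2

def signed_sum(x):
    # Plain sum of the error vector (a single entry here): the sign is kept.
    return r(x)

grid = [i * 0.5 for i in range(-20, 21)]  # x in [-10, 10]

best_sq = min(grid, key=sum_of_squares)
best_signed = min(grid, key=signed_sum)

print(best_sq)      # minimizer of the squared metric: where r(x) == 0
print(best_signed)  # minimizer of the signed metric: the lower edge of the grid
```

Under the squared metric the best point is where the residual is zero, while under the signed metric the search simply runs to the edge of the grid, because this particular residual is not bounded below.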

@Jeff09
Author

Jeff09 commented Apr 29, 2023

Hi @luisenp , Thank you for your advice.

I changed the default sum of squares to a plain sum of the error vector, and with that sum function I can get a negative cost. However, the optimizer does not seem to work as it should: it increases the cost instead of decreasing it. By the way, I'm using the LevenbergMarquardt optimizer. Here's an example of the error.
This is the error printed at the 0th iteration.
image

The error after the 1st iteration is the following.
image
image

I have two questions regarding this.

  1. Why is the optimizer not lowering the cost?
  2. Isn't the printed error the sum of the error vector when the cost weight is set to 1?

Thank you so much.

@luisenp
Contributor

luisenp commented Apr 29, 2023

As I mentioned above, these optimizers are meant for nonlinear sum of squares problems. The only optimizer in our library that can handle other types of metrics is DCEM.

BTW, you should have gotten this error when you tried to use LevenbergMarquardt. Did this not happen?
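
One plausible reading of the increasing cost reported above is sketched below. This is a textbook damped Gauss-Newton update in one dimension, not Theseus's LevenbergMarquardt implementation; the residual, the damping value, and the starting point are assumptions chosen for illustration. The step is derived for the sum of squares, so it drives the residual toward zero even when a signed metric would prefer it to stay negative.

```python
def residual(x):
    # Hypothetical residual, negative for x < 3.
    return x - 3.0

def jacobian(x):
    return 1.0  # d residual / dx

def lm_step(x, damping=0.1):
    # One damped Gauss-Newton (Levenberg-Marquardt style) update in 1D:
    # delta = -(J^T J + damping)^-1 J^T r
    j = jacobian(x)
    r = residual(x)
    return x - (j * r) / (j * j + damping)

x0 = -5.0
x1 = lm_step(x0)

squared_before, squared_after = residual(x0) ** 2, residual(x1) ** 2
signed_before, signed_after = residual(x0), residual(x1)

print(squared_before, squared_after)  # squared error goes down
print(signed_before, signed_after)    # signed error goes up, toward zero
```

If the printed error is the signed sum while the update still minimizes the squared residual, the reported value rising from one iteration to the next is exactly what this sketch produces.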

@Jeff09
Author

Jeff09 commented Apr 29, 2023

No error was raised when I tried LevenbergMarquardt.

I have also tried DCEM, and it looks like no optimization happens. Only the 0th iteration error is printed, with no further iterations, even though I have set verbose=True.

Here's the code I used with DCEM.

th.TheseusLayer(
    optimizer=th.DCEM(
        self.objective,
        linear_solver_cls=th.CholmodSparseSolver,
        vectorize=False,
        max_iterations=max_iter,
        step_size=step_size,
    ),
)

@luisenp
Contributor

luisenp commented May 1, 2023

Are you on the latest version of Theseus? You are not getting any output at all? Also, note that the keywords for DCEM are different:

image

Can you share a short example that reproduces this behavior?

@Jeff09
Author

Jeff09 commented May 2, 2023

@luisenp Here's the output using DCEM.
image

It only runs 2 optimization steps when I change the error_metric_fn from the default metric to error_sum_fn in the th.Objective.

@luisenp
Contributor

luisenp commented May 3, 2023

Ah, there was a bug in DCEM affecting negative costs. Fixed in #510. Can you check whether it works for your use case now with this fix? Thanks for reporting this!

@Jeff09
Author

Jeff09 commented May 4, 2023

Hi @luisenp, thanks for the quick fix. It works now and makes some progress after 50 iterations. However, it does not look easy to find a good solution, even after I tried different hyperparameters. Here's a sample of the error.
image

It looks like the error variance is too large for it to converge. Could you give me some advice on how to use DCEM?

@luisenp
Contributor

luisenp commented May 4, 2023

Are your cost functions bounded below?

@mhmukadam
Contributor

Perhaps @dishank-b @bamos can also provide some tips for using DCEM.

@dishank-b
Contributor

@Jeff09 the error still seems to be going down, so can you try a higher max_iterations? You can try a higher n_elite as well. Also make sure init_sigma is big enough that the required solution is within 2*sigma of the initial values of the variables.
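
If you have even a rough guess of where the solution lies, the init_sigma tip can be turned into a quick sanity check. The helper below is hypothetical (it is not part of Theseus), and the initial values and guessed solution are made-up numbers.

```python
def covered_by_init_sigma(init_values, guessed_solution, init_sigma, n_sigmas=2.0):
    # True if every guessed coordinate lies within n_sigmas * init_sigma
    # of the corresponding initial value.
    return all(
        abs(g - x0) <= n_sigmas * init_sigma
        for x0, g in zip(init_values, guessed_solution)
    )

init_values = [0.0, 1.0]
guessed_solution = [4.0, -7.5]

narrow = covered_by_init_sigma(init_values, guessed_solution, init_sigma=1.0)
wide = covered_by_init_sigma(init_values, guessed_solution, init_sigma=5.0)
print(narrow, wide)
```

When the check fails, the initial sampling distribution rarely proposes candidates near the guessed solution, so early iterations have little to work with.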

@Jeff09
Author

Jeff09 commented May 5, 2023

Thank you for your tips using DCEM.

I have increased max_iterations from 50 to 100 and am also using n_elite=10 and init_sigma=5.0. However, the error does not look like it is moving in the right direction. In the first few iterations the error increases a lot and then decreases. By the end, there does not seem to be much improvement. When moving on to the next point, the 0th iteration has far too much error.
image

@luisenp
Contributor

luisenp commented May 5, 2023

@Jeff09 The error seems to be converging, even if slowly. DCEM is a randomized method, so it's not surprising that the error can increase between iterations, but by the 100th iteration your image shows that it's definitely much lower than the initial error. You should play with the hyperparameters a bit to see what works best for your application. Lowering n_elite might help it converge faster, at the cost of potentially worse solution quality. (@dishank-b is this correct?) If you are not concerned about run time, maybe you should also increase n_sample, especially if your problem has a lot of optimization variables.
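
For experimenting with how n_sample, n_elite, and init_sigma interact, a toy version of the method can help. The sketch below is a plain, non-differentiable cross-entropy method for a single scalar variable, written with the standard library only. It is not Theseus's DCEM; the function `cem_minimize`, its `seed` argument, the sigma floor, and the cost are assumptions for illustration.

```python
import random
import statistics

def cem_minimize(cost, init_mu, init_sigma, n_sample, n_elite, max_iterations, seed=0):
    # Plain cross-entropy method for a scalar variable.
    rng = random.Random(seed)
    mu, sigma = init_mu, init_sigma
    history = []
    for _ in range(max_iterations):
        # Sample candidates around the current estimate.
        samples = [rng.gauss(mu, sigma) for _ in range(n_sample)]
        # Keep the n_elite candidates with the lowest cost.
        elites = sorted(samples, key=cost)[:n_elite]
        # Refit the sampling distribution to the elites.
        mu = statistics.fmean(elites)
        # Small floor keeps sigma from collapsing to exactly zero.
        sigma = max(statistics.pstdev(elites), 1e-6)
        history.append(cost(mu))
    return mu, history

def cost(x):
    # Hypothetical cost that can be negative but is bounded below
    # (minimum of -4 at x = 2).
    return (x - 2.0) ** 2 - 4.0

mu, history = cem_minimize(cost, init_mu=0.0, init_sigma=5.0,
                           n_sample=100, n_elite=10, max_iterations=30)
print(mu, history[0], history[-1])
```

Because each iteration draws fresh samples, the cost of the current mean is not guaranteed to decrease monotonically. Smaller n_elite shrinks sigma faster, and larger n_sample makes each refit less noisy at the price of more cost evaluations.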

Now, regarding the comment of going to the next point, the expected behavior between different calls to the optimizer is application specific. Some questions I would consider:

  • Are the parameters of the objective the same as before? If not, what changes?
  • Are the solutions for two consecutive points expected to be related in some way?
  • Are you passing new initial values for the optimization variables?
  • Are you using the previous solution to initialize the optimization variables in the next iteration?

Overall, countless parameters can affect what happens between different calls to the optimizer, and unfortunately there is really not much we can advise without knowing more details about your application.
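
As an illustration of the last question in the list above, the sketch below compares the initial cost when each new problem is started from scratch against starting it from the previous solution. It uses a made-up quadratic cost and a few steps of gradient descent in plain Python; none of it is Theseus code, and whether warm starting is appropriate depends on whether consecutive problems are actually related.

```python
def solve(target, x_init, n_steps=5, lr=0.2):
    # A few gradient-descent steps on the hypothetical cost (x - target)^2.
    x = x_init
    for _ in range(n_steps):
        x -= lr * 2.0 * (x - target)
    return x

targets = [1.0, 1.2, 1.4]  # consecutive, closely related problems

cold_initial_costs, warm_initial_costs = [], []
x_prev = 0.0
for target in targets:
    cold_initial_costs.append((0.0 - target) ** 2)     # always restart from 0
    warm_initial_costs.append((x_prev - target) ** 2)  # reuse previous solution
    x_prev = solve(target, x_prev)

print(cold_initial_costs)
print(warm_initial_costs)
```

When consecutive targets are close, the warm-started runs begin with a much smaller 0th-iteration error than the cold-started ones.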
