
Mobo enhancement #248

Merged
merged 9 commits into from
Oct 29, 2024

Conversation

roussel-ryan
Collaborator

  • Enhances multi-objective Bayesian optimization by adding a `use_pf_as_initial_points` flag. When enabled, points on the observed Pareto frontier are used to initialize optimization of the EHVI acquisition function, which substantially speeds up convergence to the Pareto front in high-dimensional input spaces.
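A minimal sketch of the idea in plain NumPy, with illustrative helper names (the actual implementation uses BoTorch utilities such as `is_non_dominated`): select the non-dominated observed inputs and, subsampled down to `num_restarts`, hand them to the acquisition optimizer as initial conditions.

```python
import numpy as np

def pareto_mask(obj):
    """Boolean mask of non-dominated rows of `obj` (maximization convention)."""
    n = obj.shape[0]
    mask = np.ones(n, dtype=bool)
    for i in range(n):
        # row i is dominated if some other row is >= in all objectives
        # and strictly > in at least one
        dominated = np.all(obj >= obj[i], axis=1) & np.any(obj > obj[i], axis=1)
        mask[i] = not dominated.any()
    return mask

def pf_initial_points(train_x, train_obj, num_restarts, rng=None):
    """Use observed Pareto-front inputs to seed acquisition optimization."""
    rng = rng or np.random.default_rng(0)
    pf_x = train_x[pareto_mask(train_obj)]
    if len(pf_x) > num_restarts:
        # randomly subsample the frontier down to the restart budget
        idx = rng.choice(len(pf_x), size=num_restarts, replace=False)
        pf_x = pf_x[idx]
    return pf_x
```

If the frontier holds fewer points than `num_restarts`, the sketch simply returns all of them; the remaining restarts would fall back to the usual random initialization.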

@nikitakuklev
Collaborator

nikitakuklev commented Oct 28, 2024

A general comment: I'd suggest we do a more complete implementation with more randomization as a follow-up PR. There is also the question of how to compute feasibility: either drop infeasible points outright, as is done now, or sample each candidate to determine feasibility probabilistically, which could handle borderline candidates more accurately.
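The two feasibility strategies mentioned above can be contrasted in a small sketch (function names are illustrative; constraints are assumed feasible when c(x) <= 0):

```python
import numpy as np

def feasible_hard(constraint_obs, tol=0.0):
    """Drop points whose *observed* constraint value violates c(x) <= tol."""
    return constraint_obs <= tol

def feasible_prob(constraint_samples, tol=0.0, min_prob=0.5):
    """Keep points whose posterior probability of feasibility exceeds
    `min_prob`, estimated from Monte-Carlo constraint samples drawn from
    the model (shape: n_samples x n_points)."""
    p_feas = (constraint_samples <= tol).mean(axis=0)
    return p_feas >= min_prob
```

The probabilistic variant lets a borderline point survive when the model still assigns it a decent chance of being feasible, at the cost of drawing posterior samples.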

More specifically:

  • Botorch does a more involved process of posterior sampling plus Pareto downselection in the pruning function for the qNEHVI family - see here. Note that if any of the samples at a baseline point are infeasible, the point is set equal to the reference point and excluded from the Pareto front. This encodes the model's noise knowledge into the surviving candidates and should in general be better than removing Pareto points based on observed data alone, at the cost of some performance.

  • In fact, since prune_inferior_points_multi_objective will frequently get called in the acquisition function, it might be useful to cache the baseline as part of PF initialization and feed the result in as the new X_baseline with prune_baseline=False, saving the repeated Pareto-front computation.

  • For choosing which points to use when len(initial_points) > num_restarts, it might be good to use stochastic behavior. First, use the "around best" logic (see here) to generate raw_pf_samples points, with raw_pf_samples = num_restarts * factor. Then apply the same stochastic logic as Botorch's raw_samples parameter: evaluate the acquisition function at all points and pick exactly num_restarts of them probabilistically, biased toward higher acquisition values (see here). One can argue that acquisition values will be quite similar around each point unless perturbations are large, so this elaborate procedure may not help much. A simpler alternative is to generate at most num_restarts candidates without downselection, for example by picking num_restarts Pareto points with the weighted procedure above and then generating one nudged candidate per point. This needs benchmarking to see whether it is worth it. The overall goal is to make initialization not use completely random parts of the Pareto front, but be softly biased toward more promising areas.

  • It would be interesting to plot how many points are on the Pareto front vs. dimensionality. I have a feeling num_restarts might need to be scaled up a lot for larger problems if the fully random scheme is kept.
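The caching idea above (reuse the pruned baseline instead of re-pruning inside every acquisition-function construction) can be sketched as a small wrapper; `prune_fn` stands in for BoTorch's `prune_inferior_points_multi_objective`, and the fingerprinting scheme is illustrative:

```python
class BaselineCache:
    """Cache the pruned Pareto baseline once per data set and reuse it.

    Callers pass the cached result as X_baseline with prune_baseline=False,
    so the expensive pruning runs only when the training data changes.
    """

    def __init__(self, prune_fn):
        self._prune_fn = prune_fn
        self._key = None
        self._baseline = None

    def get(self, X):
        # cheap fingerprint of the training inputs (assumes a NumPy-like array)
        key = (X.shape, X.tobytes())
        if key != self._key:
            self._baseline = self._prune_fn(X)  # expensive Pareto pruning
            self._key = key
        return self._baseline
```

The same effect could be achieved by recomputing the baseline once per generation step and threading it through explicitly; the wrapper just makes the invariant (one prune per data set) explicit.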
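The weighted downselection described above (pick num_restarts points with probability biased toward higher acquisition values, in the spirit of BoTorch's raw-samples initialization heuristic) might look like the following; the function name and the Boltzmann weighting with temperature `eta` are illustrative:

```python
import numpy as np

def weighted_restart_selection(candidates, acq_values, num_restarts,
                               eta=1.0, rng=None):
    """Sample `num_restarts` rows of `candidates` without replacement,
    with probability increasing in the acquisition value."""
    rng = rng or np.random.default_rng(0)
    # standardize, then Boltzmann-weight: higher acq -> higher probability
    z = (acq_values - acq_values.mean()) / (acq_values.std() + 1e-12)
    probs = np.exp(eta * z)
    probs /= probs.sum()
    idx = rng.choice(len(candidates), size=num_restarts, replace=False, p=probs)
    return candidates[idx]
```

Setting `eta` large approaches a greedy top-k pick; `eta = 0` recovers uniform sampling, so the "softly biased" behavior argued for above sits between the two.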
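The last point is easy to probe empirically: for i.i.d. random objectives, the fraction of non-dominated points grows quickly with the number of objectives, which supports scaling num_restarts with problem size. A hedged sketch of such an experiment:

```python
import numpy as np

def pareto_fraction(n_points, n_obj, rng):
    """Fraction of i.i.d. uniform points that are non-dominated
    (maximization convention)."""
    y = rng.random((n_points, n_obj))
    nd = 0
    for i in range(n_points):
        dominated = np.all(y >= y[i], axis=1) & np.any(y > y[i], axis=1)
        nd += not dominated.any()
    return nd / n_points
```

Sweeping `n_obj` (and, for real problems, input dimensionality) and plotting the resulting fraction would give the curve suggested above.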

xopt/vocs.py Outdated
observable_data = self.observable_data(data, "")

if return_valid:
feasable_status = self.feasibility_data(data)["feasible"]

typo


@nikitakuklev left a comment


Minor concerns with moving to GPU and a lot of repeated calculations; probably a small cost compared to the main MOBO loop time. Otherwise LGTM.

supports_batch_generation: bool = True
use_pf_as_initial_points: bool = Field(
False,
description="flag to specify if pf front points are to be used during "
typo

use_pf_as_initial_points=True,
)
gen.add_data(test_data)
gen._get_initial_conditions()
verify that the infeasible candidate did not make it into the initial conditions
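The requested check could look like the following helper (names are illustrative, not part of the Xopt API): after generating initial conditions from the Pareto front, assert that no returned point matches the known-infeasible row of the test data.

```python
import numpy as np

def assert_infeasible_excluded(initial_conditions, infeasible_x, atol=1e-8):
    """Fail if any generated initial condition coincides with the
    known-infeasible input of the test data set."""
    for x in np.atleast_2d(initial_conditions):
        assert not np.allclose(x, infeasible_x, atol=atol), \
            "infeasible candidate leaked into initial conditions"
```

In the test above this would run right after `gen._get_initial_conditions()`, with `infeasible_x` taken from the deliberately infeasible row of `test_data`.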

@@ -67,7 +70,7 @@ class GridOptimizer(NumericalOptimizer):
10, description="number of grid points per axis used for optimization"
)

def optimize(self, function, bounds, n_candidates=1):
def optimize(self, function, bounds, n_candidates=1, **kwargs):
assert empty kwargs if none are expected
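One way to implement the suggestion, sketched as a standalone stub (the real `GridOptimizer.optimize` would run the grid search where the placeholder return sits): fail fast on unexpected keyword arguments instead of silently ignoring them.

```python
def optimize(function, bounds, n_candidates=1, **kwargs):
    """Illustrative stub of the reviewer's suggestion: accept **kwargs for
    interface compatibility, but reject any that this optimizer does not use."""
    if kwargs:
        raise TypeError(f"optimize() got unexpected kwargs: {sorted(kwargs)}")
    return n_candidates  # placeholder for the actual grid optimization
```

This keeps the broadened signature compatible with callers that pass extra options to other optimizers, while surfacing typos and unsupported options immediately.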

)
non_dominated = is_non_dominated(obj_data)

weights = set_botorch_weights(self.vocs).to(**self._tkwargs)[
can you reuse weight from _get_scaled_data() and avoid recomputing?

]
variable_data = torch.tensor(var_df[self.vocs.variable_names].to_numpy())
objective_data = torch.tensor(obj_df[self.vocs.objective_names].to_numpy())
weights = set_botorch_weights(self.vocs).to(**self._tkwargs)[
I haven't benchmarked this, but moving to GPU might be quite slow for our small dataset sizes

@roussel-ryan
Collaborator Author

@nikitakuklev for the record here, I'll reiterate that we are happy to incorporate your suggested improvements to this process in a future PR

@roussel-ryan merged commit 8e54995 into main Oct 29, 2024
14 checks passed
@roussel-ryan deleted the mobo-enhancement branch October 29, 2024 21:38