Botorch with cardinality constraint via sampling #301

Draft · wants to merge 69 commits into main from feature/cardinality_constraint_to_botorch_via_sampling

Conversation

@Waschenbacher (Collaborator) commented Jul 3, 2024

(edited by @AdrianSosic)

This PR adds support for cardinality constraints to BotorchRecommender. The core idea is to tackle the problem in an exhaustive-search-like manner, i.e. by

  • enumerating the possible combinations of in-/active parameters dictated by the cardinality constraints
  • optimizing the corresponding restricted subspaces, where the cardinality constraint can then be simply removed since the in-/active sets are fixed within these subspaces
  • aggregating the optimization results of the individual subspaces into a single recommendation batch.

The PR implements two mechanisms for determining the configurations of inactive parameters (see the sketch below):

  • When the combinatorial list of possible inactive parameter configurations is not too large, the full list is iterated over.
  • Otherwise, a fixed number of inactive parameter configurations is randomly selected.
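
For illustration, here is a minimal sketch of this selection logic using only the Python standard library (the function name and the exact sampling scheme are illustrative, not the PR's actual utilities, and the number of inactive parameters is fixed for simplicity):

```python
import math
import random
from collections.abc import Iterator
from itertools import combinations


def iter_inactive_parameter_sets(
    parameters: list[str], n_inactive: int, max_n_subspaces: int
) -> Iterator[tuple[str, ...]]:
    """Yield inactive-parameter configurations for a fixed inactivity count.

    If the combinatorial list is small enough, it is enumerated fully;
    otherwise, a fixed number of configurations is drawn at random.
    """
    n_combinations = math.comb(len(parameters), n_inactive)
    if n_combinations <= max_n_subspaces:
        # Full enumeration of all possible inactive sets
        yield from combinations(parameters, n_inactive)
    else:
        # Random selection of distinct inactive sets
        seen: set[tuple[str, ...]] = set()
        while len(seen) < max_n_subspaces:
            seen.add(tuple(sorted(random.sample(parameters, n_inactive))))
        yield from seen


# Example: 5 parameters of which at least 2 must be inactive
print(list(iter_inactive_parameter_sets([f"x_{i}" for i in range(5)], 2, 10)))
```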

The current aggregation step simply optimizes all subspaces independently of each other and then returns the batch from the subspace where the highest acquisition value is achieved (see the sketch below). This has the side effect that the set of inactive parameters is the same across the entire recommendation batch. This can be a desirable property in many use cases, but potentially higher acquisition values could be obtained by varying the in-/activity sets across the batch. A simple way to achieve this (though out of scope for this PR) is to generalize the sequential greedy principle to multiple subspaces.
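
The aggregation can be pictured as follows (a toy sketch with made-up numbers; the names SubspaceResult and aggregate are illustrative and not part of the PR):

```python
from dataclasses import dataclass


@dataclass
class SubspaceResult:
    """Toy container for the outcome of optimizing one restricted subspace."""

    inactive_parameters: tuple[str, ...]
    batch: list[dict[str, float]]  # recommended points (inactive parameters fixed to zero)
    acqf_value: float  # acquisition value achieved by the batch


def aggregate(results: list[SubspaceResult]) -> SubspaceResult:
    """Return the batch from the subspace achieving the highest acquisition value.

    As a side effect, the set of inactive parameters is identical across the
    whole recommendation batch.
    """
    return max(results, key=lambda r: r.acqf_value)


# Toy usage
results = [
    SubspaceResult(("x_0",), [{"x_1": 0.3, "x_2": 0.7}], acqf_value=1.2),
    SubspaceResult(("x_2",), [{"x_0": 0.1, "x_1": 0.9}], acqf_value=1.7),
]
print(aggregate(results).inactive_parameters)  # ('x_2',)
```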

Out of scope

  • Fulfilling cardinality constraints by passing them in suitable form as nonlinear constraints to the optimizer
  • Sequential greedy optimization to achieve varying in-/activity sets (see explanation above)

@Waschenbacher Waschenbacher added the new feature New functionality label Jul 3, 2024
@Waschenbacher Waschenbacher self-assigned this Jul 3, 2024
@Waschenbacher Waschenbacher marked this pull request as draft July 4, 2024 07:38
@Waschenbacher Waschenbacher force-pushed the feature/cardinality_constraint_to_botorch_via_sampling branch from 0be3285 to 1f7783d Compare July 4, 2024 09:21
@Waschenbacher Waschenbacher marked this pull request as ready for review July 4, 2024 09:35
Review thread: CHANGELOG.md (outdated)
@AdrianSosic (Collaborator) left a comment:

Hi @Waschenbacher, thanks for the work. This is not yet a review but only a first batch of very high-level comments that I would like you to address before we can go into the actual review process. The reason is that the main functionality brought by this PR is currently rather convoluted and hard to parse, so I'd prefer to work with a more readable version, tbh.

Review threads (outdated): baybe/constraints/continuous.py (2), baybe/recommenders/pure/bayesian/botorch.py (3), baybe/searchspace/continuous.py (3)
@Waschenbacher Waschenbacher force-pushed the feature/cardinality_constraint_to_botorch_via_sampling branch from 1f7783d to 9d28b49 Compare July 12, 2024 07:38
@AdrianSosic AdrianSosic force-pushed the feature/cardinality_constraint_to_botorch_via_sampling branch from f68ce4f to fc52875 Compare August 15, 2024 12:52
@AdrianSosic (Collaborator) left a comment:

Hi @Waschenbacher, I've finally managed to spend some time on this important PR, thanks again for your preparation work. I've already refactored some parts that were clear to me. However, I'm not yet certain about the design in the remaining places. I have the feeling that it can potentially be simplified a lot, depending on whether it is possible to reuse the idea of reduced subspaces. I have marked the corresponding places with comments. I think we need to discuss this part first before we can continue.

Review threads: baybe/searchspace/continuous.py (outdated), baybe/recommenders/pure/bayesian/botorch.py
@AVHopp (Collaborator) left a comment:

First high-level review regarding points that we should discuss. Not a full review yet, as I think the code might change depending on what is decided regarding min_cardinality.

Review threads: baybe/parameters/numerical.py, baybe/parameters/utils.py (2)
@@ -76,6 +85,13 @@ class BotorchRecommender(BayesianRecommender):
optimization. **Does not affect purely discrete optimization**.
"""

max_n_subspaces: int = field(default=10, validator=[instance_of(int), ge(1)])
A collaborator commented:

I think that it should be linked to cardinality somehow: for people who are not interested in cardinality constraints, it only becomes clear after reading the docstring that this attribute is not relevant for them. This would, however, make the name quite long :/ So although I am not perfectly happy with the name, we could keep it, as I do not see a better alternative if it stays here.
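
For context, typical usage would presumably look as follows (a sketch: it assumes the existing baybe classes ContinuousCardinalityConstraint, NumericalContinuousParameter, SearchSpace, and BotorchRecommender, and requires a baybe version containing this PR, since max_n_subspaces is introduced here):

```python
from baybe.constraints import ContinuousCardinalityConstraint
from baybe.parameters import NumericalContinuousParameter
from baybe.recommenders import BotorchRecommender
from baybe.searchspace import SearchSpace

parameters = [
    NumericalContinuousParameter(name=f"x_{i}", bounds=(0, 1)) for i in range(5)
]
constraint = ContinuousCardinalityConstraint(
    parameters=[p.name for p in parameters],
    max_cardinality=2,  # at most 2 of the 5 parameters may be active (nonzero)
)
searchspace = SearchSpace.from_product(parameters=parameters, constraints=[constraint])

# max_n_subspaces caps how many in-/active parameter configurations (i.e.
# restricted subspaces) are optimized when the combinatorial list gets large.
recommender = BotorchRecommender(max_n_subspaces=10)
```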

subspace_continuous: SubspaceContinuous,
batch_size: int,
) -> tuple[Tensor, Tensor]:
"""Recommend from a continuous search space with cardinality constraints.
A collaborator commented:

I like it, only some information about how max_n_subspaces comes into play here is still missing for me :)

Review thread: baybe/recommenders/pure/bayesian/botorch.py
@@ -301,6 +334,45 @@ def _drop_parameters(self, parameter_names: Collection[str]) -> SubspaceContinuous:
],
)

def _enforce_cardinality_constraints_via_assignment(
A collaborator commented:

Since there is only one way of enforcing cardinality constraints, why not simply enforce_cardinality_constraints? If people are interested in the details of the how, they can read the docstring.

@Scienfitz Scienfitz marked this pull request as draft November 26, 2024 12:48
@Waschenbacher (Collaborator, Author) commented:

@AdrianSosic This is the updated PR according to our discussion in the baybathon. The main changes are below:

  • A threshold near_zero_threshold is added to each NumericalContinuousParameter. This implementation deviates from the original plan in two respects:

    • Threshold per constraint or subspace vs. threshold per parameter: I decided on a per-parameter threshold because I find it more flexible and more logical.
    • Threshold ratio (the original suggestion) vs. absolute threshold: I decided on an absolute threshold, since the logic gets complicated with a ratio and several questions arise when inferring the absolute threshold from it:
      • Is the threshold ratio based on the complete region or just one side?
      • If the bounds are not symmetric, e.g. a parameter has bounds (a, b) with a != -b, how do we infer the absolute threshold values? Do we keep a single absolute threshold or two different numbers?
      • How do we deal with infinite bounds?
      • Besides, the user can derive the desired absolute threshold from any threshold ratio, if needed, and assign it directly.
  • A check whether any minimum cardinality constraint is violated in botorch's recommendation, raising a MinimumCardinalityViolatedWarning if so (see the sketch after this list).

  • Added a test verifying that MinimumCardinalityViolatedWarning is raised when a minimum cardinality constraint is violated.

  • Updated the test related to cardinality constraints: all maximum cardinality constraints must be fulfilled, and either all minimum cardinality constraints are fulfilled or a warning is raised otherwise.

  • Created a botorch PR (Add InfeasibilityError exception, pytorch/botorch#2652) for the dangerous ValueError due to infeasibility and added a TODO.
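
A rough sketch of what such a violation check could look like (illustrative only: the function, its signature, and the locally defined warning class are assumptions made to keep the example self-contained, not the PR's actual code):

```python
import warnings

import pandas as pd


class MinimumCardinalityViolatedWarning(UserWarning):
    """Stand-in for the warning class introduced by the PR."""


def warn_if_min_cardinality_violated(
    recommendations: pd.DataFrame,
    constraint_parameters: list[str],
    min_cardinality: int,
    thresholds: dict[str, float],
) -> None:
    """Warn if any recommended point has fewer nonzero parameters than required.

    A value is treated as zero when it lies in the open interval
    (-threshold, +threshold) of the respective parameter.
    """
    values = recommendations[constraint_parameters].abs()
    is_nonzero = values.ge(pd.Series(thresholds)[constraint_parameters], axis="columns")
    if (is_nonzero.sum(axis=1) < min_cardinality).any():
        warnings.warn(
            "A minimum cardinality constraint is violated in the recommendation.",
            MinimumCardinalityViolatedWarning,
        )
```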

Review threads: baybe/constraints/continuous.py (2), baybe/parameters/numerical.py (outdated), baybe/parameters/utils.py
@@ -41,6 +49,9 @@ def validate_constraints( # noqa: DOC101, DOC103
param_names_discrete = [p.name for p in parameters if p.is_discrete]
param_names_continuous = [p.name for p in parameters if p.is_continuous]
param_names_non_numerical = [p.name for p in parameters if not p.is_numerical]
params_continuous: list[NumericalContinuousParameter] = [
p for p in parameters if isinstance(p, NumericalContinuousParameter)

@AVHopp (Collaborator) left a comment:

This is an incomplete review, as I was not aware that there is still work in progress. Feel free to either already incorporate my comments or just resolve them.

Review thread: baybe/parameters/numerical.py (outdated)

Important:
Value in the open interval (-near_zero_threshold, near_zero_threshold)
will be treated as near_zero.
A collaborator commented:

Inconsistencies regarding the use of near_zero and near-zero in this docstring.

@Waschenbacher (Author) replied:

It has been renamed to zeros in the updated version. Resolving this for now; if near_zero is preferred, I'm open to renaming it back.

Review threads: baybe/parameters/numerical.py, baybe/parameters/utils.py (2)
cardinality is violated in `BotorchRecommender`
- Attribute `max_n_subspaces` to `BotorchRecommender`, allowing to control
optimization behavior in the presence of multiple subspaces
- Utilities `inactive_parameter_combinations` and `n_inactive_parameter_combinations`
A collaborator commented:

I think these as well as the utilities noted below are not user-facing, are they? In that case, I do not think that it is necessary to include them in the CHANGELOG


return pd.DataFrame(points, columns=subspace_continuous.parameter_names)

def _recommend_continuous_torch(
A collaborator commented:

Why do we need an additional function for that? In my opinion, this check could be done in the original function, and this function just adds another layer of complexity. Also, I find the name odd: why the explicit mention of torch?

@Waschenbacher (Author) replied:

_recommend_continuous is the outer layer: it does some checks and returns a pd.DataFrame, while _recommend_continuous_torch is only responsible for returning points and acqf_value as tensors. Moreover, _recommend_continuous_torch is needed in _optimize_continuous_subspaces (see https://github.com/emdgroup/baybe/pull/301/files#r1825774190). Correct me if I'm wrong @AdrianSosic

A collaborator replied:

Yes, exactly, it's to fit the required interfaces on both sides. _recommend_continuous still operates on the dataframe level as it needs to return outputs that are then shipped to the user. But there are also places like _optimize_continuous_subspaces where you still need access to the low-level output like the corresponding acqf values. The only (sort of reasonable) way I saw to achieve both here is to extract that inner part and declare a separate function for it. But I'd be very happy if you see a more elegant alternative 👍🏼
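
To illustrate the layering being discussed (a simplified, self-contained sketch; the function bodies are dummy stand-ins and the signatures are abbreviated, not copied from the PR):

```python
import pandas as pd
import torch
from torch import Tensor


def _optimize_acqf_stub(parameter_names: list[str], batch_size: int) -> tuple[Tensor, Tensor]:
    # Hypothetical stand-in for the acquisition optimization of one subspace
    # (in the real code this would wrap botorch's optimize_acqf); it exists
    # only so that the sketch runs end to end.
    points = torch.rand(batch_size, len(parameter_names))
    return points, points.sum()


def recommend_continuous_torch(parameter_names: list[str], batch_size: int) -> tuple[Tensor, Tensor]:
    """Inner layer: tensor-level points and acquisition value, as needed by the subspace loop."""
    return _optimize_acqf_stub(parameter_names, batch_size)


def recommend_continuous(parameter_names: list[str], batch_size: int) -> pd.DataFrame:
    """Outer layer: user-facing dataframe built from the tensor-level result."""
    points, _ = recommend_continuous_torch(parameter_names, batch_size)
    return pd.DataFrame(points.numpy(), columns=parameter_names)


def optimize_continuous_subspaces(subspaces: list[list[str]], batch_size: int) -> tuple[Tensor, Tensor]:
    """Optimize several subspaces and keep the result with the highest acquisition value."""
    results = [recommend_continuous_torch(names, batch_size) for names in subspaces]
    return max(results, key=lambda r: float(r[1]))


# Toy usage
best_points, best_acqf = optimize_continuous_subspaces([["x_0", "x_1"], ["x_1", "x_2"]], batch_size=3)
print(best_points.shape, float(best_acqf))
```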

subspace_continuous: SubspaceContinuous,
batch_size: int,
) -> tuple[Tensor, Tensor]:
"""Recommend from a continuous search space with cardinality constraints.
A collaborator commented:

This is still the case.

Review threads: baybe/recommenders/pure/bayesian/botorch.py (2)
@Waschenbacher Waschenbacher force-pushed the feature/cardinality_constraint_to_botorch_via_sampling branch from ee54c89 to bb1fc3d Compare January 13, 2025 21:42
@Waschenbacher Waschenbacher force-pushed the feature/cardinality_constraint_to_botorch_via_sampling branch from bb1fc3d to bddab62 Compare January 14, 2025 09:46
@AVHopp (Collaborator) commented Jan 17, 2025

Since there seems to be the wish to merge this soon, please take this out of draft mode if it is indeed ready for review. I will not have a look at this earlier, since I assume that being in draft mode means that it is not yet ready (which, however, conflicts with some of the comments that I see here).

Labels: new feature (New functionality)
4 participants