
Suggestions for implementing a composition-based optimization (i.e. fractional portion of ingredients) #727

Closed · sgbaird opened this issue Nov 19, 2021 · 24 comments
Labels: question (Further information is requested)

@sgbaird (Contributor) commented Nov 19, 2021

For context, my experience with Ax consists of running the Loop tutorial once and reading through some of the documentation, such as the parameter types (i.e., I'm fairly new to it). I also have some familiarity with Bayesian optimization.

The actual use-case is slightly different and more complicated, but I think the following is a suitable toy example. I go over the problem statement, some setup code, and possible solutions. I'd love to hear some feedback.

Problem Statement

Take a composite material with the following class/ingredient combinations:

  • Filler: Colloidal Silica (filler_A)
  • Filler: Milled Glass Fiber (filler_B)
  • Resin: Polyurethane (resin_A)
  • Resin: Silicone (resin_B)
  • Resin: Epoxy (resin_C)

Take some toy data consisting of components and their fractional prevalences (various combinations of fillers and resins, with varying numbers of components) along with their objective values (training data), and some model which takes arbitrary input parameters and predicts the objective (strength), which we wish to maximize.

For constraints, I'm thinking:

  • limit the total number of components in any given "formula" (e.g. max of 3 components)
  • naturally, that the compositions sum to 1 (or that abs(1-sum(composition)) <= tol)
  • there has to be at least one filler and at least one resin (if feasible)

Setup Code

To make it more concrete, it might look like the following:

choices = ["filler_A", "filler_B", "resin_A", "resin_B", "resin_C", "dummy"]

data = [
        [["filler_A", "filler_B", "resin_C"], [0.4, 0.4, 0.2]],
        [["filler_A", "resin_A", "resin_B"], [0.6, 0.2, 0.2]],
        [["filler_A", "filler_B", "resin_B"], [0.5, 0.3, 0.2]],
        [["filler_A", "resin_B", "dummy"], [0.5, 0.5, 0.0]],
        [["filler_B", "resin_C", "dummy"], [0.6, 0.4, 0.0]],
        [["filler_A", "filler_B", "resin_A"], [0.2, 0.2, 0.6]],
        [["filler_B", "resin_A", "resin_B"], [0.6, 0.2, 0.2]],
        ] # made-up data

def predict(objects, composition):
    """Predict the objective (strength) for a list of components and their fractional prevalences."""
    ...
    return obj
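
For illustration, here is a small hypothetical helper (my own naming, not part of the real code) that converts the (components, fractions) pairs above into fixed-length composition vectors over choices, which is essentially the encoding used in the next section:

import numpy as np

def to_vector(components, fractions):
    # place each fraction at the index of its component in `choices`
    vec = np.zeros(len(choices))
    for name, frac in zip(components, fractions):
        vec[choices.index(name)] = frac
    return vec

X = np.array([to_vector(c, f) for c, f in data])[:, :-1]  # drop the trailing "dummy" column; shape (7, 5)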

Possible Solutions

One-hot-like prevalence encoding and components/composition

One-hot-like prevalence encoding

I've thought about trying to do a sort of "one-hot encoding" (assuming I'm using this term correctly), such that each component gets its own composition as a variable:

filler_A  filler_B  resin_A  resin_B  resin_C
0.4       0.4       --       --       0.2
0.6       0.0       0.2      0.2      --
0.5       0.3       --       0.2      --
0.5       --        --       0.5      --
--        0.6       --       --       0.4
0.2       0.2       0.6      --       --
--        0.6       0.2      0.2      --

which I think would look like the following:

from ax.service.managed_loop import optimize  # Loop API

best_parameters, values, experiment, model = optimize(
    parameters=[
        {
            "name": "filler_A",
            "type": "range",
            "bounds": [0.0, 1.0],
        },
        {
            "name": "filler_B",
            "type": "range",
            "bounds": [0.0, 1.0],
        },
        {
            "name": "resin_A",
            "type": "range",
            "bounds": [0.0, 1.0],
        },
        {
            "name": "resin_B",
            "type": "range",
            "bounds": [0.0, 1.0],
        },
        {
            "name": "resin_C",
            "type": "range",
            "bounds": [0.0, 1.0],
        },
    ],
    experiment_name="composition_test",
    objective_name="strength",
    evaluation_function=predict,
    parameter_constraints=["abs(1 - (filler_A + filler_B + resin_A + resin_B + resin_C)) <= 1e-6", "filler_A + filler_B > 0", "resin_A + resin_B + resin_C > 0"], # not sure if I can use `abs` here
    total_trials=30,
)

However, this could easily lead to compositions where all of the components have a nonzero prevalence, which can be problematic from an experimental perspective.

components/composition

As I mentioned in the constraints, I've also thought about setting an upper limit to the number of components in a formula, which I think might look something like the following:

best_parameters, values, experiment, model = optimize(
    parameters=[
        {
            "name": "object1",
            "type": "choice",
            "bounds": choices,
        },
        {
            "name": "object2",
            "type": "choice",
            "bounds": choices,
        },
        {
            "name": "object3",
            "type": "choice",
            "bounds": choices,
        },
        {
            "name": "composition1",
            "type": "range",
            "bounds": [0.0, 1.0],
        },
        {
            "name": "composition2",
            "type": "range",
            "bounds": [0.0, 1.0],
        },
        {
            "name": "composition3",
            "type": "range",
            "bounds": [0.0, 1.0],
        },
    ],
    experiment_name="composition_test",
    objective_name="strength",
    evaluation_function=predict,
    parameter_constraints=["abs(1 - (composition1 + composition2 + composition3)) <= 1e-6"],
    total_trials=30,
)

How would you suggest implementing this use-case in Ax? If it would help, I'd be happy to flesh this out into a full MWE or try out any suggestions. The real use-case involves ~100 different components across 4 different classes, and the idea is to (eventually) use this in an experimental adaptive design scheme.

(tagging @ramz-i, who is the individual in charge of this project in our research group; please post here if you have anything to add)

#706

@lena-kashtelyan lena-kashtelyan self-assigned this Nov 19, 2021
@lena-kashtelyan lena-kashtelyan added the question Further information is requested label Nov 19, 2021
@lena-kashtelyan (Contributor) commented Nov 19, 2021

Wow @sgbaird, this is one exemplary GitHub issue! : )

I think your idea about representing parameters as [0.0, 1.0] ranges is exactly right for this case. There are two challenges though:

1. Constraints are complicated to represent here.

I think the types of constraints you are looking for are not currently supported by Ax, at least not out-of-the box.

Just to make sure, my understanding is that the three constraints you'd like to have are:

  1. Limit the total number of components in any given "formula" (e.g. max of 3 components),
  2. The prevalences in compositions sum to 1,
  3. There has to be at least one filler and at least one resin (if feasible).

Constraint 3) here is easy: it will in fact be two constraints: "filler_A + filler_B + ... >= 1e-6" and "resin_A + resin_B + ... >= 1e-6".

Constraints 1) and 2) are challenging though. For constraint 2), which is an exact equality constraint, we currently have this wishlist issue open: #510.

2. The real use-case involving around 100 components and 4 classes of them.

This is too many parameters and constraints for standard Ax models, so this will also not work out-of-the-box.

Let me discuss this with @dme65, @Balandat, and @eytan, and get back to you with whether we can find a way to represent your problem in Ax.

Before we delve into this further, a high-level question: how expensive will each predict function call be in your case? How many total trials could you run, and how important is running as few trials as possible vs. finding the optimal outcome? Depending on that, we might want to break up the search space into sub-search spaces with fewer components (as a total of 100 is a lot).

@sgbaird (Contributor, Author) commented Nov 19, 2021

Hi @lena-kashtelyan, I'm glad you like the issue 😄 I really appreciate your thorough and timely response. This is great!

I have some more thoughts on the constraints; I'm curious whether the "components/composition" version above seems problematic (especially with respect to symmetry concerns); and lastly, I've tried to answer the higher-level questions you asked.

Constraints

Constraint 3 doesn't seem bad at all. Thank you! I think I have a workaround for 2, and I have a few more thoughts for 1.

Prevalence sums to 1

Perhaps a workaround for constraint 2 would be the following two constraints:

"filler_A + filler_B + resin_A + resin_B + resin_C <= 1.000001"
"filler_A + filler_B + resin_A + resin_B + resin_C >= 0.999999"

as a substitute for "abs(1-sum(...)) <= 1e-6". Does this seem like it would work OK? I'm worried it might be a bit hacky and have some unintended consequences.

EDIT:
After reading #510 (comment), I see that the tighter the tolerance in the "hack" above, the tougher it is on the rejection sampling, which would either result in an error or (more likely) bloat the runtime.

raise SearchSpaceExhausted(
ax.exceptions.core.SearchSpaceExhausted: Rejection sampling error (specified maximum draws (10000) exhausted, without finding sufficiently many (1) candidates). This likely means that there are no new points left in the search space.

Limit the total number of components in any given formula

For constraint 1, maybe the L2 norm constraint could be used as a proxy for "complexity" of the suggested formula. For example:
norm([0.2 0.2 0.2 0.2 0.2]) --> 0.45 (higher complexity, lower score)
norm([0.4 0.4 0.2 0.0 0.0]) --> 0.60 (lower complexity, higher score)
If I'm thinking about this correctly, incorporating this into the objective function (predict) could bias the model towards less complex systems, but it probably pushes the problem into a multi-objective optimization (MOO) scheme and doesn't actually enforce an upper bound on the number of components. Rather than fiddle with the L2-norm constraint, at that point it might make more sense to just use MOO and have the second objective be the number of components that exhibit a prevalence above some tolerance, e.g. count_components([0.4 0.4 0.2 1e-8 1e-7]) --> 3.
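
For concreteness, a minimal sketch of that count_components idea (the name and tolerance are just illustrative):

import numpy as np

def count_components(composition, tol=1e-6):
    # count the components whose prevalence exceeds a small tolerance
    return int(np.sum(np.asarray(composition) > tol))

count_components([0.4, 0.4, 0.2, 1e-8, 1e-7])  # --> 3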

Components/Composition version

Do you see any potential issues with using the "components/composition" version (i.e. sort of a hard-coded upper limit of 3 components)? Symmetry might be an issue (A B C --> 0.5 0.3 0.2 is degenerate relative to A C B --> 0.5 0.2 0.3), so I wondered if using an ordering constraint such as composition1 > composition2 > composition3 would remove most of the symmetry. But then there's still the issue of (A B C --> 0.41 0.39 0.2) being distant in parameter space from (B A C --> 0.41 0.39 0.2) despite the two actually being quite close. This seems very relevant to #710

e.g. when parameters (1,2) actually means the same as (2,1), we want to disallow one of them.

Is the main issue with this kind of symmetry just that it takes longer to search the space? Or is the worry that the search space might become unsmooth?

Additional Context

@ramseyissa may wish to comment more about the priorities of the project, but here are some thoughts to the best of my understanding:

  • A single predict function call should be very inexpensive: i.e. millions of calls should be trivial
  • It shouldn't be a problem to select a subset of the higher-priority components (e.g. 10-50 components) rather than the full ~100 (curse of dimensionality, makes sense); I think that's what he's doing right now just to simplify the problem
  • My guess is that finding the optimal outcome is more important than running as few as possible since it's a few months into a 4-year project. IIRC, the experimental throughput is ~10/week and the current dataset is on the order of a few hundred. Right now @ramseyissa has been using a custom script that suggests 10 new formulas to try each week by looking at 1D slices (using a subset of components) in composition space where the data is sparse. I think he also incorporated some chemical intuition into the search, but he's looking for something more sophisticated that considers the full parameter space and the target property.

@ramseyissa commented Nov 20, 2021

Thanks @sgbaird for raising the issue and thank you @lena-kashtelyan for the quick response and feedback!

Hopefully I can clear up some of the project goals / issues that I am facing.
To begin, I am fairly new to Ax and its complete capabilities.

Project Overview

  • The dataset is close to 80 dimensions of different types of items that can be combined to make a single formulation. This high-dimensional dataset is divided into 4 sections: SectionA/SectionB/SectionC/SectionD. We can think of these sections as divided up evenly for simplicity, so 20 items per section. From each section, 5 items are selected in various percentages summing to 1. To reduce the dimensionality of the dataset, I am holding SectionB/C/D constant and only looking to optimize SectionA. So ideally, what I would like to do is: of the 20 items in SectionA, pick 5 items at a time at various percentages and predict for that combination (the 5 items always summing to 1). Each prediction can be any combination of the 20 in SectionA (as long as it chooses 5).

Useful Information

  • As @sgbaird stated, the predict function call is not of concern as it is inexpensive to run.
  • I am mainly concerned with finding the optimal prediction.
  • A one-dimensional search through the dataset in regions that are sparse has already produced a higher-performing sample, so I believe a higher-level search through the dataset using BO would be ideal and could definitely help optimize this problem.

Thank you both again and I am open to any suggestions or references.

@Balandat (Contributor) commented:

So if the predict function is cheap to evaluate and you can potentially do millions of evaluations, I think that Ax might just not be the right tool for this problem, as it is primarily designed for very costly settings where you may only be able to get a few hundred evaluations.

There are also quite a few challenges regarding the setup (in terms of the dimensionality of the inputs, limiting the number of components, etc.) that would not be straightforward to resolve. But even if we could resolve these, you might actually end up getting better results by using an optimization algorithm that can just evaluate tons of different configurations. Maybe some kind of evolutionary algorithm could work well here?

@bernardbeckerman (Contributor) commented:

Brief drive-by comment regarding the constraints in this problem: @sgbaird, correct me if I'm wrong, but you really only have 4 free parameters in this case, since the requirement that they sum to 1 removes a degree of freedom. For this reason, it might be useful for you to set a constraint:

"filler_A + filler_B + resin_A + resin_B <= 1"

and then for each parameterization resin_C = 1 - (filler_A + filler_B + resin_A + resin_B).
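
In code, that reparameterization might look like the following (a minimal sketch using the toy parameter names from above; full_composition is just an illustrative name):

def full_composition(params):
    # recover the full 5-component composition from the 4 free parameters
    free = ["filler_A", "filler_B", "resin_A", "resin_B"]
    x = [params[name] for name in free]
    return x + [1.0 - sum(x)]  # resin_C is determined by the sum-to-1 constraint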

Of course this may end up being moot per @Balandat's comment above, but if you were to end up using Ax this might be a better way to compose your search space. Does that make sense?

@sgbaird (Contributor, Author) commented Nov 22, 2021

@Balandat that's a good point. A brute force method or an EA might be more suitable. Do you have any suggestions on implementing the acquisition function (i.e. expected improvement) with a custom model (i.e. predict)?

@bernardbeckerman that's exactly right. I hadn't thought of that. If we try something out in Ax, I think we'll use that parameterization.

Thanks for the suggestions!

@Balandat (Contributor) commented:

A brute force method or an EA might be more suitable. Do you have any suggestions on implementing the acquisition function (i.e. expected improvement) with a custom model (i.e. predict)?

I am not sure what exactly you're getting at here. If you use a brute force or EA, why would you need to implement an acquisition function?

@sgbaird (Contributor, Author) commented Nov 23, 2021

In the adaptive design scheme (even if this involves a brute force or EA approach), we are still looking to find a happy medium between exploitation and exploration. With brute force or EA, it would be straightforward to try to maximize the objective function, strength for example, but it's less clear to me how we would incorporate the "exploration" component which is why I mentioned an acquisition function. Perhaps a better phrasing of the question is: with brute force or EA, how would you suggest incorporating a trade-off between exploration and exploitation similar to Bayesian optimization? My questions here could very well stem from a misunderstanding on my part about the theory or not explaining the problem clearly 😅.
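
Concretely, is the idea something like the following sketch (assuming the surrogate exposes a predictive mean and standard deviation; pick_next and surrogate_mean_std are hypothetical names)?

import numpy as np

def ucb(mean, std, beta=2.0):
    # upper confidence bound: trade off exploitation (mean) against exploration (std)
    return mean + beta * std

def pick_next(candidates, surrogate_mean_std):
    # brute-force an acquisition function over a large pool of candidate compositions
    mean, std = surrogate_mean_std(candidates)  # arrays of shape (n_candidates,)
    return candidates[np.argmax(ucb(mean, std))]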

It might also be worth mentioning that while we can run the predict function millions of times (e.g. random forest or neural network regression), we are probably limited to a few hundred adaptive design iterations per year (make a new sample, characterize it, retrain the predict model, determine next experiment, repeat).

@lena-kashtelyan (Contributor) commented Nov 30, 2021

cc @Balandat and @dme65, who I know was also thinking through this

@Balandat (Contributor) commented Dec 5, 2021

It might also be worth mentioning that while we can run the predict function millions of times (e.g. random forest or neural network regression), we are probably limited to a few hundred adaptive design iterations per year (make a new sample, characterize it, retrain the predict model, determine next experiment, repeat).

I see, maybe we misunderstood the problem to some extent. It seems to me that the predict function really is your surrogate model, and the thing you're actually trying to optimize is the "outer loop" of optimizing the design where an evaluation is the fabrication and characterization of a sample, is that correct?

The way Ax is typically used is by building its own (typically Gaussian process) surrogate model internally. It's generally possible to swap out that model (though that will be some work). My main question is whether your predict function has any uncertainty quantification (i.e. how confident it is about the prediction and what the prediction variance is)?

@sgbaird (Contributor, Author) commented Dec 7, 2021

@Balandat thank you for your response. That's correct, if I'm using the right terminology, predict is our surrogate model (and my choice of description probably added to the misunderstanding), and yes, the "outer loop" we're interested in is the fabrication/characterization of a sample. I think there are a few things I may need more clarification on. For experimentalists using Ax, what are the typical analogs that correspond to some of the Loop tutorial functions:

  1. Single evaluation of hartmann6 (is this typically a surrogate function, e.g. Gaussian process regression; a user-defined surrogate function, e.g. random forest; or an experiment/characterization iteration?)
  2. Single evaluation of hartmann_evaluation_function
  3. Single "trial" (e.g. out of total_trials=30) within optimize function

Our use case is very much a "suggest-next-experiment" adaptive design scheme for a limited/costly laboratory experimental dataset. For the current implementation (RandomForestRegressor), from what I can tell, there is nothing built-in or super straightforward for uncertainty quantification (UQ). But I am looking at modifying CrabNet (via my refactor) in the near future to work with arbitrary "ingredients lists" rather than only chemical formulas. CrabNet has built-in UQ via a transformer architecture. From what I understand, this is in the form of a standard deviation for regression and a confidence score if the task is classification. Does that seem sufficient for the regression case?

We're open to using Ax's built-in GPR model with the existing small dataset (#743) in the short term while I'm implementing the CrabNet modifications.

Would love to hear your updated thoughts on how to approach this.

@sgbaird (Contributor, Author) commented Dec 7, 2021

Also, we did get a working implementation, but it probably needs to be heavily modified in light of the recent discussion. In particular, since our use-case falls into a human-in-the-loop category (i.e. manual laboratory experiments), we probably need the Developer API. I'm kicking myself for not reading https://ax.dev/docs/api.html and looking at the simple case comparisons, and also for dismissing the human-in-the-loop tutorial as being irrelevant to me (when I think of human-in-the-loop, I think of a human participating in the surrogate function evaluation, and the word "Developer" deterred me).

While the following "works" (runs to completion and spits out best_parameters), it's round-about to the point of being inane, and doesn't really "work" in the sense of giving us our Bayesian-optimized "next experiment to run" for laboratory chemical synthesis. However, I think it's been a good deal of progress in terms of fleshing out the problem statement (esp. the constraints).

The training data is generated using a slightly modified version of components and compositions as described in the initial post #727 (comment), as well as some dummy data for the target values. The following is a MWE that considers n-1 degrees of freedom (n being the total number of components) #727 (comment) and is capable of extending at least to ~20 components #740 #694 (comment) via a custom generation strategy. It's also based primarily on the Loop API tutorial.

import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

# Ax imports used below
from ax.service.managed_loop import optimize
from ax.modelbridge.generation_strategy import GenerationStep, GenerationStrategy
from ax.modelbridge.registry import Models

# training data
X_train = np.array([
       [0.4, 0.4, 0. , 0. , 0.2],
       [0.5, 0. , 0. , 0.5, 0. ],
       [0.5, 0.3, 0. , 0.2, 0. ],
       [0.5, 0. , 0. , 0.5, 0. ],
       [0. , 0.6, 0.4, 0. , 0. ],
       [0.6, 0. , 0.4, 0. , 0. ],
       [0. , 0.6, 0.2, 0.2, 0. ]])
X_train = pd.DataFrame(X_train, columns=["filler_A", "filler_B", "resin_A", "resin_B", "resin_C"])
y_train = 100 * np.random.rand(len(X_train))  # made-up target values

unique_components = list(X_train.columns)

# surrogate model
RFR = RandomForestRegressor(max_depth=7, random_state=0)
RFR.fit(X_train.values, y_train)

def strength(x):
    x = np.reshape(x, (-1, x.shape[0]))
    y = RFR.predict(x)[0]
    return y

# Ax-specific code
def strength_evaluation_function(parameterization):
    x = np.array(
        [parameterization.get(component) for component in unique_components[:-1]]
    )
    last_component = 1 - sum(x)
    x = np.append(x, last_component)
    return {"strength": strength(x)}

parameters = [
    {"name": component, "type": "range", "bounds": [0.0, 1.0]}
    for component in unique_components[:-1]
]

separator = " + "
composition_constraint = separator.join(unique_components[:-1]) + " <= 1"

gs = GenerationStrategy(
    steps=[
        # 1. Initialization step (does not require pre-existing data and is well-suited for
        # initial sampling of the search space)
        GenerationStep(
            model=Models.SOBOL,
            num_trials=5,  # How many trials should be produced from this generation step
            min_trials_observed=3,  # How many trials need to be completed to move to next model
            max_parallelism=5,  # Max parallelism for this step
            model_kwargs={
                "seed": 999,
                "fallback_to_sample_polytope": True,
            },  # Any kwargs you want passed into the model
            model_gen_kwargs={},  # Any kwargs you want passed to `modelbridge.gen`
        ),
        # 2. Bayesian optimization step (requires data obtained from previous phase and learns
        # from all data available at the time of each new candidate generation call)
        GenerationStep(
            model=Models.GPEI,
            num_trials=-1,  # No limitation on how many trials should be produced from this step
            max_parallelism=3,  # Parallelism limit for this step, often lower than for Sobol
            # More on parallelism vs. required samples in BayesOpt:
            # https://ax.dev/docs/bayesopt.html#tradeoff-between-parallelism-and-total-number-of-trials
        ),
    ]
)

best_parameters, values, experiment, model = optimize(
    parameters=parameters,
    experiment_name="composition_test",
    objective_name="strength",
    evaluation_function=strength_evaluation_function,
    parameter_constraints=[
        composition_constraint,  # compositions sum to 1
    ],
    total_trials=30,
    generation_strategy=gs,
)

# Add the last degree of freedom back in
x = np.array([best_parameters.get(component) for component in unique_components[:-1]])
last_component = 1 - sum(x)
x = np.append(x, last_component)
next_experiment = best_parameters
next_experiment[unique_components[-1]] = last_component

@sgbaird (Contributor, Author) commented Dec 7, 2021

I adapted what I had from a Loop into a Service API and fixed some of the theory/understanding issues on my part in the linked example: #743 (comment), such that I'm now generating a real suggested next_experiment. The main thing I had been missing was that a single evaluation of hartmann6 in the examples corresponds to a wet-lab synthesis/characterization iteration for us.

I'm still struggling with the n_components < max_components constraint #745.

I'm also still confused about how I would replace the GPR surrogate model with my own (which has a built-in uncertainty output).

@sgbaird (Contributor, Author) commented Dec 7, 2021

For using my own surrogate model, should I be looking at https://ax.dev/tutorials/modular_botax.html?

EDIT: and if so, can this be passed into a GenerationStep model kwarg?
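
(If it helps clarify the question, I'm imagining something along these lines, based on that tutorial, where SingleTaskGP stands in for a custom BoTorch model class:)

from ax.modelbridge.generation_strategy import GenerationStep
from ax.modelbridge.registry import Models
from ax.models.torch.botorch_modular.surrogate import Surrogate
from botorch.acquisition import qExpectedImprovement
from botorch.models import SingleTaskGP  # placeholder for a custom botorch Model subclass

step = GenerationStep(
    model=Models.BOTORCH_MODULAR,
    num_trials=-1,
    model_kwargs={
        "surrogate": Surrogate(SingleTaskGP),
        "botorch_acqf_class": qExpectedImprovement,
    },
)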

@lena-kashtelyan (Contributor) commented:

For using my own surrogate model, should I be looking at https://ax.dev/tutorials/modular_botax.html?

I think that's likely, but will let @Balandat chime in on that too.

@sgbaird (Contributor, Author) commented Dec 8, 2021

I think we're getting close. For the "limit the total number of components" constraint (still outstanding), we can at least temporarily prune off one idea: using outcome_constraint as a hacky workaround for being unable to use non-linear parameter constraints (#745).

@sgbaird (Contributor, Author) commented Dec 8, 2021

I dug a bit deeper into how to swap out a built-in Surrogate model with a custom model (#748), but it seems a custom model would need to be able to sample from its posterior distribution. The extent of UQ for my custom model (CrabNet) seems to be just a scalar uncertainty associated with the prediction. I figured that discussion about using a custom model might be best on a separate issue (#748). If it turns out to help the composition-based optimization discussed in this issue, then I can bring the main takeaway back here.

@sgbaird (Contributor, Author) commented Dec 10, 2021

Here is a roadmap of some outstanding items as well as other features/topics that came up along the way. I'll plan on updating these later if the status changes (along with an "EDIT" keyword). If I may, I'm also cc'ing the people who have been participating or tagged in these discussions to bring attention to the "birds-eye" issue. Thank you to everyone who has helped me out so far. Your responses have been incredibly useful and informative. Personally, it's been very rewarding for me to get a better understanding of BO through the lens of a practical use case while using what I consider to be an excellent platform for it.

@lena-kashtelyan @Balandat @bernardbeckerman @eytan @saitcakmak @Ryan-Rhys @qingfeng10 @dme65 (@ramseyissa who is lead on the project)

  • n_components < max_components constraint
  • "Use my own surrogate model"
  • Adding multiple outcome measurements for fixed parameters as separate trials
  • Incorporate input/parameter uncertainty
    For example, when you mix a bunch of components together, there is some uncertainty in the final composition of each component (e.g. instrument resolution, losses during synthesis).
  • Multi-objective optimization (e.g. strength, hardness)
    While I haven't mentioned this one yet, we are interested in converting this to a MOO scheme. Note: this is different from what I mentioned above, where I suggested using MOO to implement the n_components < max_components constraint. In this case, the MOO is for "real" outcomes (e.g. strength, hardness).
      • ✔️/❔ Following the MOO tutorial should be fairly straightforward, but I'm not sure if MOO will be incompatible with any of the above-mentioned features.

@lena-kashtelyan (Contributor) commented:

@sgbaird, thank you for this summary, it's very helpful!

@sgbaird (Contributor, Author) commented Dec 23, 2021

Great meeting with @lena-kashtelyan, @Balandat, and @dme65. No need to respond until after the break. I thought it might be good to try a few visualizations of the one-hot encoding scheme. There are four compositions (x, y, z, and color, where color = 1 - x - y - z). Then for different levels of max_components:

  • max_components = 4: a simplex
  • max_components = 3: a hollow simplex
  • max_components = 2: a wireframe simplex
  • max_components = 1: simplex vertices

Takeaway

The constraint that n_components < max_components progressively removes higher dimensional facets of a simplex, leaving only the lower-dimensional constructs (e.g. "edges"). These facets are generally connected to each other, but the feasible region is similar to a hollow shape or a wireframe. It should be straightforward to determine the distance to the feasible region, but this also requires locating within the code where the feasibility weighting takes place.

Maybe here (except it doesn't seem to be called when using gen_next_trial; it didn't stop on a breakpoint):

Ax/ax/core/utils.py

Lines 125 to 155 in 3f10ec9

def best_feasible_objective( # pragma: no cover
    optimization_config: OptimizationConfig, values: Dict[str, np.ndarray]
) -> np.ndarray:
    """Compute the best feasible objective value found by each iteration.

    Args:
        optimization_config: Optimization config.
        values: Dictionary from metric name to array of value at each
            iteration. If optimization config contains outcome constraints, values
            for them must be present in `values`.

    Returns: Array of cumulative best feasible value.
    """
    # Get objective at each iteration
    objective = optimization_config.objective
    f = values[objective.metric.name]
    # Set infeasible points to have infinitely bad values
    infeas_val = np.Inf if objective.minimize else -np.Inf
    for oc in optimization_config.outcome_constraints:
        if oc.relative:
            raise ValueError(  # pragma: no cover
                "Benchmark aggregation does not support relative constraints"
            )
        g = values[oc.metric.name]
        feas = g <= oc.bound if oc.op == ComparisonOp.LEQ else g >= oc.bound
        f[~feas] = infeas_val
    # Get cumulative best
    minimize = objective.minimize
    accumulate = np.minimum.accumulate if minimize else np.maximum.accumulate
    return accumulate(f)

@lena-kashtelyan (Contributor) commented:

I'll close this issue in favor of #769 for now, since that issue will host discussion on a more narrow part of this work that we intend to focus on first. We can reopen this when/if there is a need; feel free to do so at any time, @sgbaird!

@sgbaird (Contributor, Author) commented Mar 29, 2022

I realized that there is an issue with using Sobol sampling and the one-fewer-degrees-of-freedom style compositional constraint, and wanted to make a record of it in this issue.

Take the case of:

x1+x2+x3 == 1

reparameterized to:

x1+x2 <= 1

where x3 == 1 - x1 - x2. When the Sobol candidates are generated, the sampler is completely unaware of the original representation involving x3. If the bounds on x1 and x2 are [0,1], then it will preferentially favor exploring higher values of x1 and x2 relative to the unseen x3, if I'm not mistaken.
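
A quick way to probe this empirically (a sketch using SciPy's Sobol sampler rather than Ax internals) is to mimic the rejection step and compare the marginals of x1, x2, and the recovered x3:

import numpy as np
from scipy.stats import qmc

sobol = qmc.Sobol(d=2, scramble=True, seed=0)
pts = sobol.random(2**14)              # Sobol points in [0, 1]^2
keep = pts.sum(axis=1) <= 1.0          # rejection step mimicking the inequality constraint
x12 = pts[keep]
x3 = 1.0 - x12.sum(axis=1)             # recover the unseen component
print(x12.mean(axis=0), x3.mean())     # compare empirical means of x1, x2, and x3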

I'm guessing that to work around this, I need to manually generate the Sobol points using e.g. ax.modelbridge.factory.get_sobol with the original parameterization, remove x3, and then attach them as if they were manually generated trials (with the caveat that they will appear as MANUAL trials rather than SOBOL trials). Does that workflow seem reasonable? Let me know if you have thoughts on a better way. (EDIT: see comment below)

In the context of #769, this isn't an issue if I pass an equality constraint (i.e. the original constraint) directly to botorch rather than using SumConstraint #769 (comment), subject to the caveat mentioned that it doesn't play well with transforms (but shouldn't really matter if the search space is [0, 1]^d anyway).

@sgbaird (Contributor, Author) commented Apr 2, 2022

Actually, passing a linear equality constraint may be the only way to prevent bias in the Sobol sampling relative to the "true" search space, because the Sobol sampling still needs to respect the constraints (i.e. it has to see some constraint). If I pass a linear inequality constraint rather than a linear equality constraint, the bias will occur.
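
For reference, here is the sum-to-1 constraint written in BoTorch's (indices, coefficients, rhs) format, which optimize_acqf accepts via equality_constraints (a sketch assuming n components on a [0, 1]^n search space):

import torch

n = 5
equality_constraints = [
    (torch.arange(n), torch.ones(n, dtype=torch.double), 1.0)  # sum_i x_i == 1
]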

@sgbaird (Contributor, Author) commented Dec 3, 2022

@jduerholt recently suggested the use of optimize_acqf_mixed within BoTorch, based on the combinations from an NchooseK-style constraint (i.e. only up to a certain number of elements are allowed to be active for a given experiment). In my case, 20 choose 6 = 38,760 combinations might be unwieldy without HPC resources; however, this suggestion seems like the most straightforward and rigorous option for cases where it is computationally tractable.

This seems like a great option to compare against the nonlinear nchoosek constraint approximation #769. Specifically in https://github.com/sparks-baird/optimization-benchmark (sparks-baird/optimization-benchmark#7)
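
For reference, a toy-scale sketch of the optimize_acqf_mixed mechanics (5 components, at most 2 active, made-up training data; the sum-to-1 constraint is omitted for brevity, and this is not a drop-in solution for the 20-choose-6 case):

from itertools import combinations

import torch
from botorch.acquisition import ExpectedImprovement
from botorch.fit import fit_gpytorch_mll
from botorch.models import SingleTaskGP
from botorch.optim import optimize_acqf_mixed
from gpytorch.mlls import ExactMarginalLogLikelihood

n, k = 5, 2
train_X = torch.rand(10, n, dtype=torch.double)
train_X = train_X / train_X.sum(dim=-1, keepdim=True)  # toy compositions summing to 1
train_Y = 100 * train_X[:, :1]                          # made-up "strength" values

model = SingleTaskGP(train_X, train_Y)
fit_gpytorch_mll(ExactMarginalLogLikelihood(model.likelihood, model))
acqf = ExpectedImprovement(model, best_f=train_Y.max())

bounds = torch.stack([torch.zeros(n), torch.ones(n)]).double()
# one dict per allowed combination: fix every *inactive* component to 0.0
fixed_features_list = [
    {i: 0.0 for i in range(n) if i not in active}
    for active in combinations(range(n), k)
]
candidate, value = optimize_acqf_mixed(
    acq_function=acqf,
    bounds=bounds,
    q=1,
    num_restarts=4,
    raw_samples=64,
    fixed_features_list=fixed_features_list,
)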
