Curious behavior of model.set_data() and control_loop.get_next_points() #313
Comments
For those encountering the same issue, here's a functional work-around:
Hi @ekalosak, yes, the model is updated with the results that are stored in the loop state, at least if you use the appropriate model updater. Hence your workaround works in that case. For another (custom) model updater that does not use the loop state object, it may not work. I am curious: where do you need that functionality, i.e., replacing the data of the model? I am asking because the active learning loop is usually used precisely to collect the data. If you replace it in the middle, you could have just started with that other data.
To address your curiosity: consider an experiment in which we have imprecise knowledge about the allowable discrete elements of the objective function's domain. What's more, we don't get the precise point in the domain associated with a particular experiment until some time after the primary experiment is performed. An example might be a combinatorial materials science application where certain a priori unknown configurations of material properties are impossible to fabricate, some desired properties are only achievable approximately, etc. Say our goal is to improve conductive properties: conductivity is easy to test, so we get our measurements quickly post-fabrication. However, measuring the actually fabricated properties is difficult, takes time, and comes in batches, because we send samples in batches to an external lab. Note that these material properties are part of the design space, not the objective function's co-domain. It might be attractive to suggest multi-fidelity optimization, but doubling the number of free parameters seems problematic when we're shooting for sample efficiency.

tl;dr: the X data are generated with incomplete information about the allowable domain, so it's useful to be able to adjust the model data as we go, when more precise information about the actually implemented X data becomes available.
To not get too off-track: does it make sense to support this directly? Or, perhaps, the best design is just to isolate the data modification to the `loop_state` results and let `control_loop.get_next_points()` perform the `model.set_data()` call.
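That design can be illustrated with toy stand-ins. The `ToyModel` and `ToyLoop` classes below are hypothetical, not Emukit's actual `IModel`/`OuterLoop`; they only mimic the model-updater mechanism described in this thread, to show why a correction written into the loop state survives `get_next_points()` while a direct `set_data()` call does not.

```python
class ToyModel:
    """Hypothetical stand-in for a model with set_data()."""
    def __init__(self):
        self.X, self.Y = [], []

    def set_data(self, X, Y):
        self.X, self.Y = list(X), list(Y)


class ToyLoop:
    """Hypothetical stand-in for an outer loop whose model updater
    rebuilds the model's data from the loop state on every call."""
    def __init__(self, model):
        self.model = model
        self.state_X, self.state_Y = [], []

    def get_next_points(self, results):
        for x, y in results:                 # results: list of (x, y) pairs
            self.state_X.append(x)
            self.state_Y.append(y)
        # Model updater step: overwrite the model data with loop-state data.
        self.model.set_data(self.state_X, self.state_Y)


model = ToyModel()
loop = ToyLoop(model)
loop.get_next_points([(0.0, 1.0)])

# A direct set_data() call is clobbered by the next loop iteration:
model.set_data([9.9], [1.0])
loop.get_next_points([(1.0, 2.0)])
print(model.X)        # [0.0, 1.0]; the 9.9 correction is gone

# Correcting the loop state instead survives the update:
loop.state_X[0] = 5.0
loop.get_next_points([(2.0, 3.0)])
print(model.X)        # [5.0, 1.0, 2.0]
```

In this toy version, "isolating the modification to the loop-state results" is just editing `loop.state_X` before the next `get_next_points()` call, and the updater propagates the change to the model for free.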
Hi Emukit team,
First, thank you for your work on this package - it's a joy to use.
I'm writing with a question about some curious behavior I've observed when using the Bayesian optimization control loop. When I use the `IModel.set_data(X, Y)` class method to alter the model data, followed by `OuterLoop.get_next_points(results)`, the model's data is reset to what it was before the `set_data()` call, with an extra row representing the contents of the `results` object.

The expected behavior is to see, after the `OuterLoop.get_next_points(results)` call, the model data constituted by the `X` passed to `set_data()` concatenated with the contents of `results`.

Here's a minimal example that reproduces the behavior:
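The original snippet did not survive the page extraction. As a stand-in, here is a toy sketch of the reported behavior, using hypothetical `ToyModel` and `ToyOuterLoop` classes that mimic the model-updater mechanism rather than Emukit's real API:

```python
class ToyModel:
    """Hypothetical stand-in for an IModel with set_data()."""
    def __init__(self):
        self.X, self.Y = [], []

    def set_data(self, X, Y):
        self.X, self.Y = list(X), list(Y)


class ToyOuterLoop:
    """Hypothetical stand-in for OuterLoop: its model updater rebuilds
    the model's data from the loop state on every get_next_points()."""
    def __init__(self, model):
        self.model = model
        self.state_X, self.state_Y = [], []

    def get_next_points(self, results):
        for x, y in results:                 # results: list of (x, y) pairs
            self.state_X.append(x)
            self.state_Y.append(y)
        # Model updater step: overwrite the model data with loop-state data.
        self.model.set_data(self.state_X, self.state_Y)


model = ToyModel()
loop = ToyOuterLoop(model)
loop.get_next_points([(0.0, 1.0)])   # model.X is now [0.0]

model.set_data([9.9], [1.0])         # try to replace the model data
loop.get_next_points([(1.0, 2.0)])

# Expected model.X: [9.9, 1.0]. Observed: [0.0, 1.0], i.e. the data
# set via set_data() was reset, with one extra row from `results`.
print(model.X)        # [0.0, 1.0]
```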