I think there is an issue with min_grid() when we have postprocessors being tuned (found while working on #974).
Reminder: min_grid() is about submodels and figures out the tuning parameter combinations that actually need to be fit. For example, if a boosted tree evaluates trees = c(1, 5, 10), we only need to fit the model with trees = 10 and then use that model to predict the results with the smaller numbers of trees.
If there are preprocessing tuning parameters, it finds the combinations of preprocessing and model tuning parameters that should be fit.
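As a rough illustration of that reduction, here is a base R sketch. It is not how min_grid() is implemented, and the grid values are made up. Within each value of the non-submodel parameter, only the largest `trees` needs a real fit:

```r
# Hypothetical grid: min_n is a regular parameter, trees is the submodel parameter
grid <- expand.grid(min_n = c(2, 10), trees = c(1, 5, 10))

# For each min_n, keep only the largest number of trees; the smaller
# values can be predicted from that single fit
to_fit <- aggregate(trees ~ min_n, data = grid, FUN = max)
to_fit
```

Six candidates collapse to two fits here, one per value of `min_n`.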
Since we are now adding postprocessors, we are treating those as preprocessing parameters when we find the unique values of the tuning parameters that should be fit.
Here’s an example with no preprocessor, two model parameters, and one postprocessor parameter being tuned.
Rows 2 and 3 should not be split out since they have the same min_n; the tibble should have three rows since we only need to fit three preprocessing and model combinations.
It makes four rows since it takes lower_limit into account. However, that parameter is used after the model fit, so it should not affect what happens before the model.
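To make the row counts concrete, here is a small base R sketch with a hypothetical grid. The values are invented for illustration and are not the ones from the original example. Grouping on `lower_limit` as if it were a preprocessing parameter splits the two candidates that share a `min_n`; grouping on `min_n` alone does not:

```r
# Hypothetical candidates: trees is the submodel parameter,
# lower_limit belongs to the postprocessor
grid <- data.frame(
  min_n       = c(2, 10, 10, 20),
  trees       = c(10, 5, 10, 1),
  lower_limit = c(0, 0, 0.5, 0.5)
)

# Current behavior (sketched): lower_limit is treated like a preprocessing parameter
with_post <- aggregate(trees ~ min_n + lower_limit, data = grid, FUN = max)

# Desired behavior (sketched): only parameters used up to the model fit matter
without_post <- aggregate(trees ~ min_n, data = grid, FUN = max)

nrow(with_post)
nrow(without_post)
```

The first grouping yields four rows and the second yields three, which matches the discrepancy described above.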
What I’m not sure about is what should happen instead, i.e., where to put lower_limit? Do we need that column in the results of min_grid() at all? etc...
More to come.
The solution is to remove the postprocessing parameter columns from the grid, determine the distinct candidates in the remainder, then run min_grid() on this partial grid.
The postprocessing parameters go into their respective sub-tibbles prior to this so they are not affected.
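A minimal base R sketch of that approach follows. It assumes a made-up grid and uses a group-wise maximum as a stand-in for the real min_grid() call; `post_cols` is a hypothetical name for the postprocessing parameter columns:

```r
# Hypothetical grid with one postprocessing parameter (lower_limit)
grid <- data.frame(
  min_n       = c(2, 10, 10, 20),
  trees       = c(10, 5, 5, 1),
  lower_limit = c(0, 0, 0.5, 0.5)
)
post_cols <- "lower_limit"  # assumed to be known from the postprocessor

# 1. Remove the postprocessing columns
partial <- grid[, setdiff(names(grid), post_cols), drop = FALSE]

# 2. Keep the distinct candidates in what remains
partial <- unique(partial)

# 3. Reduce the partial grid to the fits that are needed
#    (stand-in for min_grid(): largest trees per min_n)
to_fit <- aggregate(trees ~ min_n, data = partial, FUN = max)
to_fit
```

Dropping `lower_limit` makes the two `min_n = 10` candidates identical, so three fits remain.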
This issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with a reprex: https://reprex.tidyverse.org) and link to this issue.
Created on 2024-12-18 with reprex v2.1.0