Allow duplicate objects in Pipeline and ColumnTransformer #638
Comments
This is a limitation of the OpenML Flow definition, which dates from the early days of OpenML (2012). There is currently no uniform way to specify which instance of a flow a hyperparameter setting in a run belongs to, so a complex flow that contains multiple instantiations of the same subflow cannot be reproduced reliably. Improving this on the server side has been on the agenda, but no one has started programming or testing alternatives.
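To make the ambiguity concrete, here is a minimal sketch in plain Python. It is not OpenML's actual data model: the function names `record_settings` and `duplicated_types`, the step names, and the idea of keying settings by component type are all assumptions chosen for illustration. It only shows why settings stored per flow type, rather than per instance, cannot distinguish two steps of the same type.

```python
# Hypothetical illustration only: this is NOT how OpenML stores runs.
from collections import Counter


def record_settings(steps):
    """Store hyperparameters keyed by component type, not by step instance."""
    settings = {}
    for _step_name, component_type, params in steps:
        # A second component of the same type silently overwrites the first.
        settings[component_type] = params
    return settings


def duplicated_types(steps):
    """Return the component types that occur more than once."""
    counts = Counter(component_type for _, component_type, _ in steps)
    return sorted(t for t, n in counts.items() if n > 1)


# Two imputers of the same type with different strategies (names are made up).
steps = [
    ("impute_numeric", "SimpleImputer", {"strategy": "mean"}),
    ("impute_categorical", "SimpleImputer", {"strategy": "most_frequent"}),
    ("classifier", "DecisionTreeClassifier", {"max_depth": 3}),
]

settings = record_settings(steps)
print(settings)
print(duplicated_types(steps))
```

With this keying, the setting of the first imputer is lost, so the run could not be rebuilt from the stored settings. Rejecting such flows up front, as the package currently does, avoids recording a run that cannot be reproduced.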
Thanks, that clarifies a lot. Does it make sense to leave this issue open, given that it will go unresolved? Or should I close it, since 'we' on the package side cannot fix this until the definitions are updated?
I think closing and referencing the corresponding issue on the OpenML issue tracker is the way to go here: openml/OpenML#340
Reopening to show that this is a known issue.
Marked it as wontfix because we won't (can't) fix this until we rework the flow definition.
Currently neither `Pipeline` nor `ColumnTransformer` may contain two different steps with the same type of transformer. I think this should be allowed.
Consider a scenario where I have a dataset with numeric and categorical values (e.g. features 1 and 2, respectively), and wish to impute them with different imputation strategies. I would use the following code (with openml on head of `develop`):
I would assume this should work, but it raises the following error:
Similarly an error is raised if a pipeline contains two steps of the same type.
What is the reason this error is raised? Is it simply not yet supported? Or should I be ordering my workflow differently, and if so, how?