-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix Isolation Forest grid search test and API [nocheck] #5660
base: master
Are you sure you want to change the base?
Conversation
304f579
to
12dd641
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please do not introduce a flag is_supervised. Note: API changes like this one should be discussed in #h2o-3-api-updates.
I am not introducing new one, flag is_supervised is already there, I am just using it properly |
@valenad1 I am sorry - my bad - I misunderstood the PR. I am surprised to see the option there - I don't think we should have it Would there be a solution that doesn't utilize this flag and works out of the box? |
I'll figure something out :) |
@@ -21,7 +21,9 @@ test.grid.resume <- function() { | |||
x = 1:4, | |||
y = 5, | |||
training_frame = iris.hex, | |||
hyper_params = hyper_parameters | |||
is_supervised = TRUE, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need to set the flag here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I made IF unsupervised by default. This test is using the y parameter and then it should be supervised.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am very surprised this works - I thought IF can only have a labeled validation frame, not the training frame. Hmm.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't check the output of the grid search, since I have only used API I didn't think about that. Good point 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am trying to say that I didn't change the test, I just focused that the same code will be used with this flag. I will check what is the actual output of the grid search :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think given that the option is_supervised
already exists and enables the functionality to work - we should get this fix in.
@valenad1 this looks like this was a code change (in models.R) and not just a test change - in this case we should have a jira
Agree, I will make a JIRA So I merge this fix and after implement the change that will drop is_supervised parameter. Is that correct? @michalkurka |
If I read this PR correctly, the option didn't work before, or at least not in recent years. If so - it was unused - and thus could be gracefully deprecated (= eg. warning in 3.34 and then removal in 3.36) |
I agree.
Ok, I'll try to use a different way (somehow it works for python without this parameter) and make this parameter depracated. |
In my opinion grid search for Isolation Forest should be unsupervised by default and supervised on option.
12dd641
to
f94b440
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should not expose is_supervised flag
I have noticed some inconsistency in IF grid search test for R.
The R API treat IF as supervised algorithm and flag is_supervised isn't working properly. This PR provide fix for is_supervised flag and change that IF will be unsupervised by default.
I have created second PR in case that we cannot suddenly change IF to unsupervised for grid search. This PR (#5661) only fix is_supervised flag and add test that IF can be used with is_supervised = FALSE flag.