Listing calls & URL size limit of OpenML server #468
Comments
Just realized that there is a straightforward fix for this, i.e., also allowing POST variables for the listing functions (although POST implies authentication). @mfeurer what do you think?
Is the maximum URL length documented somewhere? Also, what is the server response? If the server response is something we can parse and display nicely, is there a reason to do anything beyond that? Also, how would you handle this other than raising an exception? Do you want to "parse" the query on the Python side to chunk it? We could go back to POST again, but we reverted that a while ago to avoid having to authenticate, so I don't think that's a great idea. Maybe we can catch a server exception 'URL too long' and then retry with a POST request?
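The fallback proposed here (catch the server's "URL too long" error and retry as POST) could be sketched transport-agnostically. The helper below takes the GET/POST senders as injected callables; this structure is illustrative, not the actual openml-python connector API:

```python
def request_with_fallback(send_get, send_post, url, params):
    """Try a GET request first; if the server answers with HTTP 414
    (URI Too Long), retry the same query as a POST request.

    send_get and send_post are hypothetical injected callables that
    return (status_code, body); in practice they would wrap
    requests.get / requests.post against the OpenML server.
    """
    status, body = send_get(url, params)
    if status == 414:
        # The URL was rejected as too long: resend the same filters
        # in the request body via POST instead.
        status, body = send_post(url, params)
    return status, body
```

Injecting the senders keeps the retry logic testable without a live server; the real limit would still have to be confirmed against the server's actual error response.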
I think this issue will mostly occur with setups, as these cannot be filtered by other means (tasks, runs, etc.). I added a function to the openml contrib library:
This can be reproduced with:

```python
import openml

openml.datasets.list_datasets(
    data_id=list(range(10000))
)
```

@janvanrijn we can add a helper that issues either a GET or a POST request depending on the length of the URI. However, to do so we need to know the maximum length the server allows.
The OpenML server has a URL size limit, which prevents us from doing listing calls with too many filters. For example, when I want to filter on 1000 task IDs, the call is likely to fail. It would be great if we could somehow automatically detect and catch this in the list-all function. However, this is a tough problem, as sometimes there are multiple filters (e.g., per task and per setup).
Extending this limit is not a legitimate fix, as the problem will recur with even larger filters.
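One client-side workaround discussed in this thread is chunking: split the ID filter into batches small enough that each listing URL stays under the limit, then merge the results. A sketch, where the chunk size of 500 is an arbitrary assumption:

```python
def chunked(ids, chunk_size=500):
    """Yield fixed-size slices of a long list of filter IDs, so each
    listing call keeps its URL below the server's size limit.
    chunk_size=500 is an assumed value, not a known safe bound."""
    for start in range(0, len(ids), chunk_size):
        yield ids[start:start + chunk_size]

# Hypothetical usage, merging the per-chunk results:
# datasets = {}
# for chunk in chunked(list(range(10000))):
#     datasets.update(openml.datasets.list_datasets(data_id=chunk))
```

As noted above, this gets harder when several filters (e.g., per task and per setup) contribute to the URL length at once, since only one of them can be chunked naively.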