-
-
Notifications
You must be signed in to change notification settings - Fork 47
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use natural sorting in imread(...)
when globbing multiple files
#265
Conversation
Can one of the admins verify this patch? |
add to allowlist |
Do you anticipate needing to use a sort function besides natural sort? I'd be happy to straight up replace I'm less keen to change the API to allow users to pass in arbitrary sorting functions, although I see why you suggest it. Instead, I think it would be cleaner to allow the function to accept a list of strings or Paths. Then the user could modify this list in any arbitrary way before providing it to dask-image, and it's much easier for the user to inspect. Ideally, we'd do both these things. Then there would be a sensible default with glob and natsort, plus a more flexible alternative if necessary. |
Oh, even easier! I didn't realize |
I agree that natural sort would be the most sensible default. Yes, |
How about you edit this PR to remove the API change, and instead do a straight replacement of the current behaviour with The current behaviour is clearly a bug, and it is very unlikely anybody is relying on it for legacy code. I'd also be happy to see a PR allowing users to pass in a list of filenames, but that should be a separate PR. (You're welcome to try making that too if you like, but I also don't expect you to. Your problem will likely be solved after this first PR and you'll want to get back to what you were doing earlier)
I have seen that documentation in the source code, perhaps it's time to go back and see what other cool stuff I missed! |
I made natural sorting the default and removed the ability to pass in a custom sort function as expected. As side notes (probably better placed in two separate issues):
|
sortfunc
argument to imread(...)
to customize glob orderimread(...)
when globbing multiple files
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @VolkerH, this is a useful improvement
Yes, this is perhaps better discussed in #229 Minor differences include:
|
dask_image.imread.imread
can include glob strings in the filename argument.Currently, the results of the glob operation are sorted using
sorted
, a behviour that is not described in the doctring.sorted
sorts alphabetically which causes files such as10.tif
,9.tif
to be read in this order, which is counter-intuitive.A common strategy adopted by users to circumenvent such file name problems is to zero-pad all numbers in a string.
This PR adds a
sortfunc
argument to toimread
, which allows to pass in a custom sort function. A use case for this is to pass in a function that performs natural sort, which will sort strings of numbers numerically. Such a use-case is shown in the included test.The docstring for imread has been updated to include the new argument, which also remedies the current undocumented behaviour.
Files have been checked with
flake8
.