-
Notifications
You must be signed in to change notification settings - Fork 258
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ENH: Accept pathlib.Path objects where filenames are accepted #610
Conversation
For python >= 3.6, objects representing paths now have the option to have a dunder `__fspath__` (See PEP 519 for more info) to return the string (or bytes) representation of the path. Pathlib on 3.6 includes this method so we should check for it first. Then, for pathlib objects from the py 3.4-3.5x, `__fspath__` is not available, so we dip into the py3k module to get the current python version's (2 or 3) string method (`unicode` in this case). So overall we're allowing filename to be a path object, but ensuring that it will be converted to the string/bytes representation of the path before passing it on to `image_klass.from_filename` My only reservations is that `nib.load` lowest entry point for this type of handling, because if some one really wants to call `from_filename` from an image_klass directly, they won't have this pathlib compatibility- however I think that `nib.load` is a very common entry point, which justifies the placement.
☔ The latest upstream changes (presumably #611) made this pull request unmergeable. Please resolve the merge conflicts. |
Why is the PR unmergeable? |
It cannot be merged because there are conflicts with the master branch - see the web interface for this PR for links. |
@CRiddler, do you think you can fix the conflicts? |
Codecov Report
@@ Coverage Diff @@
## master #610 +/- ##
=========================================
+ Coverage 90.1% 90.31% +0.2%
=========================================
Files 96 96
Lines 11914 12171 +257
Branches 2125 2127 +2
=========================================
+ Hits 10735 10992 +257
Misses 834 834
Partials 345 345
Continue to review full report at Codecov.
|
The conflicts have been resolved, however as stated earlier- this approach to resolving pathlib objects is not final nor foolproof. Essentially, we are checking if the |
I'm setting this milestone to 3.0. Once we're only dealing with Python 3.5+, that reduces the range of behaviors we need to accommodate. If you want to try for 2.5 and support Python 2's |
Hi @CRiddler, @fepegar. By pushing off to Python3-only, I think we can be a lot dumber about this. I would say the logic can simply be: filename = pathlib.Path(filename) If we place this at the head of Sound like a reasonable approach? We would also want to add some tests where we pass in valid and invalid strings, paths, path-like objects, and non-path-likes. |
Sounds reasonable to me. |
Hello @CRiddler, Thank you for updating! Cheers! There are no style issues detected in this Pull Request. 🍻 To test for issues locally, Comment last updated at 2019-10-23 15:31:04 UTC |
Hi @CRiddler, we're now requiring Python 3.5.1+, so we can resume work on this with simpler assumptions about the sort of objects we're going to be passed. I merged in master to resolve conflicts and get you going, but let me know how you want to proceed here. |
Hi @effigies, if we do want to convert everything into |
Hmm. I haven't really scoped out where all we will need to make modifications to use But if we want to take the Pandas approach, the entry-points are still the first step, so I don't see these as competing approaches. If anything the Pandas-like one would be a good first pass, and then we could evaluate if we want to take the plunge with internal As a note, it looks like Pandas has simplified their check: https://github.com/pandas-dev/pandas/blob/325dd686de1589c17731cf93b649ed5ccb5a99b4/pandas/io/common.py#L131-L160 |
I agree in that coercing everything to strings at the entry points will be the easiest way to go. The updated pandas method looks like it'll work perfectly for us since I will update the branch with a function similar to |
Cool. We have external licenses in COPYING, IIRC, so that and a link to the original code (with specific revision) should suffice. Our licenses are compatible. |
@CRiddler Just a bump on this one. Let me know if there's anything I can do to help. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is looking good. Some suggestions on paths to guard.
And we should also verify that we can pass pathlib.Path
s to each method that's guarded. One good place would be here:
nibabel/nibabel/tests/test_image_api.py
Lines 146 to 159 in 1d91b72
fname = 'an_image' + self.standard_extension | |
img.set_filename(fname) | |
assert_equal(img.get_filename(), fname) | |
assert_equal(img.file_map['image'].filename, fname) | |
# to_ / from_ filename | |
fname = 'another_image' + self.standard_extension | |
with InTemporaryDirectory(): | |
img.to_filename(fname) | |
rt_img = img.__class__.from_filename(fname) | |
assert_array_equal(img.shape, rt_img.shape) | |
assert_almost_equal(img.get_fdata(), rt_img.get_fdata()) | |
# get_data will be deprecated | |
assert_almost_equal(img.get_data(), rt_img.get_data()) | |
del rt_img # to allow windows to delete the directory |
I would just make it loop, e.g.:
fname = 'an_image' + self.standard_extension
for path in (fname, Path(fname)):
...
And here, as well:
nibabel/nibabel/tests/test_image_load_save.py
Lines 256 to 257 in 1d91b72
nils.save(img, fname) | |
rt_img = nils.load(fname) |
And also in https://github.com/nipy/nibabel/blob/master/nibabel/tests/test_loadsave.py.
Finally, all functions/methods that accept os.PathLike
objects should have their docstrings updated.
@CRiddler Just a bump on this. Let me know if there's anything I can help with. I'm going to aim to feature freeze 3.0 on October 28 (three weeks from Monday) so we can have a decent RC period. |
Yep! This is still on my radar, classes started for me this past week so I've been a little busy. This is the first on my to-do list once I have a bit of free time, so I should have it done over the weekend. |
I mean... also enjoy your weekend. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking good. Some minor fixes and cleanups.
Also from one of the tests:
======================================================================
ERROR: autogenerated test from validate_filenames
----------------------------------------------------------------------
Traceback (most recent call last):
File "c:\hostedtoolcache\windows\python\3.5.4\x64\lib\site-packages\nose\case.py", line 198, in runTest
self.test(*self.arg)
File "c:\hostedtoolcache\windows\python\3.5.4\x64\lib\site-packages\nibabel\tests\test_api_validators.py", line 20, in meth
validator(self, imaker, params)
File "c:\hostedtoolcache\windows\python\3.5.4\x64\lib\site-packages\nibabel\tests\test_image_api.py", line 146, in validate_filenames
img.set_filename(path)
File "c:\hostedtoolcache\windows\python\3.5.4\x64\lib\site-packages\nibabel\filebasedimages.py", line 255, in set_filename
self.file_map = self.__class__.filespec_to_file_map(filename)
File "c:\hostedtoolcache\windows\python\3.5.4\x64\lib\site-packages\nibabel\freesurfer\mghformat.py", line 533, in filespec_to_file_map
if splitext(filespec)[1].lower() == '.mgz':
File "c:\hostedtoolcache\windows\python\3.5.4\x64\lib\ntpath.py", line 224, in splitext
return genericpath._splitext(p, '\\', '/', '.')
File "c:\hostedtoolcache\windows\python\3.5.4\x64\lib\genericpath.py", line 118, in _splitext
sepIndex = p.rfind(sep)
AttributeError: 'WindowsPath' object has no attribute 'rfind'
It looks like MGHImage
has its own implementation of filespec_to_file_map
that needs to be updated.
Hmm. Looks like your history got a bit messed up there. I'd try rebasing onto fbaeacc. Should be pretty clean, as it's the merge commit that seems to be causing damage. |
Hopefully that fixed, I always struggle when it comes to wrestling with git history. If it needs some more reworking, I may need some hand-holding to resolve the history issue. |
Nope. So the easy way to do this is just: git rebase -i fbaeacc That will open an editor, probably
That should be fine, so save and exit. Now you'll see:
If you run
So edit
You may be dropped into an editor again, giving you a chance to modify the commit message. You can just save and exit, to leave it unchanged. Then you might see:
In this case, there's a commit that had no effect. So you can just
When you've iterated through all of the necessary changes:
Since your GitHub branch has a lot of old stuff in it, instead of:
I would suggest just comparing to the commit you rebased against:
If that looks reasonable, go ahead and force-push:
|
6374ede
to
e98dbfc
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Two remaining comments that may have gotten lost in the mix.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are a couple style issues (see #610 (comment)). And one small fix:
Great! Thanks for this. |
Yay! Thank you guys for having worked on this. |
For python >= 3.6, objects representing paths now have the option to have a dunder
__fspath__
(See PEP 519 for more info) to return the string (or bytes) representation of the path. Pathlib on 3.6 includes this method so we should check for it first. Then, for pathlib objects from the py 3.4-3.5x,__fspath__
is not available, so we dip into the py3k module to get the current python version's (2 or 3) string method (unicode
in this case). So overall we're allowing filename to be a path object, but ensuring that it will be converted to the string/bytes representation of the path before passing it on toimage_klass.from_filename
My only reservations is that
nib.load
lowest entry point for this type of handling, because if some one really wants to callfrom_filename
from an image_klass directly, they won't have this pathlib compatibility- however I think thatnib.load
is a very common entry point, which justifies the placement.