Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Accurate overloads for ZipFile.__init__ #12119

Merged
merged 19 commits into from
Aug 11, 2024

Conversation

max-muoto
Copy link
Contributor

@max-muoto max-muoto commented Jun 8, 2024

Add overload cases based on each mode to eliminate false positives for types that don't fully implement IO[bytes], per #10880

@max-muoto max-muoto marked this pull request as ready for review June 8, 2024 21:39
@max-muoto max-muoto changed the title chore: Accurate overloads for ZipFile.__init__ Accurate overloads for ZipFile.__init__ Jun 8, 2024

This comment has been minimized.

@max-muoto
Copy link
Contributor Author

max-muoto commented Jun 8, 2024

Run-through of MyPy primer here:

Pip:

Removes false positive as the object they're passing in is readable, but doesn't implement IO[bytes]. They put a typing ignore, which is why this adds an additional warning.

werkzeug:

Removes false positive in test where a writable ResponseStream object is being passed into ZipFile.__init__, as the mode is set to "w".

Copy link
Member

@sobolevn sobolevn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would consider adding test cases for this overload. It is rather complex.

@max-muoto
Copy link
Contributor Author

I would consider adding test cases for this overload. It is rather complex.

Sounds good, can handle that tomorrow!

@max-muoto
Copy link
Contributor Author

I would consider adding test cases for this overload. It is rather complex.

Added tests, found some minor issues which I fixed as a result: 4cf07ab

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

@max-muoto
Copy link
Contributor Author

@JelleZijlstra Just wanted to follow up on this one!

@JelleZijlstra
Copy link
Member

I'll periodically look over open typeshed PRs when I have some time, but can't guarantee I'll come back to a specific one (and other maintainers may also pick up this PR if they're confident enough in the change to merge it).

@max-muoto
Copy link
Contributor Author

I'll periodically look over open typeshed PRs when I have some time, but can't guarantee I'll come back to a specific one (and other maintainers may also pick up this PR if they're confident enough in the change to merge it).

Sounds good, thanks for the initial look nonetheless.

@overload
def __init__(
self,
file: StrPath | _ZipReadable | _ZipSeekTellable,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this right? I'm not sure _ZipSeekTellable is enough, e.g. _RealGetContents requires read

Copy link
Contributor Author

@max-muoto max-muoto Jul 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, we want a protocol that is seekable/readable if it's a valid zip file, otherwise it should implement tell: 0b52c6a

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think likely misread this, and assumed if an attribute error occurred it just needed to be tellable.

This comment has been minimized.

Comment on lines 126 to +136
def __init__(
self,
file: StrPath | IO[bytes],
mode: _ZipFileMode = "r",
compression: int = 0,
allowZip64: bool = True,
compresslevel: int | None = None,
*,
strict_timestamps: bool = True,
metadata_encoding: str | None = None,
) -> None: ...
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aren't all following overloads shadowed by this one?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No - for example, IO[bytes] is broader than _ZipReadable, which will lead to the second overload getting chosen if you just an object that is readable.

Copy link
Contributor Author

@max-muoto max-muoto Jul 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As for _ZipFileMode, if file doesn't match we'll end up skipping over the overload. The tests show that this is working properly.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IO[bytes] is broader than _ZipReadable

But that's exactly the problem. It's my understanding that overloads are processed in order (although the typing spec doesn't mention that), so IO[bytes] always matches before _ZipReadable can match.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've opened python/typing#1803 for clarification.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How is that a problem though? _ZipReadable will get matched if IO[bytes] isn't fully fulfilled, or am I misunderstanding something?

This comment has been minimized.

Copy link
Collaborator

@hauntsaninja hauntsaninja left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, this looks good, but had one more question!

) -> None: ...
@overload
def __init__(
self,
file: StrPath | IO[bytes],
mode: _ZipFileMode = "r",
file: StrPath | _ZipTellable | _ZipWritable,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While _ZipTellable won't immediately error, I don't think the resulting object can be used for anything right? Unless I'm missing some use case, I think we should keep it StrPath | _ZipWritable

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh in fact it's definitely wrong, because __del__ will trigger close will trigger a write attempt

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me fix this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hauntsaninja Should be resolved now.

Copy link
Contributor

Diff from mypy_primer, showing the effect of this PR on open source code:

pip (https://github.com/pypa/pip)
+ src/pip/_internal/network/lazy_wheel.py:162: error: Unused "type: ignore" comment  [unused-ignore]

werkzeug (https://github.com/pallets/werkzeug)
- tests/test_wrappers.py:1216: error: No overload variant of "ZipFile" matches argument types "ResponseStream", "str"  [call-overload]
- tests/test_wrappers.py:1216: note: Possible overload variants:
- tests/test_wrappers.py:1216: note:     def __init__(self, file: str | PathLike[str] | IO[bytes], mode: Literal['r'] = ..., compression: int = ..., allowZip64: bool = ..., compresslevel: int | None = ..., *, strict_timestamps: bool = ..., metadata_encoding: str | None) -> ZipFile
- tests/test_wrappers.py:1216: note:     def __init__(self, file: str | PathLike[str] | IO[bytes], mode: Literal['r', 'w', 'x', 'a'] = ..., compression: int = ..., allowZip64: bool = ..., compresslevel: int | None = ..., *, strict_timestamps: bool = ..., metadata_encoding: None = ...) -> ZipFile

@hauntsaninja
Copy link
Collaborator

Thank you!

@hauntsaninja hauntsaninja merged commit dc0b63f into python:main Aug 11, 2024
63 checks passed
max-muoto added a commit to max-muoto/typeshed that referenced this pull request Aug 17, 2024
max-muoto added a commit to max-muoto/typeshed that referenced this pull request Sep 8, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants