Document a module inclusion policy #80

Closed
woodruffw opened this issue Jun 21, 2023 · 3 comments · Fixed by #119

@woodruffw
Member

The automated module list generation (e.g. #77, #78, #79) works, but it reveals some weaknesses:

  1. For older Python versions it uses the Sphinx inventory, meaning that modules that are no longer listed (or are listed in subtly different ways) appear to "disappear" when the list is updated;
  2. For all Python versions it uses the latest minor version, which may have its own (intentional) variations;
  3. It's fundamentally tied to the Python build itself, meaning that (currently) it won't correctly detect modules that are only defined on non-Linux hosts.

I'm not sure if there's a good way to solve all of these, but one thing we can do: rather than replacing the old list on each update, we can instead merge it, resulting in a list for each Python version that effectively represents the "superset" of everything seen so far.
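The merge-instead-of-replace idea can be sketched as a set union. This is a minimal illustration, not the project's actual workflow: the function name `merge_module_names` and the sample names are hypothetical, and it assumes each list is simply a collection of module-name strings.

```python
from __future__ import annotations

from typing import Iterable


def merge_module_names(previous: Iterable[str], detected: Iterable[str]) -> list[str]:
    """Return the sorted union of the previous list and the newly detected names.

    Names are never removed: anything seen in an earlier run stays in the result.
    """
    # Strip whitespace and drop empty entries (e.g. blank lines read from a file).
    cleaned = {name.strip() for name in (*previous, *detected) if name.strip()}
    return sorted(cleaned)


# Hypothetical sample data: "idlelib" is missing from the new run but is kept.
previous = ["idlelib", "os", "sys"]
detected = ["os", "sys", "tomllib"]
merged = merge_module_names(previous, detected)
print(merged)  # ['idlelib', 'os', 'sys', 'tomllib']
```

Because the result is a superset, re-running the merge with the same inputs is a no-op, which keeps repeated updates stable.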

The upside to this is that our lists will appear more complete; the downside is that it means that some modules will be "inaccurately" listed for a particular Python version, since the end user's host or minor version may not have those modules. On the other hand, it was never guaranteed that these names could be imported to begin with, so maybe that downside isn't so serious.

CC @miketheman for thoughts 🙂

@woodruffw self-assigned this Jun 21, 2023
@miketheman
Member

> The upside to this is that our lists will appear more complete; the downside is that it means that some modules will be "inaccurately" listed for a particular Python version, since the end user's host or minor version may not have those modules.

I think this gets to the crux of the purpose of this library.

If the question it's trying to answer is "is this module name in my current Python version?" then the new methods in 3.10 and onwards would address that for the given runtime, and we in theory could look at a backport package that implements a similar behavior that generates this list.
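For context, the 3.10 facility mentioned here is presumably `sys.stdlib_module_names`, a frozenset added in Python 3.10. A minimal sketch of a runtime check follows; the helper name `in_running_stdlib` is hypothetical. Note that the attribute lists only top-level names and, per the Python documentation, is the same on all platforms, so it does not by itself say whether a module is importable on the current host.

```python
import sys


def in_running_stdlib(name: str) -> bool:
    """Check a (possibly dotted) module name against the running interpreter's stdlib names."""
    names = getattr(sys, "stdlib_module_names", None)
    if names is None:
        # The attribute does not exist before Python 3.10.
        raise RuntimeError("sys.stdlib_module_names requires Python 3.10 or newer")
    # Only top-level packages are listed, so compare the first dotted component.
    return name.partition(".")[0] in names
```

On an older interpreter the helper raises instead of guessing, which is the gap a static, per-version list fills.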

However, in my head this library serves as a tool that is "frozen in time" and provides the static list of all potential names that were included in a given version, so merging (and never removing) seems like the right approach to me.

> • For all Python versions it uses the latest minor version, which may have its own (intentional) variations;

This is an interesting edge case - do you see additions or removals? If additions, I think that's fine - I'd expect 3.6.5 to include any new modules introduced over 3.6.4, but if modules are removed before a major release, that seems like an odd smell to me. I don't think we'll solve that here, but something to keep an eye out for future releases.

Overall, I like the merge idea, since then we can consider a Windows/macOS build run that adds whatever modules are detected/seen on those platforms as well.

@woodruffw
Member Author

> However, in my head this library serves as a tool that is "frozen in time" and provides the static list of all potential names that were included in a given version, so merging (and never removing) seems like the right approach to me.

This was also my conception of the library, so I think we're aligned here 🙂 -- IMO this library is much more useful as a list of everything that has been in a Python release rather than everything that is in a particular patch release (not least for PyPI's own usage).

> do you see additions or removals? If additions, I think that's fine - I'd expect 3.6.5 to include any new modules introduced over 3.6.4, but if modules are removed before a major release, that seems like an odd smell to me. I don't think we'll solve that here, but something to keep an eye out for future releases.

Removals as well, unfortunately -- it looks like some of the negative diff comes from the fact that we collect test.* and *.test.*, which aren't considered part of the stdlib's public API and thus change a bit with each bump. Similarly, we regressed on collecting idlelib.* (since it's intentionally no longer documented as a public part of the stdlib), so that's a big part of the negative diff for older versions.
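To illustrate how such a negative diff could be broken down, here is a rough sketch that separates removed names into expected churn (test packages and idlelib) and everything else. The function names and sample lists are hypothetical, and the matching rule is an assumption that is slightly broader than the globs above: any dotted component equal to `test` counts as test code.

```python
from __future__ import annotations

from typing import Iterable


def is_non_public(name: str) -> bool:
    """True for idlelib names and for names with a dotted component equal to 'test'."""
    parts = name.split(".")
    return parts[0] == "idlelib" or "test" in parts


def split_removals(old: Iterable[str], new: Iterable[str]) -> tuple[list[str], list[str]]:
    """Return (expected churn, other removals) for names present in old but not in new."""
    removed = sorted(set(old) - set(new))
    churn = [name for name in removed if is_non_public(name)]
    other = [name for name in removed if not is_non_public(name)]
    return churn, other


# Hypothetical sample lists for two successive list-generation runs.
old = ["asynchat", "idlelib.run", "os", "test.support", "unittest.test.testmock"]
new = ["os", "test.support"]
churn, other = split_removals(old, new)
print(churn)  # ['idlelib.run', 'unittest.test.testmock']
print(other)  # ['asynchat']
```

Anything left in the second list would be the removals worth a closer look.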

> Overall, I like the merge idea, since then we can consider a Windows/macOS build run that adds whatever modules are detected/seen on those platforms as well.

Cool! I'll refactor the workflow to do merges rather than overwrites, then.

@woodruffw
Member Author

I've made an initial stab at this with #119.
