docs: module inclusion policy #119

Merged
merged 2 commits into from
Apr 30, 2024

1 change: 1 addition & 0 deletions docs/conf.py
@@ -30,6 +30,7 @@
extensions = [
    "sphinx.ext.autodoc",
    "sphinx.ext.viewcode",
    "sphinx.ext.intersphinx",
]

html_theme = "furo"
1 change: 1 addition & 0 deletions docs/index.rst
@@ -19,6 +19,7 @@ Contents

install
usage
module-policy


Indices and tables
49 changes: 49 additions & 0 deletions docs/module-policy.rst
@@ -0,0 +1,49 @@
Module inclusion policy
=======================

Python is a dynamic language with a complex module system, including
modules that are created only at runtime or appear on specific
supported platforms.

This page exists to document ``stdlib-list``'s approach to module detection
and subsequent inclusion. It is not intended to be permanent, and may change
over time as Python itself changes (or our approach to module detection
improves).

Current guiding rules
---------------------

* Missing top-level modules **are a bug**: if a new version of Python adds a new
  top-level module, our failure to detect it should be considered a bug.

  Concretely: if ``examplemodule`` is present in Python 3.999, then it should be
  included in the ``stdlib_list("3.999")`` listing.

* Missing sub-modules are **best-effort**: if ``examplemodule`` contains
  ``examplemodule.foo.bar.baz.deeply.nested``, we make a best-effort attempt
  to detect each inner module but make no guarantee about doing so.

  Our rationale for this is that "stdlib-ness" is inherited from the parent
  module, even when not explicitly listed. In other words: anything that matches
  ``examplemodule.*`` is in the standard library by definition so long
  as ``examplemodule`` is in the standard library.

* Platform-specific modules are **best-effort**: ``stdlib-list`` is currently collected
Member Author

NB: This rule reflects our current practice, but maybe we should change that. In particular, it probably wouldn't be too hard to collect modules on Windows and macOS as well in our current listgen workflow.

Member

I agree this is in line with current practice.

If we can expand platform support with automation, then we should add it.

Maybe this is more like Tier 1 is Linux, Tier 2 is Win/Mac?

Sets the stage for what this library finds most important.

I think it would be a big mistake not to include Windows and macOS as Tier 1 platforms: more developers are probably on those platforms than on Linux (SO dev survey). It's also easy to run GitHub pipeline jobs on those platforms...

Member Author

@thebjorn the tiers here are intended to be descriptive, not prescriptive -- IMO Windows and macOS should indeed be Tier 1, but currently aren't. So the policy as-merged here should probably document them as Tier 2 until someone puts the work into making the listgen workflow work across those as well 🙂

I haven't looked at the workflow in any real depth, but would it just boil down to explicitly setting shell: bash and either using an OS matrix or a specific set of includes to get Windows and macOS on board? (Like lines 61-73 here: https://github.com/thebjorn/pydeps/blob/master/.github/workflows/ci-cd.yml#L61).

Member

I'd envision changing https://github.com/pypi/stdlib-list/blob/c3a45e824174fb298bba17cb3afb04c0ba9953a3/.github/workflows/listgen.yml to have a more fan-out kind of step, where after a pre-list is done, fan out to each platform, generate on that platform, upload the artifact, and then a final step to download all of the generated files for that version, and combine them all to the final versioned list.
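
As a rough illustration of the final combining step described above, the sketch below unions per-platform module lists into one sorted list. The function name `combine_platform_lists`, the platform keys, and the sample module names are assumptions made for illustration; none of them are taken from the actual listgen workflow.

```python
from typing import Iterable, Mapping


def combine_platform_lists(per_platform: Mapping[str, Iterable[str]]) -> list[str]:
    """Union the module names found on each platform into one sorted list."""
    combined: set[str] = set()
    for modules in per_platform.values():
        # Strip whitespace and skip blank entries so raw file lines can be passed in.
        combined.update(name.strip() for name in modules if name.strip())
    return sorted(combined)


# Hypothetical per-platform results for a single Python version.
example = {
    "linux": ["os", "posix", "termios"],
    "windows": ["os", "winreg", "msvcrt"],
    "macos": ["os", "posix", "_scproxy"],
}
print(combine_platform_lists(example))
```

A plain set union keeps the merge order-independent, so it should not matter which platform job finishes first.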

Hmm... looks like there is a makefile (why?), creation of a virtualenv (why do this in a pipeline that is isolated to a specific python version?), and hard coding of the .env/bin/python path to work in the created virtualenv. Aside from that there should only be changes corresponding to what I've done here: main...thebjorn:stdlib-list:main

Member Author

> creation of a virtualenv (why do this in a pipeline that is isolated to a specific python version?),

I did this for two reasons:

  1. A venv here ensures that we don't load in the system site packages by default. This is intended as a defensive maneuver: nothing stops the OS Python distribution from modifying the stdlib directly, but we can at least perform a small amount of site isolation from anything in the system site packages.
  2. I think environment isolation is a good general practice, especially following PEP 668. GitHub runners are arguably their own ephemeral environment anyways, but creating a venv is cheap and reduces the amount of global state to think about.

(I have no good argument for the Makefile. It's just what I'm used to.)
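
A check along the following lines could confirm that the generating interpreter really is running inside a venv. This is only a sketch: the helper name `in_virtual_environment` is hypothetical, and nothing like it is claimed to exist in the listgen tooling.

```python
import sys


def in_virtual_environment() -> bool:
    """Return True when the interpreter is running inside a venv."""
    # Inside a venv, sys.prefix points at the environment, while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("running in a venv:", in_virtual_environment())
```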

  from Linux builds of CPython. This means that Windows- and macOS-specific modules
  (i.e., modules that are only installed on those hosts) are not necessarily
  included.

  This includes top-level modules.

* Missing non-CPython modules are **not supported**: ``stdlib-list`` is implicitly
  a list of CPython's standard library modules, which are expected to be mirrored
  in other implementations of Python.

* Pseudo-modules are **not supported**: Python sometimes makes use of
  "pseudo-modules", i.e. namespaces placed into ``sys.modules`` that don't
  pass :py:func:`inspect.ismodule`. We don't currently support these, since the
  semantics for doing so are unclear.
  See `stdlib-list#117 <https://github.com/pypi/stdlib-list/issues/117>`__
  for additional details.
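
To make the pseudo-module case concrete, the following sketch registers a plain namespace object in ``sys.modules`` and shows that it can be imported yet fails ``inspect.ismodule``. The name ``examplemodule_pseudo`` is made up for the demonstration and is not a real standard library entry.

```python
import importlib
import inspect
import sys
import types

# Hypothetical name, used only for this demonstration.
PSEUDO_NAME = "examplemodule_pseudo"

# A plain namespace object, not a module, registered under a module name.
sys.modules[PSEUDO_NAME] = types.SimpleNamespace(answer=42)

# The import system returns whatever object sys.modules holds for the name.
pseudo = importlib.import_module(PSEUDO_NAME)

print(inspect.ismodule(pseudo))   # the namespace is not a real module object
print(inspect.ismodule(inspect))  # an ordinary module passes the check

# Clean up so the fake entry does not leak into later imports.
del sys.modules[PSEUDO_NAME]
```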

If you have a scenario not covered by the rules above, please file an issue!
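
The top-level and sub-module rules above amount to a check on the first dotted component of a name, which can be sketched roughly as follows. ``KNOWN_TOP_LEVEL`` and ``is_stdlib`` are hypothetical names with a hard-coded sample set; they are not part of ``stdlib-list``'s API.

```python
from typing import Iterable

# Hypothetical stand-in for a generated listing of top-level modules.
KNOWN_TOP_LEVEL = frozenset({"examplemodule", "os", "json"})


def is_stdlib(name: str, known_top_level: Iterable[str] = KNOWN_TOP_LEVEL) -> bool:
    """Treat a dotted name as stdlib when its top-level module is listed."""
    # "examplemodule.foo.bar" inherits stdlib-ness from "examplemodule".
    top_level = name.partition(".")[0]
    return top_level in set(known_top_level)


print(is_stdlib("examplemodule.foo.bar.baz.deeply.nested"))
print(is_stdlib("requests.adapters"))
```

In practice the set would presumably be derived from a generated listing, such as the one ``stdlib_list`` returns for a given version, rather than hard-coded.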