Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Not possible to use bottleneck as a dependency on linux arm64 with rules_python 0.37.0 #2328

Closed
mark-thm opened this issue Oct 22, 2024 · 16 comments · Fixed by #2360
Closed

Comments

@mark-thm
Copy link
Contributor

🐞 bug report

Affected Rule

whl_library

Is this a regression?

Yes, last functioned in rules_python 0.34.0.

Description

It's not possible to use bottleneck as a dependency on linux arm64 with rules_python 0.37.0

🔬 Minimal Reproduction

https://github.com/theoremlp/rules_python_repro

🔥 Exception or Error


INFO: Repository rules_python~~pip~pip_312_bottleneck_sdist_fa8e8e17 instantiated at:
  : in 
Repository rule whl_library defined at:
  /home/runner/.cache/bazel/_bazel_runner/8266b8631ad340f3d61644857ec7d624/external/rules_python~/python/private/pypi/whl_library.bzl:457:30: in 
ERROR: An error occurred during the fetch of repository 'rules_python~~pip~pip_312_bottleneck_sdist_fa8e8e17':
   Traceback (most recent call last):
	File "/home/runner/.cache/bazel/_bazel_runner/8266b8631ad340f3d61644857ec7d624/external/rules_python~/python/private/pypi/whl_library.bzl", line 245, column 35, in _whl_library_impl
		repo_utils.execute_checked(
	File "/home/runner/.cache/bazel/_bazel_runner/8266b8631ad340f3d61644857ec7d624/external/rules_python~/python/private/repo_utils.bzl", line 212, column 29, in _execute_checked
		return _execute_internal(fail_on_error = True, *args, **kwargs)
	File "/home/runner/.cache/bazel/_bazel_runner/8266b8631ad340f3d61644857ec7d624/external/rules_python~/python/private/repo_utils.bzl", line 143, column 20, in _execute_internal
		logger.fail((
	File "/home/runner/.cache/bazel/_bazel_runner/8266b8631ad340f3d61644857ec7d624/external/rules_python~/python/private/repo_utils.bzl", line 85, column 39, in lambda
		fail = lambda message_cb: _log(-1, "FAIL", message_cb, fail),
	File "/home/runner/.cache/bazel/_bazel_runner/8266b8631ad340f3d61644857ec7d624/external/rules_python~/python/private/repo_utils.bzl", line 75, column 16, in _log
		printer("\nrules_python:{} {}:".format(
Error in fail: 
rules_python:whl_library(@@rules_python~~pip~pip_312_bottleneck_sdist_fa8e8e17) FAIL: repo.execute: whl_library.BuildWheelFromSource(rules_python~~pip~pip_312_bottleneck_sdist_fa8e8e17, bottleneck==1.4.2): end: failure:
  command: /home/runner/.cache/bazel/_bazel_runner/8266b8631ad340f3d61644857ec7d624/external/rules_python~~python~python_3_12_host/python -m python.private.pypi.whl_installer.wheel_installer --requirement bottleneck==1.4.2 --isolated --extra_pip_args "{\"arg\":[\"--no-build-isolation\",\"--find-links\",\".\"]}" --pip_data_exclude "{\"arg\":[]}" --enable_implicit_namespace_pkgs --environment "{\"arg\":{}}"
  return code: 1
  working dir: 
  timeout: 600
  environment:
PYTHONPATH="/home/runner/.cache/bazel/_bazel_runner/8266b8631ad340f3d61644857ec7d624/external/rules_python~:/home/runner/.cache/bazel/_bazel_runner/8266b8631ad340f3d61644857ec7d624/external/rules_python~~internal_deps~pypi__build:/home/runner/.cache/bazel/_bazel_runner/8266b8631ad340f3d61644857ec7d624/external/rules_python~~internal_deps~pypi__click:/home/runner/.cache/bazel/_bazel_runner/8266b8631ad340f3d61644857ec7d624/external/rules_python~~internal_deps~pypi__colorama:/home/runner/.cache/bazel/_bazel_runner/8266b8631ad340f3d61644857ec7d624/external/rules_python~~internal_deps~pypi__importlib_metadata:/home/runner/.cache/bazel/_bazel_runner/8266b8631ad340f3d61644857ec7d624/external/rules_python~~internal_deps~pypi__installer:/home/runner/.cache/bazel/_bazel_runner/8266b8631ad340f3d61644857ec7d624/external/rules_python~~internal_deps~pypi__more_itertools:/home/runner/.cache/bazel/_bazel_runner/8266b8631ad340f3d61644857ec7d624/external/rules_python~~internal_deps~pypi__packaging:/home/runner/.cache/bazel/_bazel_runner/8266b8631ad340f3d61644857ec7d624/external/rules_python~~internal_deps~pypi__pep517:/home/runner/.cache/bazel/_bazel_runner/8266b8631ad340f3d61644857ec7d624/external/rules_python~~internal_deps~pypi__pip:/home/runner/.cache/bazel/_bazel_runner/8266b8631ad340f3d61644857ec7d624/external/rules_python~~internal_deps~pypi__pip_tools:/home/runner/.cache/bazel/_bazel_runner/8266b8631ad340f3d61644857ec7d624/external/rules_python~~internal_deps~pypi__pyproject_hooks:/home/runner/.cache/bazel/_bazel_runner/8266b8631ad340f3d61644857ec7d624/external/rules_python~~internal_deps~pypi__setuptools:/home/runner/.cache/bazel/_bazel_runner/8266b8631ad340f3d61644857ec7d624/external/rules_python~~internal_deps~pypi__tomli:/home/runner/.cache/bazel/_bazel_runner/8266b8631ad340f3d61644857ec7d624/external/rules_python~~internal_deps~pypi__wheel:/home/runner/.cache/bazel/_bazel_runner/8266b8631ad340f3d61644857ec7d624/external/rules_python~~internal_deps~pypi__zipp"
CPPFLAGS="-isystem /home/runner/.cache/bazel/_bazel_runner/8266b8631ad340f3d61644857ec7d624/external/rules_python~~python~python_3_12_host/include/python3.12"
===== stdout start =====
Looking in links: .
Collecting bottleneck==1.4.2 (from -r /tmp/tmpykoshwh7 (line 1))
  File was already downloaded /home/runner/.cache/bazel/_bazel_runner/8266b8631ad340f3d61644857ec7d624/external/rules_python~~pip~pip_312_bottleneck_sdist_fa8e8e17/bottleneck-1.4.2.tar.gz
  Preparing metadata (pyproject.toml): started
  Preparing metadata (pyproject.toml): finished with status 'error'
===== stdout end =====
===== stderr start =====
        File "/home/runner/.cache/bazel/_bazel_runner/8266b8631ad340f3d61644857ec7d624/external/rules_python~~internal_deps~pypi__setuptools/setuptools/command/egg_info.py", line 321, in run
          self.find_sources()
        File "/home/runner/.cache/bazel/_bazel_runner/8266b8631ad340f3d61644857ec7d624/external/rules_python~~internal_deps~pypi__setuptools/setuptools/command/egg_info.py", line 329, in find_sources
          mm.run()
        File "/home/runner/.cache/bazel/_bazel_runner/8266b8631ad340f3d61644857ec7d624/external/rules_python~~internal_deps~pypi__setuptools/setuptools/command/egg_info.py", line 550, in run
          self.add_defaults()
        File "/home/runner/.cache/bazel/_bazel_runner/8266b8631ad340f3d61644857ec7d624/external/rules_python~~internal_deps~pypi__setuptools/setuptools/command/egg_info.py", line 588, in add_defaults
          sdist.add_defaults(self)
        File "/home/runner/.cache/bazel/_bazel_runner/8266b8631ad340f3d61644857ec7d624/external/rules_python~~internal_deps~pypi__setuptools/setuptools/command/sdist.py", line 102, in add_defaults
          super().add_defaults()
        File "/home/runner/.cache/bazel/_bazel_runner/8266b8631ad340f3d61644857ec7d624/external/rules_python~~internal_deps~pypi__setuptools/setuptools/_distutils/command/sdist.py", line 250, in add_defaults
          self._add_defaults_ext()
        File "/home/runner/.cache/bazel/_bazel_runner/8266b8631ad340f3d61644857ec7d624/external/rules_python~~internal_deps~pypi__setuptools/setuptools/_distutils/command/sdist.py", line 334, in _add_defaults_ext
          build_ext = self.get_finalized_command('build_ext')
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "/home/runner/.cache/bazel/_bazel_runner/8266b8631ad340f3d61644857ec7d624/external/rules_python~~internal_deps~pypi__setuptools/setuptools/_distutils/cmd.py", line 303, in get_finalized_command
          cmd_obj.ensure_finalized()
        File "/home/runner/.cache/bazel/_bazel_runner/8266b8631ad340f3d61644857ec7d624/external/rules_python~~internal_deps~pypi__setuptools/setuptools/_distutils/cmd.py", line 111, in ensure_finalized
          self.finalize_options()
        File "", line 76, in finalize_options
      ModuleNotFoundError: No module named 'numpy'
      [end of output]
  
  note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed

× Encountered error while generating package metadata.
╰─> See above for output.

note: This is an issue with the package mentioned above, not pip.
hint: See above for details.
Traceback (most recent call last):
  File "", line [19](https://github.com/theoremlp/rules_python_repro/actions/runs/11463773942/job/31898410925#step:5:20)8, in _run_module_as_main
  File "", line 88, in _run_code
  File "/home/runner/.cache/bazel/_bazel_runner/8266b8631ad340f3d61644857ec7d624/external/rules_python~/python/private/pypi/whl_installer/wheel_installer.py", line [20](https://github.com/theoremlp/rules_python_repro/actions/runs/11463773942/job/31898410925#step:5:21)5, in 
    main()
  File "/home/runner/.cache/bazel/_bazel_runner/8266b8631ad340f3d61644857ec7d624/external/rules_python~/python/private/pypi/whl_installer/wheel_installer.py", line 190, in main
    subprocess.run(pip_args, check=True, env=env)
  File "/home/runner/.cache/bazel/_bazel_runner/8266b8631ad340f3d61644857ec7d624/external/rules_python~~python~python_3_12_aarch64-unknown-linux-gnu/lib/python3.12/subprocess.py", line 571, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['/home/runner/.cache/bazel/_bazel_runner/8266b8631ad340f3d61644857ec7d6[24](https://github.com/theoremlp/rules_python_repro/actions/runs/11463773942/job/31898410925#step:5:25)/external/rules_python~~python~python_3_12_host/python', '-m', 'pip', '--isolated', 'wheel', '--no-deps', '--no-build-isolation', '--find-links', '.', '-r', '/tmp/tmpykoshwh7']' returned non-zero exit status 1.
===== stderr end =====
Analyzing: 142 targets (6 packages loaded, 1540 targets configured)
[1 / 1] checking cached actions
ERROR: Analysis of target '@@rules_python~~pip~pip//bottleneck:whl' failed; build aborted: Analysis failed
INFO: Elapsed time: 6.453s, Critical Path: 0.01s
INFO: 1 process: 1 internal.
ERROR: Build did NOT complete successfully

🌍 Your Environment

Operating System:

  
linux, arm64
  

Output of bazel version:

  
Bazelisk version: 1.22.0
Build label: 7.3.2
Build target: @@//src/main/java/com/google/devtools/build/lib/bazel:BazelServer
Build time: Tue Oct 1 17:48:21 2024 (1727804901)
Build timestamp: 1727804901
Build timestamp as int: 1727804901
  

Rules_python version:

  
0.37.0
  

Anything else relevant?

Related:

@aignas
Copy link
Collaborator

aignas commented Oct 24, 2024

It looks like it is working as intended to be honest. If the sdist needs numpy to be successfully built, then it needs to find it somewhere. If numpy is not installed somewhere where it can leak into the build process of the wheel then, there is no way for us to take it from the requirements.txt, because the build of the sdist is separate from the rest of the requirements.txt file here.

@mark-thm
Copy link
Contributor Author

Could you help me understand a bit more about why this works in rules_python 0.34.0 but no longer works in later versions? I've been tracking the various wheel arg changes but tbh I skipped over understanding the more fundamental changes shipped in 0.35.0 with whl_library builds.

Separately, I'm at a bit of a loss for how to move forward -- it seems like rules_python will be generally incompatible with any sdists that have dependencies outside of the blessed set of build deps that rules_python normally brings in, which leaves the options as something like:

  • figure out how to build this sdist outside of Bazel, publish it somewhere, and convince rules_python to fetch it from there
  • figure out how to get the main build of whatever dependencies are incompatible with rules_python to publish wheels for additional os/arch combinations
  • switch to rules_pycross

Something here will probably work for me, for now, but IMO this is really raising the bar for adopting Bazel for Python.

@mark-thm
Copy link
Contributor Author

Alternatively, I'm able to produce a wheel successfully if instead of running pip wheel I run pip install (which will download, and, if necessary, build numpy), and a wheel is produced as a byproduct. That seems like it might produce a more intuitive result for cases like this.

@rickeylev
Copy link
Contributor

If a C extension is being created, then the standard advice is that the project should be providing a wheel. Building C extension from source can be quite particular to projects.

Otherwise, I think the other options are:

  • Provide a way to inject additional dependencies into an sdist build, or otherwise have more control over how an sdist is built -- i.e. something in pip.override()
  • Provide a way so that the generated code can plumb through the install-time dependencies, so that the later sdist building can work. I'm not sure if this can be made to Just Work. Best I can think of is if a bzlmod extension is given the sdist url, downloads the sdist, inspects it, and can then pass along the install-time packages to the later "regular" pypi codegen. (You can see how this might get messy quickly).

@groodt
Copy link
Collaborator

groodt commented Oct 25, 2024

Separately, I'm at a bit of a loss for how to move forward -- it seems like rules_python will be generally incompatible with any sdists that have dependencies outside of the blessed set of build deps that rules_python normally brings in

I'm actually interested in how it was working at all in the first-place myself. It does indicate some sort of non-hermetic scenario in previous versions which have now been made to be hermetic...

@groodt
Copy link
Collaborator

groodt commented Oct 25, 2024

Alternatively, I'm able to produce a wheel successfully

What you can do in this instance @mark-thm , depending on your specific environment and supply chain requirements. Is to upload the wheel to a github release and use direct_url references.

So something like:
bottleneck @ https://github.com/foo/bar/releases/download/v2.6.3/bottleneck-1.4.2-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl#sha256=d6f27c61f640ce0c4e4eb140eb7e53e57e48d87e7a336e0e31c777f5ca13e844

(I just guessed the wheel tags in the example. They might be different based on your specific ABI etc)

@groodt
Copy link
Collaborator

groodt commented Oct 25, 2024

The python ecosystem is looking to partially solve some of this in the lockfile standard, but it gets quite gnarly. Different sdist might have different build-time dependencies for example. So you might need to be provided multiple independent build environments. And then some build environments take dependencies on the exec platform if they link out to OS libraries etc. So it becomes quite the rabbithole at the limit.

The good news is that the vast majority of major python dependencies are already wheels: https://pythonwheels.com/

And that tools like https://cibuildwheel.pypa.io/en/stable/ exist to help project authors package wheels for most popular platforms.

sdist are inevitable though, but the ones that aren't pure-python are becoming more and more rare in my experience.

@mark-thm
Copy link
Contributor Author

bottleneck already uses cibuildwheel, but hasn’t set up emulation to publish arm64 wheels because the CI system doesn’t run on arm64 (because GitHub doesn’t yet provide OSS Linux/arm64 runners). I’ve made a PR to add wheel builds but the maintainers (very reasonably) want to also run tests. That’s a bit trickier, I’m not sure there’s a great way to do this for low maintenance cost and for free.

I think what feels tricky about rules_python’s current behavior here is a couple of things:

  1. The failure is cryptic, requiring a pretty nuanced understanding of what’s going on
  2. pip installing this dependency works fine

Thus the state of the world is that nobody has done anything wrong (you’re saying expected behavior, dependency maintainer has correctly encoded build time dependencies, pip wheel is behaving as expected), and yet my build is failing.

But I do think there’s a broken assumption here: pip wheel demands that you parse the pyproject.toml file and supply dependencies, and that’s not what’s happening here.

pip install is a viable alternative that will fetch and build missing dependencies for the build, but that wont use the Bazel downloader (and it seems like rules_python is trying to take control of that interaction).

Separately: yes, I can build all my deps into wheels and publish these things somewhere—my normal build orchestration tool is Bazel, so the task ahead is to make something like what rules_python does: parse a requirements files, download sdists, build those sdists using pip, and then also publish those files somewhere.

A couple of ideas:

  • instead of parsing pyproject.toml files, support a non-hermetic option of using pip install in place of pip wheel, allowing it to download and build deps (this can be done somewhat sandboxed at the cost of possibly exponentially many builds of common deps)
  • build towards a build graph of deps and a runtime graph of deps, where the build graph wires up the actual build deps as parsed from sdists and the runtime graph is the declared wheel level deps (this seems like a massive project, though)

@groodt
Copy link
Collaborator

groodt commented Oct 25, 2024

Yes, that’s a good summary on the issue and possible solutions. We’ve had many similar discussions on issues and as maintainers.

  1. The failure is cryptic, requiring a pretty nuanced understanding of what’s going on

It’s also a reflection of the current python packaging ecosystem realities. Even outside bazel the community struggles with these similar issues. Your upstream PR demonstrates that it isn’t even clear if upstream bottleneck is supported on arm64 by the maintainers. It’s nobody’s fault, rather the tragedy of the commons.

The python packaging community has considered disabling sdist by default as a mitigation. Thats the origin of the message “this is likely not a problem with pip”. A similar framing could be taken “this is likely not a problem with bazel/rules_python” (But I’m still really curious how this issue only occurred upon update of the rules. I’m not sure we got to the bottom of what changed?)

Emerging lockfile standards, wheel format improvements, index metadata, disabling sdists and installers like uv are all improving. rules_python is downstream of all these things and is also improving. These things just take time.

  1. pip installing this dependency works fine

Are we sure that “pip wheel” outside bazel fails too? I don’t think it will. If I had an arm64 machine I could test it. My understanding of the PEP 517 build hooks is that they are invoked for both pip install and pip wheel.

The problem here boils down essentially to: “running other people’s build makefiles”. So it’s a bit unrealistic to solve completely in the presence of non-hermetic, Turing complete systems. Stating that “pip install works” is the same as saying “works on my machine”.


I think as you’ve mentioned, the workarounds in order of effort for you are probably:

  • Upstream bottleneck project starts publishing wheels for your platform
  • You build a bottleneck wheel out of band and store the wheel somewhere accessible to your lockfile producing system and your build system. Tools like https://github.com/python-wheel-build/fromager or https://cibuildwheel.pypa.io/en/stable/ might be useful
  • Switch to alternative rules like rules_pycross or rules_py that may have better support for sdist at the moment
  • Build your own bazel integration for sdist (and consider contributing them to rules_python)

@mark-thm
Copy link
Contributor Author

mark-thm commented Oct 25, 2024

Are we sure that “pip wheel” outside bazel fails too?

It fails for me, yes: replicating the pip wheel command with args outside of Bazel using a Linux arm64 machine with a fresh venv and no system level deps installed results in an identical error.

Stating that “pip install works” is the same as saying “works on my machine”.

I’m not an expert here, but I disagree with this. In particular, I’m trying to call attention to the differences in behavior of pip wheel and pip install. What I observe:

  • pip wheel assumes something external has satisfied build dependencies, and then builds a wheel
  • pip install will analyze declared build system dependency information and then do the work required to make those dependencies available (by downloading or downloading and building build system dependencies)

@groodt
Copy link
Collaborator

groodt commented Oct 25, 2024

That’s a misunderstanding I believe. I’ll try a repro on a non arm64 machine to demonstrate. Both subcommands invoke build backends hooks. eg. They both offer “—no-build-isolation” arguments.

Indeed, before pip installs any sdist dependency, a wheel is always built first.

@groodt
Copy link
Collaborator

groodt commented Oct 25, 2024

Looking at the full wheel command executed in the trace, I think I know why pep517 build hooks aren’t pulling down the build dependencies declared in the bottleneck pyproject.toml.

The “—no-build-isolation” option is provided to the wheel command. Removing that will probably fix this sdist, but might break others.

@mark-thm
Copy link
Contributor Author

Ok, can confirm that removal of --no-build-isolation allows this wheel to build with pip wheel ... in which case I think we're still in a bind because it seems without that arg you don't pick up the build-time deps that Bazel supplies?

@groodt
Copy link
Collaborator

groodt commented Oct 27, 2024

in which case I think we're still in a bind because it seems without that arg you don't pick up the build-time deps that Bazel supplies?

Not necessarily. The majority of actively maintained projects should be using PEP-517 metadata to fully specify and configure their build dependencies (build-system.requires) and build backend (build-system.build-backend) e.g. flit, setuptools, hatchling, maturin etc. This lets the wheels build in isolated virtual environments. These are well-behaved builds. Not well-behaved in a bazel sense mind you, merely well-behaved in the sense that these sdist have a slightly better chance of a successful build on another machine because they declare some of their build dependencies.

However, there are also a lot of builds that aren't well-behaved. That is why build frontends like uv and pip provide parameters like --no-build-isolation. So that these non-well-behaved builds can build in the current environment (and hopefully the environment has the necessary dependencies for a successful build).


I've had a think and I think a good path forward to avoid disappointment, and to make expectations clear to users of the rules who have sdist in their dependency closures, might be to do the following:

  • Fail fast when encountering an sdist (and printing out a helpful error message that "sdist are opt-in and fragile. With links to docs on what to do."
  • Let users declare "Named build environments in bazel with a lockfile of dependencies". There can be 1 or more of these.
  • For an sdist to be enabled for best-effort build, it should have an entry that maps the name of the sdist to a named build environment
  • The build for the sdist then occurs with --no-build-isolation but inside the named build environment

So in this bottleneck instance, it might look like this:

Named build environment: bottleneck-build
Build dependencies: [setuptools, versioneer, wheel, numpy]
In MODULE.bazel something like:

pip.override(
  hub_name = "pypi",
  package = "bottleneck",
  sdist_build_environment = "@bottleneck-build",
)

Now, that might all be some future implementation. For now, I don't see a strong reason for not removing --no-build-isolation. I don't remember why it's the default. WDYT @aignas ?

@aignas
Copy link
Collaborator

aignas commented Oct 28, 2024

Yeah, removing --no-build-isolation for now would be better. Users can also opt-into that in other ways.

Regarding the named build envs, it is a nice sketch and could solve majority of the issues that we are facing these days.

@mark-thm
Copy link
Contributor Author

Pushed #2360

github-merge-queue bot pushed a commit that referenced this issue Oct 31, 2024
Fixes #2328.

In #2126, we discussed and added `--no-build-isolation` to ensure that
sdist builds pulled from rules_python supplied dependencies. An
unanticipated/misunderstood side-effect of this is that pip will no
longer resolve necessary, declared build time dependencies from a
package index. That's a significant point of friction for users, and,
it's much harder to opt-out of the behavior than to opt-in given the
presence of `extra_pip_args`.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants