Double the speed of zfec #114

Merged
merged 5 commits into from
Nov 15, 2024

@itamarst itamarst commented Nov 15, 2024

Fixes #102

  • Twice as fast.
  • No longer supports Python 3.8 (possibly not actually necessary; I was thinking there were manylinux issues, but I could be wrong. In practice 3.8 is EOL for security updates anyway, so meh).
  • Won't install on Python 3.9.0-3.9.5 unless they update pip in a virtualenv... in which case pip will just install the previous release, so not actually a problem for users.
  • No longer supports Ubuntu 18.04 (but it's EOL too).

Collaborator

@meejah meejah left a comment

🎉

@meejah meejah merged commit dedbae7 into tahoe-lafs:master Nov 15, 2024
46 of 47 checks passed
@itamarst itamarst deleted the 102-speed-up branch November 15, 2024 17:22
@hacklschorsch
Member

hacklschorsch commented Dec 4, 2024

I have machines older than this; what's the best way to run tahoe-lafs on them?

@meejah
Collaborator

meejah commented Dec 4, 2024

Can you be more specific about the machines in question?

(The generic answer would be "use a ZFEC library before version 1.6.0.0").

@hacklschorsch
Member

hacklschorsch commented Dec 5, 2024 via email

@hacklschorsch
Member

hacklschorsch commented Dec 5, 2024

I guess something like hwcaps checking is too much hassle? https://www.theregister.com/2022/12/16/tumbleweed_reverses_x864v2_plan/

The hwcaps feature in glibc allows detection and manipulations of the hardware capabilities of chips in various CPU families

@hacklschorsch
Member

https://pypi.org/project/mwa-hyperbeam/ says they offer multiple wheels targeting different microarchitecture levels:

What are these different x86-64 versions?

They are microarchitecture levels. By default, Rust compiles for all x86-64 CPUs; this allows maximum compatibility, but potentially limits the runtime performance because many modern CPU features can't be used. Compiling at different levels allows the code to be optimised for different classes of CPUs so users can get something that works best for them.

Looking at the download files for their latest version, I don't see how they do it, though.
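
To make "microarchitecture levels" concrete, here is a minimal Python sketch that classifies a CPU into x86-64-v1 through v4 from its feature flags. It is an illustration only, not how mwa-hyperbeam or zfec does it. The flag names follow Linux `/proc/cpuinfo` spelling (for example `pni` for SSE3 and `abm` for LZCNT), the per-level sets are my reading of the x86-64 psABI levels and should be double-checked before relying on them, and the helper names `x86_64_level` and `read_linux_cpu_flags` are made up for this example.

```python
# Feature flags each level adds on top of the previous one (Linux spellings).
# Assumption: these sets approximate the x86-64 psABI level definitions.
V2 = {"cx16", "lahf_lm", "popcnt", "pni", "sse4_1", "sse4_2", "ssse3"}
V3 = {"avx", "avx2", "bmi1", "bmi2", "f16c", "fma", "abm", "movbe", "xsave"}
V4 = {"avx512f", "avx512bw", "avx512cd", "avx512dq", "avx512vl"}


def x86_64_level(flags):
    """Return the highest level (1 to 4) whose requirements are all met.

    Levels are cumulative: a CPU only counts as v3 if it also meets v2.
    """
    flags = set(flags)
    level = 1
    for required in (V2, V3, V4):
        if not required <= flags:
            break
        level += 1
    return level


def read_linux_cpu_flags(path="/proc/cpuinfo"):
    """Return the flag set of the first CPU listed, or an empty set.

    Linux-only; on other platforms the file is missing and we return set().
    """
    try:
        with open(path) as f:
            for line in f:
                if line.startswith("flags"):
                    return set(line.split(":", 1)[1].split())
    except OSError:
        pass
    return set()
```

Calling `x86_64_level(read_linux_cpu_flags())` on a Linux x86-64 machine gives its level; an empty flag set falls back to level 1, the baseline every x86-64 CPU supports.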

@itamarst
Author

itamarst commented Dec 5, 2024

There are multiple ways to handle different CPU microarchitectures:

  1. Use oldest possible CPU target; this is the default.
  2. Use a commonly-used CPU target, which is what we tried to do in this PR, and which I guess is a problem.
  3. Runtime dispatch inside the function, which is supported by C/C++/Rust and probably other languages, where you (a) compile multiple versions of a function and (b) choose one at runtime. I imagine this is annoying in C.
  4. Import time! This is a fun Python-specific one, where you ship multiple copies of the extension compiled against different targets, and then at import time check in Python what CPU you have, and then import the appropriate one for the current CPU. I have prototyped this.
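
Option 4 could look roughly like the sketch below. This is not the prototype mentioned above and not anything zfec ships. The package name `mypkg`, the extension module names `_ext_x86_64_v3` and `_ext_baseline`, and the short flag list used to stand in for "new enough CPU" are all hypothetical.

```python
import importlib


def candidate_builds(cpu_flags, package="mypkg"):
    """List extension module names to try, most optimised first.

    The module names and the flag check are hypothetical placeholders.
    """
    names = []
    # Only offer the optimised build if the CPU advertises what it needs.
    if {"avx2", "fma", "bmi2"} <= set(cpu_flags):
        names.append(f"{package}._ext_x86_64_v3")
    # The baseline build runs on any x86-64 CPU, so it is always last.
    names.append(f"{package}._ext_baseline")
    return names


def import_first_available(candidates):
    """Import and return the first module in `candidates` that loads."""
    errors = []
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError as exc:
            errors.append(f"{name}: {exc}")
    raise ImportError("no usable build found: " + "; ".join(errors))
```

The package's `__init__.py` would then call `import_first_available(candidate_builds(flags))` and re-export the result. The CPU check has to happen before the import: loading a build the CPU cannot run does not raise `ImportError`, it typically dies later with an illegal-instruction signal. The `ImportError` fallback only covers builds that are missing from the installed wheel.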

For zfec... maybe easier just to revert this change for now?

@hacklschorsch
Member

hacklschorsch commented Dec 8, 2024 via email

Development

Successfully merging this pull request may close these issues.

Try switching to -O3, see if it speeds things up
4 participants