Manage with nix #727
I consider this initiative by the Master first a blessing, then a must. Can we set up a small task force to help him?
… On 21 Jun 2020, at 16:48, Jacek Generowicz ***@***.***> wrote:
Why?
Our manage.sh environment management system is fragile, error-prone and rather user-unfriendly.
I would like to explore the possibility of replacing it with Nix <https://nixos.org/>.
There is much to say about Nix (lots of it good, some not so good), but from a very high-level perspective the most important points are that, if we can get this to work properly:
Simply cding into the IC directory would automatically initialize the appropriate development and execution environment.
Checking out a different commit should automatically update the environment to the one matching the checked out code.
The major downsides are:
There is currently no (sensible) way to install Nix without sudo. (But once you have installed it, installing any packages does not require admin rights.)
Nix is a VERY complex beast (but if we get it right, most of you will not see any of that complexity).
In summary ...
Setting this up properly will take considerable effort, but once that is done, the result should be very pleasant indeed for anyone working in IC.
Please help
In my initial attempts, lots of tests are failing. Some of this is likely to be because the package versions specified in manage.sh are ancient, while I'm using up-to-date ones in the new config. As I have been out of touch with IC for quite some time, most of these failures are completely meaningless to me, so I'd appreciate if someone who is more in touch could have a look to see if there are any obvious solutions to any of the problems.
How can I help?
Install Nix on your machine. The process is described here <https://nixos.org/download.html>. It's approximately:
curl -L https://nixos.org/nix/install | sh
Add . $HOME/.nix-profile/etc/profile.d/nix.sh to your shell configuration.
cd /path/to/IC
nix-shell
pytest
See if you can understand any of the test failures.
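The steps above could be wrapped in a small script. This is a minimal sketch, not something contained in this PR: the function names `require_cmd` and `run_ic_tests` are hypothetical, and it assumes that Nix is already installed and that the IC checkout contains a `shell.nix`. It uses `nix-shell --run` to execute pytest non-interactively instead of dropping into an interactive shell.

```shell
#!/bin/sh
# Sketch only: helper names are illustrative, not part of IC.

# Succeed if the named command is on PATH, otherwise report it and fail.
require_cmd() {
    if ! command -v "$1" >/dev/null 2>&1; then
        echo "missing command: $1" >&2
        return 1
    fi
}

# Run the IC test suite inside the environment described by shell.nix.
run_ic_tests() {
    ic_dir=$1
    require_cmd nix-shell || {
        echo "hint: source \$HOME/.nix-profile/etc/profile.d/nix.sh first" >&2
        return 1
    }
    # Subshell, so the caller's working directory is left untouched.
    (cd "$ic_dir" && nix-shell --run pytest)
}

# Example: run_ic_tests /path/to/IC
```

If `nix-shell` is not found, the most likely cause is that the profile script from the installation step has not been sourced in the current shell.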
Looking forward
If we can get this to work, I would propose to maintain this branch alongside master to see how it deals with evolution of requirements, pinning of versions etc. for quite some time. If, after a while, it proves to work reliably, we could use it to replace the manage.sh abomination.
Commit Summary
Remove use_coverage from conftest.py
Start work on shell.nix
Add outline of multi-python-version shell.nix
Run setup.py implicitly in nix-shell through buildPythonPackage
shellHook -> setuptoolsShellHook
Use setup.py build_ext --inplace in mkShell shellHook
Stop trying to set version: it's read-only
First stab at Nix version of .travis.yml
Hack around Nix install failure on Travis
Make home directory OS-independent in .travis.yml
File Changes
M .travis.yml <https://github.com/nextic/IC/pull/727/files#diff-354f30a63fb0907d4ad57269548329e3> (23)
M conftest.py <https://github.com/nextic/IC/pull/727/files#diff-c9b030b0828c1291b56fc10aba6baab6> (4)
A multi-python-version-shell.nix <https://github.com/nextic/IC/pull/727/files#diff-39563724f875438c2157f9e6b140db12> (61)
A nixpkgs_version.nix <https://github.com/nextic/IC/pull/727/files#diff-704d3b58c5739d51d06dbe121ad2ac17> (13)
A shell.nix <https://github.com/nextic/IC/pull/727/files#diff-5204cbe23b218cae9de0b1aedd4edcd2> (93)
|
Do you know if this works neatly in computer grids?
Does travis like that? |
What exactly? Do you mean the automagical environment selection? If you manage to get nix on there somehow, I struggle to see what could stop the automagic from working ... but ... well ... There are more things in heaven, earth and (above all) computer systems, Horatio, Than are dreamt of in my philosophy, so something might throw a spanner in the works.
Besides, the automagic is very convenient for developers, but in a production environment we'd probably use an explicit
Or do you mean getting Nix onto your grid machines in the first place? That I don't know.
|
Yes, I meant that
Good. These would be a major concern otherwise.
Yeah, this is another concern, because I'm not sure we can ask nix to be installed in certain places. |
Did you get this error?
this is while doing |
Yes, that's what I'm getting on Travis. On my machine I'm actually running NixOS as my main OS, and there
In principle we should be able to get reproducible builds, so that we get identical behaviour across machines, but I haven't bothered pinning any versions yet, so that might explain the difference. I'll try to reproduce that error locally. |
I've pinned the version of nixpkgs, and it now installs and runs the tests on Travis. You can see the test failures here. After a fetch of the latest commit on the branch, you should be able to reproduce the same results on your machines ... although I apparently can't (presumably because I'm running NixOS rather than just the Nix package manager on some other OS, and somehow the configurations aren't equivalent): while Travis fails 38 tests, my machine fails 227 tests. |
38 failed + 2 errored. |
I've rebased on top of #723 and all the tests pass on Travis.
I'd prefer to start with a clean slate, and try to explore different approaches to specific package pinning from there, rather than starting off with some random pinning scheme just to get the tests to pass with a random set of outdated package versions. What is holding up #723? I propose that we continue this work on top of #723, even if it isn't merged. |
Some tests fail due to a change in behavior in |
On macOS Travis generates the same 218 + 81 test failures + errors as I now get on my local NixOS. Can someone on a Mac reproduce these? |
Any idea of timescale? Hours, days, months? |
somewhere between hours and days |
I just remembered that the OSX build has been failing since we migrated to LFS. Can this be it? |
And we've just left it like that?
Looking at an OSX Travis log, the errors do look very similar, though the huge volume of output makes comparison difficult. Most of the test failures seem to be related to databases or HDF files, which seems to agree with the LFS-failure-is-responsible hypothesis.
I presume there are still people developing IC on macOS ... so I presume that someone can quickly confirm that the standard IC tests have been passing on macOS development machines since we started using Git LFS.
Please ... pretty please! Could someone with a Mac for whom the IC tests pass and who is aware of successfully using LFS please check what this branch does on an LFS-capable development Mac? This amounts to
So that's no more than 2 minutes of typing, if you type veery slowly, and then getting on with your life while waiting for the tests to run (and waiting for the nix installation to complete, which will interrupt your typing, but I don't recall that it takes terribly long). |
Let's draw their attention |
Yes, I can |
Where can I see what LFS artefacts I'm expected to have, if everything is working ok? I've logged into the LFS server but I'm not having much luck finding anything there. |
I don't understand what you mean |
I have tried the first line but nothing is downloaded. I have read on the Internet that |
A successful clone or checkout of IC should result in some files in the tree being there because Git LFS put them there. I would like to have some idea of what those files are. But don't worry, I have found solid proof that something is wrong with my Git LFS setup: find . -name "*sqlite*" -exec file {} \;
./invisible_cities/database/localdb.NEXT100DB.sqlite3: ASCII text
./invisible_cities/database/localdb.DEMOPPDB.sqlite3: ASCII text
./invisible_cities/database/localdb.NEWDB.sqlite3: ASCII text
$ find . -name "*sqlite*" -exec cat {} \;
version https://git-lfs.github.com/spec/v1
oid sha256:dc08bc015f6dff0d5103957527dd95974178fa110090d68005b3b9abfce3ac12
size 34603008
version https://git-lfs.github.com/spec/v1
oid sha256:543c744f37ad6c61fefea492dfb7f99117bd74b7cfab2f706cf34283de714ef1
size 5324800
version https://git-lfs.github.com/spec/v1
oid sha256:05dca8bbd4a17c3bf4118ec9ccd4dacf559352085abf55409ccf74162946af29
size 382033920 |
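The check shown above (text files where binary databases should be) can be automated. This is a sketch under assumptions, not code from the IC repository: the function names are hypothetical, and it relies only on the fact visible in the output above, namely that an unfetched Git LFS file is a small text stub whose first line is the `version https://git-lfs.github.com/spec/v1` header.

```shell
#!/bin/sh
# Sketch only: function names are illustrative, not part of IC or Git LFS.

LFS_SPEC_LINE='version https://git-lfs.github.com/spec/v1'

# Succeed if the file starts with the Git LFS pointer header.
# Only as many bytes as the header is long are read, so large
# binary files are cheap to check.
is_lfs_pointer() {
    head -c "${#LFS_SPEC_LINE}" "$1" 2>/dev/null | grep -qxF "$LFS_SPEC_LINE"
}

# Print every *sqlite* file under the given directory that is still a stub.
list_lfs_pointers() {
    find "$1" -type f -name '*sqlite*' | while IFS= read -r f; do
        if is_lfs_pointer "$f"; then
            printf '%s\n' "$f"
        fi
    done
}

# Example: list_lfs_pointers invisible_cities/database
```

Any path printed by `list_lfs_pointers` is a file whose real content was never downloaded. With a working Git LFS installation, `git lfs install` followed by `git lfs pull` in the checkout would normally replace such stubs with the real files.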
Hmm. What exact output do you get? How about sh <(curl https://nixos.org/nix/install) --darwin-use-unencrypted-nix-store-volume ? |
With the first one I have the output:
And with the second one:
|
Right, so the first one did download stuff, and even gave you the instructions for what to do on Catalina (i.e. the The zeros on the second one are a bit suspicious, but maybe there's some caching going on somewhere, after all, you are downloading, once again, exactly the same file as you downloaded earlier. What happens if you try the rest of the instructions ... i.e. continue with |
Yes, I tried it too just in case but:
@mmkekic gave me this link https://nixos.org/nix/manual/#sect-macos-installation |
Yes, there are problems on Catalina, and I have one more idea: usually these nix installation instructions show a line of code that includes some variation on the theme of downloading the installer and piping it to
in other words, exactly what you did the last time, with |
Do I have to delete something before?
|
The differences in behaviour were, indeed, down to git-lfs. I have activated git-lfs on my machine and on Travis OSX and we now get essentially identical behaviour on
(We still don't have a reliable way to install nix on Catalina ... I'm looking into it.) The test output is currently very noisy. Does anyone have any obvious ideas how to silence this? |
This reverts commit 632f281.
Probably an outdated question, but have you considered using docker instead? |
Can't speak for anyone else, but I have. I much prefer Nix. It's been a while since I've touched Docker, so I don't have an eloquent explanation of the many reasons for this on the tip of my tongue any more. As a pragmatic point, you are unlikely to get administrators of HPC systems to install Docker, because of the massive security implications. Nix can be installed without admin rights, though that approach doesn't work universally and is sub-optimal. |
Following this comment and re-reading what's been going on in this PR, it seems that there was an attempt to try nix on different machines, but I guess not on all of them. Did it work in the ones we've tried? Do we have any showstoppers? Do we understand any of the issues that can be dealt with? Also, I recall from a different conversation that there was going to be a major version change in nix (or something like that) that would make the client's life much much easier. Are we there yet? |
I think that there are 2 (and a bit) main obstacles:
In slightly more detail:
We also tried to install Nix in Singularity as that does tend to be available on many HPC systems. We didn't succeed, but I forget what the problems were. @jmbenlloch did the work. Along these lines, Nix has strong support for building Docker containers from Nix specifications. It also has (not as mature and almost completely undocumented, last time I looked) similar tools for producing Singularity containers. This might be a better alternative on such systems.
You are probably referring to Nix Flakes.
A new version of Nix (2.4) was released about 2 weeks ago. IIUC (I've had zero time to look into it), it has Flakes enabled by default, so you don't have to fiddle around with switching on experimental features (or maybe even installing a pre-release version of Nix) in order to be able to use Flakes. The new Nix version also includes a much more feature-complete version of the new Nix CLI. Both Flakes and the new CLI are almost certainly the future of Nix. But Flakes are still marked as experimental, even though they haven't changed significantly for something like 2 years. In brief, are we there yet? We seem to be approaching there asymptotically. |
Edit: the original text that appeared here is not appropriate for use in the
merge commit message (as our practice dictates). Here is what the merge commit
should say; the original is preserved, lower down.
Merge commit text
Replace the fragile, error-prone, user-unfriendly and high-maintenance
manage.sh/Conda approach to installing dependencies and managing their
versions, with Nix and direnv.
The disadvantage of the new system is that the user must ensure that Nix and
direnv are installed: this cannot be automated to the extent that installation
of conda was automated in the old system. I have tried to provide tools and
documentation that streamline this process as much as possible, in the
doc/nix directory.
However, this needs to be done only once on any machine; thereafter the huge
advantages are:
Simply cding into the IC directory ensures that all the dependencies are
installed and made available.
Checking out a different commit automatically switches to the
corresponding versions of the dependencies, if necessary.
Nix provides a far greater set of packages than Conda. Consequently we can
provide--for example--debugging, profiling, benchmarking, visualization,
development, etc. tools (perhaps only on specific branches where they are
useful) without the user having to make any effort to install them.
Nix is far more robust, and can provide stronger guarantees about various
packages working together correctly, than Conda can.
For most IC contributors, the first two points should be the most visible in
day-to-day work.
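As an illustration of how the first two points can come about, here is a sketch of the direnv side. It rests on assumptions: this text does not show the repository's actual `.envrc`, so the single line `use nix` (direnv's standard directive for loading a `shell.nix`) is assumed, and `write_envrc` is a hypothetical helper.

```shell
#!/bin/sh
# Sketch only: the .envrc contents are an assumption, not copied from IC.

# Create a minimal .envrc telling direnv to load the directory's shell.nix.
write_envrc() {
    printf 'use nix\n' > "$1/.envrc"
}

# One-time setup per user (bash shown; direnv supports other shells too):
#   echo 'eval "$(direnv hook bash)"' >> ~/.bashrc
# One-time approval per checkout, needed before direnv will load .envrc:
#   cd /path/to/IC && direnv allow
```

Once the hook is active and the `.envrc` has been approved, direnv re-evaluates the environment when you enter the directory, and again when the files it watches change, for example after checking out a different commit.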
Original text:
Why?
Our manage.sh environment management system is fragile, error-prone and rather user-unfriendly.
I would like to explore the possibility of replacing it with Nix.
There is much to say about Nix (lots of it good, some not so good), but from a very high-level perspective the most important points are that, if we can get this to work properly:
Simply cding into the IC directory would automatically initialize the appropriate development and execution environment.
Checking out a different commit should automatically update the environment to the one matching the checked out code.
The major downsides are:
Nix assumes that you have root privileges, in order to install it. (But once you have installed Nix, using it to install packages does not require admin rights.) Installing Nix without root privileges requires some extra fiddling: We have to check that we can install Nix on any machines we might want to use for production.
Nix is a VERY complex beast (but if we get it right, most of you will not see any of that complexity).
In summary ...
Setting this up properly will take considerable effort, but once that is done, the result should be very pleasant indeed for anyone working in IC.
Please help
In my initial attempts, lots of tests are failing. Some of this is likely to be because the package versions specified in manage.sh are ancient, while I'm using up-to-date ones in the new config. As I have been out of touch with IC for quite some time, most of these failures are completely meaningless to me, so I'd appreciate if someone who is more in touch could have a look to see if there are any obvious solutions to any of the problems.
How can I help?
Install Nix on your machine. The process is described here. It's approximately:
curl -L https://nixos.org/nix/install | sh
Add . $HOME/.nix-profile/etc/profile.d/nix.sh to your shell configuration.
cd /path/to/IC
nix-shell
pytest
See if you can understand any of the test failures.
Looking forward
If we can get this to work, I would propose to maintain this branch alongside master to see how it deals with evolution of requirements, pinning of versions etc. for quite some time. If, after a while, it proves to work reliably, we could use it to replace the manage.sh abomination.