RFC: Add IPFS to Nix #1167
Conversation
Adds support for `cat` and `add` of `.nar` files to IPFS. If IPFS should be used to fetch `.nar` files without using the API interface, a gateway can be used as well. Adding files through a gateway is not possible.

Signed-off-by: Maximilian Güntner <[email protected]>
Is the `.nar` file compressed? If not, you might want to enable the rabin-fingerprint chunker when writing the `.nar` files. This will allow deduplication of identical files inside multiple archives. I don't think IPFS uses rabin by default yet. From the command line that is done with `ipfs add --chunker rabin FILE`. Even compressed archives can work if you use an 'rsync-able' compression algorithm.
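For reference, selecting the chunker through the daemon's HTTP API (rather than the CLI) could look roughly like the sketch below. This is not part of the PR; it assumes a local go-ipfs daemon on `127.0.0.1:5001` and libcurl:

```cpp
#include <curl/curl.h>
#include <stdexcept>
#include <string>

// Collect the HTTP response body into a std::string.
static size_t collect(void * data, size_t size, size_t nmemb, void * userp)
{
    static_cast<std::string *>(userp)->append(static_cast<char *>(data), size * nmemb);
    return size * nmemb;
}

// Sketch: add a file with the rabin chunker via /api/v0/add?chunker=rabin.
std::string ipfsAddRabin(const std::string & path)
{
    CURL * curl = curl_easy_init();
    if (!curl) throw std::runtime_error("curl init failed");
    std::string response;
    curl_httppost * post = nullptr, * last = nullptr;
    curl_formadd(&post, &last,
        CURLFORM_COPYNAME, "file",
        CURLFORM_FILE, path.c_str(),
        CURLFORM_END);
    curl_easy_setopt(curl, CURLOPT_URL,
        "http://127.0.0.1:5001/api/v0/add?chunker=rabin");
    curl_easy_setopt(curl, CURLOPT_HTTPPOST, post);
    curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, collect);
    curl_easy_setopt(curl, CURLOPT_WRITEDATA, &response);
    CURLcode res = curl_easy_perform(curl);
    curl_formfree(post);
    curl_easy_cleanup(curl);
    if (res != CURLE_OK)
        throw std::runtime_error(curl_easy_strerror(res));
    return response; // JSON: {"Name":...,"Hash":...,"Size":...}
}
```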
@wscott: The `.nar` files are compressed (see `narCompressed` in the code below).
```diff
@@ -4,14 +4,48 @@ libstore_NAME = libnixstore

 libstore_DIR := $(d)

-libstore_SOURCES := $(wildcard $(d)/*.cc)
+libstore_SOURCES := \
```
Why is `wildcard` not used?
Since the additional IPFS sources are the second config-dependent input for libstore (s3-binary-cache-store.cc being the first), I wanted a clean solution instead of adding a lot of `#if ENABLE` guards to the source files. That way, only the sources that the config requires are compiled and linked. It makes the build process a bit cleaner and easier to debug.
Signed-off-by: Maximilian Güntner <[email protected]>
Publishing to IPFS is now optional (default: off).
Added for future reference. The fingerprint that gets signed is computed as follows [1]:

```cpp
std::string ValidPathInfo::fingerprint() const
{
    if (narSize == 0 || !narHash)
        throw Error(format("cannot calculate fingerprint of path ‘%s’ because its size/hash is not known")
            % path);
    return
        "1;" + path + ";"
        + printHashType(narHash.type) + ":" + printHash32(narHash) + ";"
        + std::to_string(narSize) + ";"
        + concatStringsSep(",", references);
}
```

So the IPFSHash is signed indirectly through the NarHash.

[1] From: https://github.com/NixOS/nix/blob/master/src/libstore/store-api.cc#L523
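Concretely: because `narHash` and `narSize` are inside the signed fingerprint, a `.nar` obtained from IPFS can be checked the same way as one fetched over HTTP. A rough sketch of that check, using the libstore/libutil types above (the helper name is illustrative, not actual Nix code):

```cpp
#include "store-api.hh"   // ValidPathInfo
#include "hash.hh"        // hashString, Hash

// Hypothetical helper: verify a nar fetched via IPFS against the metadata
// covered by the signed fingerprint() above (narSize and narHash).
void checkNarFromIpfs(const std::string & nar, const nix::ValidPathInfo & info)
{
    using namespace nix;
    if (nar.size() != info.narSize)
        throw Error(format("size mismatch in NAR for ‘%s’") % info.path);
    Hash h = hashString(info.narHash.type, nar);
    if (h != info.narHash)
        throw Error(format("hash mismatch in NAR for ‘%s’") % info.path);
}
```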
Can you write up a quick summary of what IPFS is and why I should care?
@shlevy I cannot speak for @mguentner, but @vcunat provided some motivation for IPFS.
I haven't been carefully tracking the IPFS threads, but since nobody else answered, I will pile my thoughts here, and people will correct me. As far as I understand, IPFS can act as a global storage for NARs, so people can choose to host their own builds or cache builds from others. This could take some bandwidth load off the Hydra S3 bucket. Some enthusiasts could build and share things that are not currently built by Hydra, like Python packages. This is already possible now, but it requires doing two things: distributing the URL of your binary cache, and getting users to trust your signing key.
IPFS could eliminate the first step, since the namespace becomes global and NARs could probably be discovered through IPLD. The current implementation doesn't do that, because it requires the IPFS address to be served with the nar-info. But the distributed NAR hosting should work already. There was also a discussion about implementing file or chunk deduplication over IPFS, which could have potential for reducing the size of things. Is this supposed to happen for the download size or the size on disk? I don't know.

Anything written above might be wrong. I don't claim that I possess extensive knowledge of the topic discussed. Please, don't get angry :)
I wrote an article which explains why IPFS could be useful for NixOS. Also I think that you are quite right @veprbl 👍

Cool, thanks! Awesome idea.
```cpp
    else if (cmd == "cat_gw")
        return "/ipfs/" + arg;
    else
        throw "No such command";
```
I don't think we catch strings anywhere, so this should be `throw Error("...")`.
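A sketch of what the corrected branch could look like, using the `Error`/`format` idiom from libutil; the surrounding function and the `cat` endpoint are illustrative, not the PR's actual code:

```cpp
#include "types.hh"   // nix::Error, nix::format

// Map an IPFS command to a request path; unknown commands now raise a
// proper nix::Error instead of a raw string.
std::string ipfsRequestPath(const std::string & cmd, const std::string & arg)
{
    if (cmd == "cat")
        return "/api/v0/cat?arg=" + arg;   // illustrative API endpoint
    else if (cmd == "cat_gw")
        return "/ipfs/" + arg;             // gateway path, as in the PR
    else
        throw nix::Error(nix::format("unknown IPFS command ‘%s’") % cmd);
}
```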
```cpp
    }
    ipfsHash = IPFSAccessor::addFile(narPath, *narCompressed);

    if (!ipfsHash.empty()) {
```
Why would `ipfsHash` be empty here?
In any case, it shouldn't -- an `IpfsNode` should emit the hash + name to the client when the process completes without an error.
The file is uploaded through the HTTP API and a lot can go wrong there. As for the C++ part, this is the relevant code: https://github.com/vasild/cpp-ipfs-api/blob/master/src/client.cc#L164

I have tested the code in this PR with more paths after posting this RFC, and quite a few requests failed silently, as the function is `void` and does not throw anything. As this is unacceptable, the next iteration of the implementation needs to include error handling when adding files, if this feature is included at all.

Reason: my research into IPFS revealed that one needs to pay attention to a lot of things (it's not FTP after all). These include trivial things like selecting a chunker, and rather complex tasks like collecting garbage after n `ipfs add`s while not throwing away unpinned content (race condition). From a design perspective, adding nars/Nix store paths to IPFS must be handled by a separate tool, as this is too much complexity to go into Nix (following Ken Thompson's philosophy here).

(Also part of the reason for #1167 (comment))
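To illustrate the missing error handling: a hedged sketch that treats an empty add result as a hard failure, with the call shape taken from cpp-ipfs-api's README (the vendored copy in this PR may differ):

```cpp
#include <ipfs/client.h>
#include <stdexcept>
#include <string>

// Sketch only: fail loudly if the add result carries no hash, instead of
// silently returning an empty string as described above.
std::string addNarChecked(const std::string & name, const std::string & contents)
{
    ipfs::Client client("localhost", 5001);
    ipfs::Json result;
    client.FilesAdd(
        {{name, ipfs::http::FileUpload::Type::kFileContents, contents}},
        &result);
    if (result.empty() || result[0].value("hash", std::string()).empty())
        throw std::runtime_error("adding " + name + " to IPFS failed");
    return result[0]["hash"].get<std::string>();
}
```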
```diff
@@ -41,6 +41,8 @@ NarInfo::NarInfo(const Store & store, const std::string & s, const std::string &
         compression = value;
     else if (name == "FileHash")
         fileHash = parseHashField(value);
+    else if (name == "IPFSHash")
```
Should we call this `IpfsNarHash` or something, for future compatibility with computing NARs on the fly from some nicer format? Or am I jumping the gun for a file called `nar-info.cc` after all :).
```diff
@@ -290,8 +310,23 @@ bool BinaryCacheStore::isValidPathUncached(const Path & storePath)
 void BinaryCacheStore::narFromPath(const Path & storePath, Sink & sink)
 {
     auto info = queryPathInfo(storePath).cast<const NarInfo>();
+    std::shared_ptr<std::string> nar;
+
+#if ENABLE_IPFS
```
As commented, downloading from IPFS can open the door to new attacks. I would recommend adding a `download-from-ipfs` flag as well, which is disabled by default.
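To make the suggestion concrete, a hedged sketch of gating the IPFS path behind a default-off flag; `downloadFromIpfs` and both fetch helpers are placeholders, not actual Nix APIs:

```cpp
#include <memory>
#include <string>

// Placeholder transports, assumed to return nullptr on failure.
std::shared_ptr<std::string> fetchFromIpfs(const std::string & ipfsHash);
std::shared_ptr<std::string> fetchFromHttp(const std::string & narUrl);

// Only consult IPFS when the (proposed) download-from-ipfs flag is set;
// otherwise, or if IPFS fails, fall back to plain HTTP.
std::shared_ptr<std::string> fetchNar(
    const std::string & ipfsHash, const std::string & narUrl,
    bool downloadFromIpfs /* default-off setting */)
{
    std::shared_ptr<std::string> nar;
    if (downloadFromIpfs && !ipfsHash.empty())
        nar = fetchFromIpfs(ipfsHash);
    if (!nar)
        nar = fetchFromHttp(narUrl);
    return nar;
}
```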
Thank you for the reviews / comments. I am currently rewriting the implementation, as injecting IPFS directly into `BinaryCacheStore` is not a clean solution. Stay tuned.
@mguentner How would that work, and how would it handle narinfo files? The present approach seems reasonable to me. It just needs a flag to enable/disable IPFS (e.g. as part of the store URI).

Another possibility: rather than add an `IPFSHash` field, allow multiple `URL` fields in the `.narinfo`, one of which could point at IPFS. This would have the advantage of not putting transport-specific info in the NarInfo data structure.
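If the multiple-`URL` route were taken, the parser would collect every `URL` line instead of a single value. A minimal sketch, modeled on (not copied from) the NarInfo field handling shown earlier:

```cpp
#include <string>
#include <vector>

// Sketch: a narinfo could carry several URL fields, e.g.
//   URL: nar/1fz...abc.nar.xz
//   URL: ipfs://QmYwAPJzv5CZsnA625s3Xf2nemtYgPpHdWEz79ojWnPbdG
// and the parser would keep all of them, leaving transport choice to the client.
struct UrlFields
{
    std::vector<std::string> urls;

    void handleField(const std::string & name, const std::string & value)
    {
        if (name == "URL")
            urls.push_back(value);
    }
};
```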
@mguentner I like that proposed code structure. From just how you described it (I haven't read the interface you're implementing), I think having two separate implementations segues into a narless world nicely.
The binary cache implementation is finished. Have a look at the code - there are some small TODOs in there, but it just works 🎉 and can be merged soonish. If you want to test it (easy: nixops VirtualBox config), have a look at https://github.com/mguentner/nix-ipfs. Currently there is only
@edolstra Addressing your question: I like your idea of adding multiple `URL` fields.

Forget the above. I guess I get it...
@mguentner So what is the status of this? Why was it closed?
Not much has changed since #1167 (comment). Again, you can try the current status using this nixops config: https://github.com/mguentner/nix-ipfs (once the machines are set up, do
@mguentner Is there/will there be a way to partially mirror a channel? I have machines, but not enough storage for a complete channel...
You will be able to run a local gateway that serves content from the binary cache and then caches/redistributes the content until it is garbage collected (LRU), depending on how much storage you allocate for this. If you want to warm your cache with a partial channel, you need to write a script/NixOS test that requests all the hashes you are interested in storing.
This adds IPFS support to Nix. 🚀

It adds `.nar` files to IPFS as well, writes the resulting hash to the `.narinfo`, and signs the results. When the `.narinfo` is accessed, the `.nar` can be fetched from IPFS instead of using the HTTP method.

Please have a look at a nixops hydra setup where this is explained in more detail: https://github.com/mguentner/nix-ipfs

This is a proof of concept. More code will follow once the design is approved and finished.
Ref: #859
Ref: ipfs/notes#51