support for gocryptfs, ecryptfs, borg, duplicity, and rsync #374
base: master
Conversation
…tiple file descriptors, as well as support for getting and setting xattrs on files.
Addresses issue:
not bad, would be nice to have a configurable LRU-style cache to help with reads/writes (with some kind of read-ahead)
@yadayada looks like the buildbot needs a new oauth token to test properly. I see this in the logs:
I've implemented proper mtime handling in one of the xattrs so that rsync over acd_cli can work as expected. This addresses:
Why not "backport" the write buffer feature as a general write-back cache for acd_cli? That would fix problems with ecryptfs, encfs, and any other application where data is appended in small blocks (which overloads the acd_cli API and eats 100% CPU).
…ructs leads to epsilon problems, causing rsync to think that mtime is different when it isn't.
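For context, here is a minimal sketch of one way to sidestep that kind of epsilon problem, assuming mtime is round-tripped through the xattr store as text; the helper names are hypothetical and not taken from this PR:

```python
# Hypothetical sketch: persist mtime with fixed precision instead of a raw
# float repr, so serializing and re-parsing cannot drift by an epsilon that
# makes rsync believe the timestamp changed.

def encode_mtime(mtime: float) -> str:
    # microsecond precision is more than enough for rsync's comparison
    return '{:.6f}'.format(mtime)

def decode_mtime(value: str) -> float:
    return float(value)

def mtimes_equal(a: float, b: float, tolerance: float = 1e-6) -> bool:
    # compare with an explicit tolerance rather than exact float equality
    return abs(a - b) <= tolerance
```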
… using xattrs for crypto headers, we have to allow re-writing the first bytes of a file to make ecryptfs happy. once they fix their bug, this can be removed and we can go back to xattrs.
Turns out that ecryptfs has a subtle bug when it stores its crypto headers in xattrs; it reports the file size incorrectly the next time it's mounted. That means rsync will behave properly only if your mount has perfect uptime! :-) Until they fix that, I've allowed the acd fuse mount to overwrite the first few bytes of a file where the crypto header would go. Because we still need to write to Amazon sequentially, I'm solving this by storing the header in xattr space and splicing it back into the byte stream on read. This still seems better than requiring whole files to be kept in memory until fully written.
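Roughly what that splice could look like, as a hedged sketch; the xattr key and helper callables are invented for illustration and are not the PR's actual code:

```python
HEADER_XATTR = 'user.acd.header'  # hypothetical key holding the rewritten first bytes

def read_with_spliced_header(node, offset, length, read_remote, get_xattr):
    """Serve a read by overlaying the locally stored crypto header on top of
    the first bytes of the remote byte stream."""
    data = bytearray(read_remote(node, offset, length))
    header = get_xattr(node, HEADER_XATTR)  # bytes or None
    if header and offset < len(header):
        overlap = header[offset:offset + length]
        data[:len(overlap)] = overlap  # splice the header region back in
    return bytes(data)
```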
…there are some rsync flags that write multiple times to the same memory location, for reasons unknown. this keeps the whole file in a buffer until it's flushed to amazon when the file handle is closed. future work: for super large files, we should use a temp file as backing.
…d first and filling in the gaps
…rtial) buffer in the middle of writing is hard enough that we bail on it. We don't care about pre-allocating files since we have infinite space, and shortening a file is only possible while it's being written to... so we can only catch the rare use cases of file overwrites and truncating back. Neither is worth it.
I've finally gotten rsync, ecryptfs, and acd_fuse playing nicely together. There were enough corner cases around rsync flags I can't control (thanks Synology!) and some older kernel versions that make ecryptfs call useless truncates before flushing (thanks Synology!) that the best way to make it all work is to build a write buffer in memory until all the interested file handles are closed. This allows multiple writes to the same offset and out-of-order writes as long as nothing leaps forward with a gap, and it eliminates the hack of putting encrypted headers into xattr space. Further work will be to use temp-file backing rather than memory backing if individual files get too large.
@bgemmill this is covered in #314 and is not ecryptfs specific. It'd help with performance and with other apps that write to file handles non-linearly. It'd be awesome if you could port the write buffer feature as a separate PR (separate flag/option) which this one can depend on. Hint: https://github.com/redbo/cloudfuse/blob/master/cloudfuse.c#L256-L289
@Thinkscape I'm only going to pursue the file backing if the write memory backing is too memory intensive. At the moment this PR makes both ecryptfs and rsync work properly, uses memory for only the files being written at any given moment, and that seems like a good place to leave it. The way I'm looking at it is that this PR is the one that the file backing PR should depend on. File caching is going to require a bit of thought too, because unless we're smart about LRU like @jrwr pointed out, we'd end up doubling the on-disk space in the process of rsyncing to Amazon.
The LRU cache is something different from what I meant. The caching Swift FUSE does is per file handle: a process opens a file handle for writing, writes as much or as little as it likes, and closes the handle. That's what most rsync-like streamers and updaters will do. Of course memory backing will be too memory intensive. If you attempt to rsync or random-write an 8 GB file, it'll gladly consume 8+ GB of RAM.
@Thinkscape Thanks for clarifying. @jrwr's point as I understood it was what to do with that temporary file once you're done: delete it immediately, keep it around for faster reading, LRU it out, something else?

As to memory backing, I'm in the middle of going through my wedding videos and haven't seen a huge hiccup. I'd imagine that's virtual memory doing what you suggest with swapping; I'll have more info tomorrow when my rsync job finishes. Looking at the job in the middle of today:

For me, the steady-state usage seems to be about ~400M for this docker image on an 8G box, and a few big files passed through since virtual is around 900M now. Caveat: this is an instantaneous measure rather than peak, and I don't know what reserved was when the big file went through. I can tell experimentally that this hasn't ground to a halt on swap or thrown python MemoryErrors. We'll see how the rest of the day goes. Once it finishes I'll look more.

If you want to give it a go before then, fire up a docker container with a 6G RAM limit and do:
Yeah, but why? If it buffers it in RAM, of course it'll die with a big file.
…onfigurable) or smaller, disk otherwise.
@Thinkscape It turns out that if you run that example you'd see what I did: no real performance hiccups, because the docker memory clamping forces the older bits of big buffers into swap. File backing the old-school way. To make this change set more palatable to non-docker users of fuse, I put in file backing if writing gets too large. At the moment the default is 1G. On a different note, it looks like Synology's rsync directory querying fails when directories contain around 10k things; that many calls to getattr take longer than the timeout allows. I'm going to tackle that next since everyone probably wants 'ls -al' to complete quickly.
Thanks. We cannot depend on any specific virtualization or OS feature to automagically manage memory for us. Rsync usually takes just a few megs of RAM regardless of the tree size or individual files' grandeur, and that's what I'd expect from a fuse driver as well. Even 1G seems excessive to me, but at least it's configurable.
One thing I am noticing is that something in the c/p phase is very slow. I am using CouchPotato to move some stuff around and the initial upload and transfer are fast. I can see the results within ACD no problem. However it takes almost a day before CP reports that the copy process is complete. See timestamps:
Does this have something to do with locking? Using encfs atm. My acd_cli.log doesn't go that far back; I will see if I can reproduce and get those logs.
@bgemmill Thank you for the schema fix! That was throwing me for a loop. Unfortunately I am hitting memory issues with the update. I started seeing this in the syslog: Out of memory: Kill process 8398 (acdcli) score 850 or sacrifice child. So I doubled the RAM from 4 to 8GB but continue to see acdcli use more memory than is available. This is running on Ubuntu 16.04 if that makes any difference. How much RAM are you assuming for the small syncs?
@BabyDino: I'm not familiar with CP; do you see the same behavior with rsync?

@Hoppersusa: 1G on the small syncs, same as the write-back cache size. Above that it goes to the disk, or at least should according to SpooledTemporaryFile:
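For anyone unfamiliar with it, here is a minimal sketch of the standard-library behavior being relied on; the 1 GiB constant mirrors the default mentioned above and is otherwise just an assumption:

```python
import tempfile

MAX_IN_MEMORY = 1 * 1024 ** 3  # spill to disk past ~1 GiB, per the default above

buf = tempfile.SpooledTemporaryFile(max_size=MAX_IN_MEMORY)
buf.write(b'small writes stay in RAM')
# Once the amount written exceeds max_size, the buffer transparently rolls
# over to an anonymous on-disk temporary file.
buf.seek(0)
data = buf.read()
buf.close()
```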
@bgemmill I will test rsync. For CP, this is their source code:
The |
@bgemmill Thank you for the detail; it does not appear to be during sync calls. I don't believe it is the SpooledTemporaryFile change that is causing the problem. It appears that when writes are made, the system allocates memory but the memory is not released once the write to Amazon is completed. I increased the memory to 16GB and acdcli consumed all of it again. I rolled back to commit 26325db and could not reproduce the issue (although the performance is not as good as your current commit), but I could reproduce it on the builds since. I hoped it was something on my system but have not been able to find a cause. The commit that did not run out of memory may just take longer to process the files. Is there any way to see lower-level detail than the standard acdcli debug output?
r = self.BOReq.post(self.metadata_url + 'nodes', acc_codes=acc_codes, data=body_str)
if r.status_code in RETRY_CODES: continue  # the fault lies not in our stars, but in amazon
This can theoretically lead to an endless loop; a maximum retry count could avoid that.
In conditions where the Amazon server always returns a 500, this will never end and will only needlessly stress the client and the server.
In practice, Amazon kicks the client well before any infinite loop happens, usually with an error in the 400 range that we're deliberately not catching. I've seen errors in the family of requests-per-second limits, "unable to process", and sometimes just "unavailable" when Amazon goes down for short periods and needs to kick clients. We're not catching those on purpose!
Otherwise, for file system stability the goal here is to reduce errors we'd need to propagate upwards to the user.
If you have an example of a non-ending loop, please send me the response codes involved and I'll change the set we retry over.
The example is that the Amazon server always returns a 500 for every retry. My comment was meant to suggest changing the code so that it is prepared for the exceptional case. We all make mistakes, so it is possible that the loop could happen in the future. Having a retry limit would allow handling such situations gracefully.
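A hedged sketch of the bounded retry being suggested; MAX_RETRIES, the backoff, and the particular codes in RETRY_CODES are assumptions for illustration, not values from this PR:

```python
import time

MAX_RETRIES = 5  # assumed cap, not taken from the PR
RETRY_CODES = {500, 502, 503, 504}

def post_with_retries(post, url, **kwargs):
    """Retry transient server errors a bounded number of times, then give up
    and hand the last response back to the caller."""
    for attempt in range(MAX_RETRIES):
        r = post(url, **kwargs)
        if r.status_code not in RETRY_CODES:
            return r
        time.sleep(2 ** attempt)  # simple exponential backoff between attempts
    return r
```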
Completely agreed there are things that could break. At the moment I'm focusing on the things that are breaking. If you have an acdcli log of an infinite loop happening in practice, please post it!
@Hoppersusa are you sure that the set of operations you're using on files ultimately releases the file handles? I'd be interested to see if you can repeat this error. Is it possible you're doing multithreaded writing and have many of these write-back caches open at once?
@bgemmill At some point your branch stopped working with the "subdir" option that I use. If I use the official branch from yadayada it seems to work. If I omit the subdir option I can mount on the root. Here is the mount command that I use:
Here is how it mounts (ls -al /mnt/acd_mount_enc):
It attempts to mount it on the root of ACD rather than the subdir, and even then it isn't mounted properly. If someone can tell me how to roll back commits and install older versions of the code, I could figure out which commit this occurred on. It seems like it has happened within the last two weeks. BTW, thanks for all of the great work on this. Being able to use rsync has been a great help for me.
@ro345 thanks for pointing it out; I'm pretty sure the failure comes from the work I was doing on memory caching to speed up operations. I probably won't have time to look at this until next week or so. My money is on this commit: If you want to verify, check out my repo in git,
@ro345 it turns out that the fuse subdir module was the culprit. Let me know if that happens again.
Thank you for the fix, really appreciate it. I probably can't try it for a few days, but will post back if there are issues.
Hi! I have used acd_cli from yadayada and have many files in the Amazon cloud.
Installation
First steps
First I create the oauth_data file, then
after
Is this a normal behavior?
Errors in log
mount with
If I use
Is this a normal behavior? Thanks for your attention
@SchnorcherSepp That error looks like acdcli can't find the root node in the cache, which matches what you mention. Since your nodes.db may be in a funny state now, I'd recommend deleting it and calling
# Conflicts:
#	acdcli/cache/schema.py
#	docs/contributors.rst
For everyone still following this PR, we're back, with a caveat: Amazon deletes properties from its records for apps that have been banned. This means mtime/uid/gid/xattrs will be gone from your files, and naive rsync calls will attempt to re-transfer everything. For future work I'm going to look at a few ways to automate setting properties on files with matching md5s.
# Conflicts:
#	acdcli/acd_fuse.py
Edited summary follows since this PR thread has gotten long.
First, please direct any issues you have with this PR here so this thread doesn't get any more out of control:
https://github.com/bgemmill/acd_cli
This PR provides two primary features to allow rsyncing into a layered encrypted filesystem:
And a few caches to make the above features performant:
With those implemented, it was pretty simple to add a few other things to flesh out our filesystem support:
du and stat
The rationale for out of order rewriting is that most encrypting file systems maintain a header around the beginning of the file that gets updated as the rest of the file is written. This means that write patterns typically look like sets of [append to end, overwrite at beginning]. I'm solving this issue by using a write-back cache that stores file writes in a SpooledTemporaryFile until all file handles are closed, and only then pushing to amazon.
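A minimal sketch of that write-back pattern under the assumptions above (one buffer per node, pushed only when the last handle closes); the class and method names are illustrative rather than the PR's actual code:

```python
import tempfile

class WriteBackBuffer:
    """Collects every write for one node until the last handle is released."""

    def __init__(self, max_in_memory=1 * 1024 ** 3):
        self.buffer = tempfile.SpooledTemporaryFile(max_size=max_in_memory)
        self.open_handles = 0

    def write(self, offset, data):
        # overwrites of earlier offsets (e.g. the crypto header) are fine
        self.buffer.seek(offset)
        self.buffer.write(data)
        return len(data)

    def release(self, upload):
        # push to Amazon only once the final interested handle has closed
        self.open_handles -= 1
        if self.open_handles == 0:
            self.buffer.seek(0)
            upload(self.buffer)
            self.buffer.close()
```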
The rationale for mtime is that rsync uses it for file equality testing. I'm implementing this by using one of the 10 properties an amazon app gets to store all file xattrs as a json object. Once mtime and xattrs were in place, it was straightforward to add the others.
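Conceptually, packing all xattrs into one property could look like the sketch below; the property name and the base64 handling are assumptions made for illustration:

```python
import base64
import json

XATTR_PROPERTY = 'xattrs'  # hypothetical name for the single ACD property used

def pack_xattrs(xattrs: dict) -> str:
    """Serialize every xattr (mtime, uid/gid, user.* keys) into one JSON string
    small enough to live in a single property slot."""
    return json.dumps({name: base64.b64encode(value).decode('ascii')
                       for name, value in xattrs.items()})

def unpack_xattrs(raw: str) -> dict:
    if not raw:
        return {}
    return {name: base64.b64decode(value)
            for name, value in json.loads(raw).items()}
```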
Considerations:
Please enjoy, and let me know if anything goes wrong!
Original post
Ecryptfs has two properties that we need to overcome in order to get it working with acd_cli.
Luckily, this PR addresses both :-)
ecryptfs writes files 4096 bytes at a time, using a different file handle each time. This PR allows multiple file handles to share a write buffer if they all write sequentially. To make this performant for large files (large numbers of file descriptors), I've added some lookup caching to how nodes are obtained.
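A sketch of how several handles might share one sequential write buffer per node; the names are invented, and the node-lookup caching mentioned here could be as simple as wrapping the path-to-node resolution in functools.lru_cache:

```python
class SharedWriteBuffers:
    """Maps a node id to a single write buffer shared by all open handles."""

    def __init__(self):
        self.buffers = {}  # node_id -> (buffer, next expected offset)

    def write(self, node_id, offset, data, new_buffer):
        buf, expected = self.buffers.get(node_id, (None, 0))
        if buf is None:
            buf = new_buffer()
        if offset != expected:
            # handles may interleave, but each write must continue exactly
            # where the previous one left off
            raise OSError('non-sequential write')
        buf.write(data)
        self.buffers[node_id] = (buf, offset + len(data))
        return len(data)
```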
ecryptfs wants to write a cryptographic checksum at the beginning of the file once it's done. We could either buffer everything before sending, which would be memory intensive for big files, or we could have ecryptfs store this checksum in the file's xattr instead. I've opted to go this route, which required implementing xattrs over ACD using one of our allowed properties.
Additionally, ecryptfs is extremely chatty about when it decides to write to this buffer. To deal with this, xattrs are marked as dirty and only sent over the wire when a file has all of its handles closed, or when fuse is unloaded.
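One way to picture that dirty-flag bookkeeping, as a hedged sketch with invented names:

```python
class NodeXattrs:
    """Keeps a node's xattrs locally and only pushes them when dirty."""

    def __init__(self):
        self.values = {}       # xattr name -> bytes
        self.dirty = False
        self.open_handles = 0

    def setxattr(self, name, value):
        self.values[name] = value
        self.dirty = True      # chatty writers only touch local state here

    def maybe_flush(self, push_to_amazon):
        # called when a handle closes, and again when fuse is unloaded
        if self.dirty and self.open_handles == 0:
            push_to_amazon(self.values)
            self.dirty = False
```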
With these changes, I can get about 80% of my unencrypted speed to ACD at home using an encrypted mount. If everything in this PR looks good, I have a few ideas of where to push that a bit more.
Please let me know if I grokked the fusepy threading model properly; that's the piece I was least sure about, especially how safe or unsafe some things are with the GIL.