LittleFS file caching #18
So I'm working on branch: https://github.com/joltwallet/esp_littlefs/tree/multi-fd-sync. I added your code snippet as a unit test, and there is some non-determinism going on. It usually works the first time, but fails the second time. Trying to figure out if it's a problem in my port, or if it's an upstream issue.
Interesting, is it using a cache per file handle or is it per file in the FS? If it is per handle, is it possible the caches are getting out of sync?
so within my port, the function
operating under the assumption that upstream littlefs cannot handle the same file being open multiple times (waiting to hear back from their authors), I can add this functionality to my port relatively easily. However, in the easy way to implement this, opening the same file multiple times would return the same small-int-value file descriptor. Would this be acceptable? For example:
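The original example snippet was not preserved in this thread. As an illustration of the proposal (hypothetical names, not the actual esp_littlefs code), a small descriptor table keyed by path, where re-opening an already-open path hands back the same small-int fd, might look like:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical sketch: a tiny descriptor table keyed by path.
 * Re-opening an already-open path returns the SAME small-int fd. */
#define MAX_FDS 8

typedef struct {
    char path[64];
    int  open_count;   /* how many times this file is currently open */
} fd_entry_t;

static fd_entry_t fd_table[MAX_FDS];

int vfs_open(const char *path)
{
    int free_slot = -1;
    for (int fd = 0; fd < MAX_FDS; fd++) {
        if (fd_table[fd].open_count > 0 &&
            strcmp(fd_table[fd].path, path) == 0) {
            fd_table[fd].open_count++;   /* already open: reuse the fd */
            return fd;
        }
        if (fd_table[fd].open_count == 0 && free_slot < 0)
            free_slot = fd;
    }
    if (free_slot < 0) return -1;        /* table full */
    snprintf(fd_table[free_slot].path, sizeof fd_table[free_slot].path,
             "%s", path);
    fd_table[free_slot].open_count = 1;
    return free_slot;
}

void vfs_close(int fd)
{
    if (fd >= 0 && fd < MAX_FDS && fd_table[fd].open_count > 0)
        fd_table[fd].open_count--;
}
```

The underlying littlefs file object would only be opened on the first `vfs_open` and released when the count drops back to zero.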
I'd suggest tracking N usages, so your VFS adapter holds onto the underlying file object until the last descriptor is closed.
that's exactly how I was going to implement it 👍 . I can work on that more this evening. It seems simpler to have a count in the internal file structure.
Once you have something ready to test I can test it in my application and see how it performs. If you have a CAN-USB adapter for your PC and a CAN transceiver, you could possibly test it locally yourself as well.
@atanisoft I implemented (and merged) the lazy solution where the same small-integer FD is returned. It required significantly less code change, and doing it the "proper" way may have some performance impact on people that really push the file system. Let me know if this solution works for you (it passes the unit test). Thanks! See: |
I put one comment on #19, but it won't impact my testing of the fix. I'll do some testing in the full setup and confirm there are no issues.
This appears to have fixed the caching issue I was seeing. The dependency that opens the same file multiple times will still need more work so that it can be consolidated down to a single file handle for all flows.
I can't see your comment on the PR, are you sure you submitted it? Also, after sleeping on it, I realized that this fix doesn't totally solve your issue, unless your use case is exactly reflected in that snippet. With the current solution, each individual file descriptor doesn't have its own position in the file. Thinking of a way around this...
Yeah, I was thinking of the same and was considering commenting on it as well.
Somehow it got stuck in a Pending state...
alright, I have a greater understanding of the upstream lfs codebase now, and I also realized my test would pass the first time due to artifacts left behind from a previous failed test. Now that I know nothing magical is going on behind the scenes (like the lfs file struct magically getting updated), I can tackle this a bit better.
Alright here is my new attack plan:
With this new setup, every
How does this plan sound? Possible downsides:
I think overall it sounds good, but do add some locking of some sort to ensure no two threads are directly modifying the
all the above operations would be done while the filesystem mutex is held, so there should be no issues.
@X-Ryl669 do you have any thoughts on this? You have a keen eye for these sorts of things.
Honestly, this is a lot of change for an arguably invalid usage pattern (so it adds a lot of potential bugs and multitasking issues).
Please notice that the file might not be opened with the same attributes (for example, once in read mode and later in read/write mode). So unless it's fixed upstream, there is no valid solution that's not a hack like the proposed implementation above, IMHO.
Note that this is a perfectly valid use case: multiple file handles on the same file from different threads. It works on other FS types (including SPIFFS). But you are right, the multiple file handles would potentially have different mode flags, which would need to be accounted for in the read/write implementations.
I never had the need for this. Anyway, I don't think the HAL is the right layer to implement this if the underlying driver does not, since in all cases it'll lead to bugs that are almost impossible to solve here (I'm not even talking about partial writes, truncate, and all the other oddities).
Yeah, this would definitely be considered a corner case.
I suspect the underlying FS code does support this but it may be dependent on using the same
If that is how this is implemented, LittleFS will no longer be an option for a number of projects I know of. If there was a way to disable or reduce the cache size that may be a better option than throwing an error if you attempt to open the same file in multiple code paths. |
No, this is not possible, since the open option can be different, and the current position in the file will have to be different, and so on (think about ftruncate for example).
I don't know about your project. I understand how much simpler it seems not to care about the file path when calling open. Yet there is a wormhole hiding in this simplicity, and I really doubt SPIFFS is acting correctly in that case (and FatFS certainly does not, even if it seems to work for a dumb test case). In your example it seems to work, but it'll break as soon as the operations happen in different tasks, unless you have a big lock/mutex for all your FS operations; that's not how ESP-IDF is implemented, since it would really hurt performance. This is because the underlying write to flash is not atomic once it touches multiple pages. So you might have a situation where Task X writes "A..B..C" and Task Y reads "A.." or "A..B" or "A..B..C" or the unexpected "A.. ..C", since writing the B page is not finished yet. What happens if one task seeks to B and the other task truncates to A? Honestly, I don't know about your software, but if it relies on atomic operations on the FS, I don't think it'll work correctly on ESP-IDF anyway, whatever the FS.
these are good points. Alright I think my action plan on this (for now) is to:
The issue of atomicity is not a problem in the code I'm using, and it is working correctly on SPIFFS and FatFS (SD) as well as on other platforms (mainly Linux). I'll take a look at PR 21; as long as it allows the file to be opened multiple times it should be fine. It can be on the consumer code to ensure atomicity is preserved across tasks if/when needed.
Hello, I left some comments in the littlefs thread (littlefs-project/littlefs#483 (comment)). TLDR: littlefs files are transactional in the sense that each open file gets an independent snapshot of the file's contents. What is the use case for opening the same file multiple times? Is it primarily a usability thing to keep track of multiple file positions?
@geky it's a small bit of a bug in a library I'm using, but it works on all FS types except LittleFS from what I've tested so far. It would seem there should be a way to flush the caches even across multiple file handles, so that one handle being written to can be read by another after a flush operation.
So it's hard to determine what the best path forward here is. There are multiple possible solutions, but they're all sort of hacky/weird.
Wait no, I misread that. I wonder if this is actually an undefined area for FatFS, and you risk data corruption if the two file descriptors' caches overlap each other.
It's possible when using flash for storage, but a bit unlikely on SD storage.
Hello @BrianPugh, I got an error compiling your multi-fd branch:
what a pity that's not documented...
Thanks
So I abandoned that effort; it was too big of a hack that didn't actually solve the fundamental problem. This littlefs driver is threadsafe, but you cannot safely write and read from the same file using multiple file descriptors.
ok, so if I understand correctly, the master branch is threadsafe when you are accessing multiple different files, but it's not safe when you are accessing the same file from 2 different cores at the same time.
Hello @BrianPugh, I was reading the esp_littlefs code for several hours; I am looking for some solution to avoid crashes when writing and reading the same file simultaneously. Thanks
It's an upstream littlefs limitation. See this upstream PR draft:
I should just delete that branch; it was an attempt at a workaround, but it ended up just making a mess and not really solving the underlying problem.
Meanwhile, in this specific usage case, you can also do this:
This requires storing the file's path (so no
This scheme will obviously not work if you use
@X-Ryl669 at first look your proposal sounds reasonable, bearing in mind that MCUs will never have too many files simultaneously open for reading and writing.
@johnnytolengo can you use a circular buffer in memory instead? Or possibly cycle through a few log files (log_0.txt, log_1.txt, etc). For a lot of filesystems (I'm pretty sure) the action you are describing is undefined.
Append-only mode is where littlefs performs less efficiently. In your situation, I would either:
You could have a single file descriptor too (and still use a mutex): when a read is requested, ftell the current write position, fseek to the part to read, read, and seek back to the write position. On ESP32 we have plenty of memory; you can spare 64 kB for the live logs and, as Brian says, swap the live buffer once it's full by writing it all at once.
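A minimal sketch of that single-descriptor trick using plain stdio (the function name is illustrative; in firmware the whole operation would run with the filesystem mutex held):

```c
#include <stdio.h>

/* Read `len` bytes at `offset` through the same descriptor used for
 * appending, then restore the write position. Sketch only: the caller
 * is assumed to hold a mutex around the whole operation. */
long log_read_at(FILE *f, long offset, char *buf, size_t len)
{
    long write_pos = ftell(f);            /* remember the append position */
    if (write_pos < 0) return -1;
    if (fseek(f, offset, SEEK_SET) != 0) return -1;
    size_t n = fread(buf, 1, len, f);     /* fseek also satisfies the
                                             write-to-read transition
                                             stdio requires */
    fseek(f, write_pos, SEEK_SET);        /* seek back before next write */
    return (long)n;
}
```

Because only one descriptor ever touches the file, there is no second cache that can go stale.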
what a pity that littlefs is not file-descriptor safe :) It seems like a simple thing, but it's far more complicated than anything else...
just to be clear, this isn't an issue with the filesystem being threadsafe. The issue is having multiple independent file descriptors open at the same time, pointing at the same file, where at least one of them is writing.
well... that's your point of view; in fact, if you are running on different threads the filesystem is only partially safe, and sadly the chances to corrupt it are very high.
Could you be a bit more precise on how you think a corruption occurs? You can have multiple file descriptors in multiple threads, all doing writes, and everything will be fine so long as they aren't pointing to the same filename.
you are right, just with multiple file descriptors :) but we have to take care in case the same fd is used at the same time.
You can share the same file descriptor across threads and it will work. Once again, the issue is multiple file descriptors pointing to the same file where at least one of them is writing.
mmm... I never thought about that possibility, but I have no clue how to use the same file descriptor in multiple threads.
The code that's creating the tasks should open the descriptor and close it. So you don't care about sharing here, since it's not the task's responsibility to do that, and you'll be safe. You'll have something like:
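The original snippet was not preserved in this thread; a hedged sketch of the pattern described, using POSIX threads (names are illustrative; on ESP-IDF these would be FreeRTOS tasks), could be:

```c
#include <stdio.h>
#include <pthread.h>

/* The spawning code owns the descriptor; worker tasks only borrow it
 * and never open or close the file themselves. */
typedef struct {
    FILE *log;                       /* shared descriptor, owned by spawner */
} worker_ctx_t;

static void *worker(void *arg)
{
    worker_ctx_t *ctx = (worker_ctx_t *)arg;
    (void)ctx;   /* ... use ctx->log here, serialized by a mutex ... */
    return NULL;
}

int run_tasks(const char *path)
{
    worker_ctx_t ctx = { .log = fopen(path, "a+") };
    if (!ctx.log) return -1;

    pthread_t t1, t2;
    pthread_create(&t1, NULL, worker, &ctx);
    pthread_create(&t2, NULL, worker, &ctx);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);

    fclose(ctx.log);                 /* only the owner closes, after joins */
    return 0;
}
```

The key point is ownership: since exactly one descriptor exists and the tasks never open the file themselves, the multiple-descriptor problem never arises.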
Or you can use a mutex for the file operations:
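This snippet was also lost in the thread; a sketch of the mutex variant (illustrative names), where a lock serializes every operation on the shared descriptor:

```c
#include <stdio.h>
#include <pthread.h>

static FILE           *g_log;        /* one descriptor shared by all tasks */
static pthread_mutex_t g_log_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Append a line; the lock guarantees no interleaved writes and that the
 * data is flushed before another task touches the descriptor. */
void log_append(const char *line)
{
    pthread_mutex_lock(&g_log_mutex);
    fputs(line, g_log);
    fflush(g_log);
    pthread_mutex_unlock(&g_log_mutex);
}
```

Any read helper would take the same mutex before seeking or reading, so the descriptor's position is never observed mid-operation.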
In that case, the mutex protects the shared descriptor and its file position.
impressive! thank you for such an example |
I am having the same issue with the log use case: one task writes to the log and another should be able to read and present it. This means the log file is opened for append in one task and opened for read in the other task to read its content. I cannot close the writer, and I need the reader to be able to read.
There's a bit of necromancy going on here. To sum up what's written above:
In the "write to log and read it" case that seems to be the concern here, the easiest solution is to protect the log access with a mutex so that only a write or a read can happen at any one time. In the read thread: take the mutex, open the file, read it, close it, release the mutex. In the write thread: take the mutex while writing and syncing, then release it.

Another solution that works a bit better is to delegate the write and read operations to a single thread and post read or write events to its main event loop. This thread reacts to a read event by rewinding the read position by the buffer size to capture, reading it, and posting the buffer to the task that requested the read. A write event fetches the current log buffer and writes it to the file. Only a single file descriptor is required in that case.
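The first scheme above can be sketched compactly with stdio and pthreads (function names are illustrative): the writer keeps its descriptor open and syncs under the lock, while the reader opens and closes per read so it always fetches fresh metadata.

```c
#include <stdio.h>
#include <pthread.h>

static pthread_mutex_t log_lock = PTHREAD_MUTEX_INITIALIZER;

/* Writer side: write and sync while holding the lock. */
void log_write(FILE *wf, const char *line)
{
    pthread_mutex_lock(&log_lock);
    fputs(line, wf);
    fflush(wf);                        /* sync before releasing the lock */
    pthread_mutex_unlock(&log_lock);
}

/* Reader side: open fresh under the lock so the metadata is current,
 * read, and close again before releasing the lock. */
long log_read_all(const char *path, char *buf, size_t len)
{
    pthread_mutex_lock(&log_lock);
    FILE *rf = fopen(path, "r");
    long n = rf ? (long)fread(buf, 1, len, rf) : -1;
    if (rf) fclose(rf);
    pthread_mutex_unlock(&log_lock);
    return n;
}
```

Because the reader never holds a descriptor across a writer's sync, its view can never go stale.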
Thank you for explaining this. Yes, special custom workarounds may be applied. The "atomic" guarantee is the unexpected behavior in a filesystem-based setup; these limitations (or differences from a standard filesystem) need to be communicated so that people understand they need to create these workarounds. For example, now I understand that a log file must be periodically closed in order for it to survive power loss, as otherwise the entire log will be lost.
@alonbl A possible (simple) solution is to call
Hey guys, thanks for posting workarounds here. However, if you want this to be properly solved, maybe voice your thoughts to geky at littlefs-project/littlefs#513
Hi @atanisoft, I do not see flush(fd) available.
I also expect that if this works, then the file content is committed and should also be visible to other
Have you checked your solution and found it working? Can you please share a complete example?
Thanks,
You need to close the reader's fd (if it's in its own thread) while you hold the mutex, because the metadata it refers to will be wrong as soon as the write fd modifies the file. Once you've closed it, the only way to read is to reopen, which forces fetching the updated metadata. On the write thread, you don't have to close it; you can sync it (with fflush), but this must be done with the mutex taken (so the sequence is: hold the mutex, write data, sync, release the mutex).
Yes, I've used it and it does work. I use the VFS approach to
In my original use case there were two file handles being created for the same underlying file; this was a bug in the library I was using, as it should be using a single file handle. There were different behaviors observed between the two file handles, much like @X-Ryl669 indicates above. After fixing the library to uniformly use a single file handle and implementing flushing to storage, the issues were largely gone.

The critical piece that needs to be handled is ensuring you have locking to prevent concurrent read/write operations if the latest written data is interesting to the reader, and either reopening the read file handle or using a single file handle across both read and write (which increases the need for locking).
I've encountered an odd case where the caching in LittleFS seems to be causing an issue that I'm not seeing on SD or SPIFFS.
Using a code pattern similar to the one below, the second file handle will not see the data written through the first file handle:
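The original snippet from this report was not preserved; a hedged reconstruction of the pattern being described (the path and function name are illustrative) is:

```c
#include <stdio.h>

/* Open a second, independent handle on a file that a first handle has
 * written and flushed but not yet closed, and report whether the second
 * handle observes the data. Returns 1 if seen, 0 if stale, -1 on error.
 * Per this report, POSIX/SD/SPIFFS return 1, while on littlefs the
 * second handle can miss the flushed write. */
int second_handle_sees_write(const char *path)
{
    FILE *wf = fopen(path, "w");
    if (!wf) return -1;
    fputs("updated", wf);
    fflush(wf);                           /* flushed, but wf stays open */

    FILE *rf = fopen(path, "r");          /* second, independent handle */
    if (!rf) { fclose(wf); return -1; }
    char buf[16] = {0};
    size_t n = fread(buf, 1, sizeof buf - 1, rf);

    fclose(rf);
    fclose(wf);
    return (n == 7 && buf[0] == 'u' && buf[6] == 'd') ? 1 : 0;
}
```

This matches the discussion earlier in the thread: each littlefs open file gets an independent snapshot, so the read handle opened before a close does not see the other handle's in-flight changes.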
Using two handles for the same file doesn't make a ton of sense in most cases, especially within the same function, but this pattern works on other FS types.
Any ideas or suggestions on config tweaks to reduce the chances of the two file handles being out of sync?