File disappears after going to $INCOMING_DIR #410
Comments
@danfruehauf if you're bored in NZ ! |
@lbesnard Would you expect that file to be published? I.e. does it pass the checker and does the dest_path get correctly determined? Do you get the same behaviour if you do this in the PO box? It's not surprising that the file isn't in $INCOMING_DIR because it is moved to a temp directory before the incoming_handler is even called (see https://github.com/aodn/chef/blob/master/cookbooks/imos_po/templates/default/watch-exec-wrapper.sh.erb). As you can see the error in the log is referring to the file in |
Yep, and it works fine on nsp14.
Yep, that's the whole point. Where does this file go then? |
I think the problem is running it as I just tried running your first example above and it worked fine:
|
Hmm, that's bizarre. How about the crons? Because they all run as |
Well I think we (I) definitely need support if it works with a normal user but not PO |
Also another similar bug, but not as problematic maybe. Happens on the PO box.
but the file |
I won't mince my words: it's just disastrous and happens too easily. Any small issue in an incoming handler would result in the same thing. This bug should really be treated as a priority; we cannot guarantee we're not losing any data. |
Could someone point me at the documentation? |
Just saying, but this looks to me like a very important bug. It wouldn't be a bad idea to spend a little time on this just to confirm it is real, as we have no idea how long this 'potential bug' has been around, or how many files have already completely disappeared from the FS. |
Yes @lbesnard, I'm on it as of this arvo. Replicating and checking out the chef code atm |
👍 @anguss00 ! |
Let's hustle this bug :-) |
in process.7.log.gz
tried to reprocess same file as PO user
|
OK, we have a suspicion that the size of /tmp is not sufficient at just over 8GB. If this fills up, we can't add any more files, and it's adios muchachos. I suggest increasing the max size of this directory. Investigating system logs further. |
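One quick way to test the /tmp suspicion is to compare the free space there with the size of the incoming file before anything is moved. This is only a minimal sketch, not the code in the chef wrapper; `has_room_for` is a hypothetical helper name, and it assumes a POSIX `df`.

```shell
#!/bin/bash
# Hypothetical helper: succeed only if the filesystem holding DIR
# has more free space than FILE needs.
# Usage: has_room_for FILE DIR
has_room_for() {
    local file=$1 dir=$2
    local need_kb free_kb
    # Size of the file in KB, rounded up.
    need_kb=$(( ($(wc -c < "$file") + 1023) / 1024 ))
    # Available KB on the filesystem holding DIR (-P gives the portable
    # one-line-per-filesystem format, -k reports 1024-byte blocks).
    free_kb=$(df -Pk "$dir" | awk 'NR==2 {print $4}')
    [ "$free_kb" -gt "$need_kb" ]
}
```

Something like `has_room_for "$file" /tmp || echo "not enough room in /tmp for $file" >&2` would at least leave a trace in the log if this turns out to be the cause.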
Are you sure |
List of recent not valid files (some are correctly identified).
|
👍 @julian1 glad to see the problem is actually really serious |
@lbesnard stop working! |
Is that the real @atkinsn on github? |
@julian1 an imposter.. |
0758315 was only a possible fix |
Some files completely disappear from the system after going to $INCOMING_DIR
How to reproduce
on aws10, connect as PO user, and do a
source ./env
at the base of the data-services dir (it would be good to add this automatically via chef, btw).
OK, why not, the file is 'not valid'. Well, the original one is. But it looks like there is an issue with the tmp file here: https://github.com/aodn/data-services/blob/master/lib/common/util.sh#L165. The definition of 'valid' is a bash check for whether the file exists or not, so this is really a low-level check.
The issue is that we should still be able to locate the 'bad' file, but it is neither in $ERROR_DIR nor in $INCOMING_DIR. Anyway, this is EXTREMELY concerning. Of course, Nagios won't warn us about anything because there are no files in $ERROR_DIR. So we have potentially lost many files.
Both these folders are empty, as well as the thredds folder where the file should go to.
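What I would expect instead is that a rejected file always ends up somewhere we (and Nagios) can find it. A minimal sketch of that idea follows. It is not the existing data-services or chef code: `safe_handle` and the handler argument are hypothetical names, and it assumes `$ERROR_DIR` is set and that the handler receives the path of the temp copy.

```shell
#!/bin/bash
# Hypothetical wrapper: run HANDLER on FILE. If the handler fails and the
# file still exists, move it to $ERROR_DIR instead of letting it vanish.
# Usage: safe_handle HANDLER FILE
safe_handle() {
    local handler=$1 file=$2
    if "$handler" "$file"; then
        return 0
    fi
    if [ -f "$file" ]; then
        # Keep the rejected file where monitoring can see it.
        mkdir -p "$ERROR_DIR" && mv -- "$file" "$ERROR_DIR/"
    else
        # Nothing left to rescue; at least say so loudly.
        echo "ERROR: $file vanished while being handled" >&2
    fi
    return 1
}
```

With something like this, any failure in an incoming handler leaves either a file in $ERROR_DIR or an explicit error line, rather than nothing at all.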
I wish I was wrong, I've tried and checked many times.
@smancini @jonescc @pblain