wip for #196 but also fixing #198 #201

Open
wants to merge 11 commits into base: development

petersilva

close #198

I have been working on the hpc-mirroring, finding edge cases that don't work. One such case is events that reflect directory removals. I noticed that, as per #198, the events produced by the C code for rmdir (or rm -rf) were not being produced with the correct fileOp field. They had:

  • "fileOp" : { "remove":"" },

when they should have:

  • "fileOp" : { "remove":"", "directory":"" }

On the Python side, you need both attributes when trying to mirror an rmdir.

So that was one problem. Another problem was that when doing something like "rm -rf", some of the remove events are for directories. The way to create a file removal event with the current API is to provide a NULL pointer to the file status struct. The file status struct normally provides the file type (file, directory, link, etc...). Since a removal event doesn't carry this information, the API needs to change to allow communicating that the file being removed is a directory. So I added rmflags to a whole slew of routines (two or three levels deep).


   const int rmflags 
   * 0 -- sb MUST not be null, this is not a remove event.
   * 1 -- sb==NULL this is a file (non-directory) removal.
   * 2 -- sb==NULL this is a directory removal.
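As a rough illustration of the rmflags contract above, here is a minimal, self-contained sketch. The function name `fileop_for_event`, the `RM_*` constants, and the idea of formatting the field into a caller-supplied buffer are all hypothetical and are not taken from the patches in this PR; only the three flag values and the two fileOp strings come from the description.

```c
#include <stddef.h>
#include <stdio.h>
#include <sys/stat.h>

/* Hypothetical names for the three rmflags values listed above. */
enum { RM_NONE = 0, RM_FILE = 1, RM_DIR = 2 };

/*
 * Hypothetical helper: write the fileOp fragment implied by (sb, rmflags)
 * into buf. Returns 0 on success, -1 if the sb/rmflags combination breaks
 * the contract or buf is too small.
 */
int fileop_for_event(const struct stat *sb, int rmflags, char *buf, size_t len)
{
    const char *text;

    if (buf == NULL || len == 0)
        return -1;

    switch (rmflags) {
    case RM_NONE:               /* not a remove event: sb is required */
        if (sb == NULL)
            return -1;
        text = "";              /* no remove fileOp to emit */
        break;
    case RM_FILE:               /* file (non-directory) removal: sb must be NULL */
        if (sb != NULL)
            return -1;
        text = "\"fileOp\" : { \"remove\":\"\" }";
        break;
    case RM_DIR:                /* directory removal: sb must be NULL */
        if (sb != NULL)
            return -1;
        text = "\"fileOp\" : { \"remove\":\"\", \"directory\":\"\" }";
        break;
    default:
        return -1;
    }

    int n = snprintf(buf, len, "%s", text);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

The point of passing the flag explicitly is that the caller, which knows whether it intercepted an rmdir or an unlink, states the type directly instead of relying on a stat of something that no longer exists.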

So that's the logic for #198. Unfortunately, I was working on that in a branch that was originally about #196. I tried cherry-picking the above to make a clean PR for just #198, but it turned out these patches depend on the earlier ones, so I needed to PR the whole thing.

The rest of these patches are cleanups that try to make the code cleaner, while testing thoughts about #196. None of them fix it.

My guess as to the real problem for #196: when we add a process cleanup routine, it gets added after posix thread cleanup has occurred. When we try to close a file, it calls posix thread code where those data structures have already been cleaned up.

So what are these patches?

  • 5fc86ae -- segfaults were happening when calling hash functions, so I got rid of some cases where calling those routines is useless. Since we are no longer calling the routines there, this could change results, but when I ran the flow tests it seemed fine.
  • 1719b39 -- reference through an uninitialized pointer... looked wrong.
  • 3477381 -- SSL hash initialization routines: we were using a very old API (and it was crashing in it), so I changed it to use a newer API version. It didn't change anything though.
  • 688edcb -- SR_SHIMDEBUG log outputs weren't working because any time the variable was set, it would suppress messages (a test was inverted).
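To make the inverted-test bug in 688edcb concrete, here is a small sketch of the intended behaviour. It assumes that debug output is wanted whenever SR_SHIMDEBUG is set to anything at all; the helper names `shim_debug_wanted` and `shim_debug_log` are hypothetical and do not come from the patch.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: debug output is wanted exactly when the variable is set. */
int shim_debug_wanted(const char *value)
{
    return value != NULL;
}

/* Hypothetical logger: prints to stderr only when SR_SHIMDEBUG is set. */
void shim_debug_log(const char *msg)
{
    /* The bug described above amounts to negating this condition,
     * which silences the messages precisely when they were requested. */
    if (shim_debug_wanted(getenv("SR_SHIMDEBUG")))
        fprintf(stderr, "SR_SHIMDEBUG %s\n", msg);
}
```

Keeping the decision in a tiny function that takes the value as an argument makes the condition easy to check without touching the process environment.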

adding skip of identity generation for hard links.
adding skip of identity generation for non-files.
still crashes but _create/ replaced by _new & _free (was not cleaning up before.) It still crashes in init_ex for file removal... odd.
there was a global variable rmdir_in_progress, but it wasn't catching all events. Decided on an API change instead.
now the sr_post related event processing requires an rmflags parameter.
@petersilva

This branch was started about a month ago, so it had conflicts with other patches merged since then. I merged development into it to get rid of the conflicts.

Successfully merging this pull request may close these issues.

rm -rf does not tag directories during removal...