fishhook conflict with address sanitizer #47

Open
arronzhujf opened this issue Oct 25, 2017 · 5 comments
@arronzhujf

arronzhujf commented Oct 25, 2017

I hook socket-related C functions in my project, such as getaddrinfo, connect, and socket. When I enable Address Sanitizer for debugging, I get an error like this:
[image]
It seems that both Address Sanitizer and fishhook hook getaddrinfo(), which results in a conflict.

According to the following WWDC video, the sanitizer hooks standard C library functions.
reference: https://developer.apple.com/videos/play/wwdc2015/413/

@dlow-yahoo-inc

I ran into the same problem. After some poking around, I think I understand what's causing the infinite recursion:

When building an app with "Thread Sanitizer" or "Address Sanitizer" enabled, the toolchain links in a special dylib named libclang_rt.tsan_iossim_dynamic.dylib (tsan -> Thread Sanitizer) or libclang_rt.asan_iossim_dynamic.dylib (asan -> Address Sanitizer). These libraries contain wrapper functions corresponding to system APIs (e.g. wrap_getaddrinfo() for getaddrinfo()).

These special dylibs are "interposed" by the dynamic linker on top of the system libraries before the app's symbols are resolved. The pseudocode for wrap_getaddrinfo() looks like:

int wrap_getaddrinfo(const char *node, const char *service,
                       const struct addrinfo *hints,
                       struct addrinfo **res)
{
  // Some instrumentation

  // Forward to the original function; the linker is smart enough to bind this symbol to the next dylib
  return getaddrinfo(node, service, hints, res);
}

When fishhook does its rebinding, it does a massive search and replace of all references to getaddrinfo() with MAM_getaddrinfo(), including the one in the special sanitizer dylib. The wrapper then forwards to the hook, the hook forwards back to the wrapper, and this leads to the infinite recursion.

Whatever the solution, it will involve breaking this cycle.
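
To make the cycle concrete, here is a minimal simulation in plain C. It is only a sketch of the mechanism described above, not fishhook or sanitizer code: the "slots" stand in for each image's symbol pointer, and every name in it (app_slot, sanitizer_slot, saved_original, simulate) is hypothetical. A depth guard is added so the demo terminates instead of overflowing the stack.

```c
#include <stdio.h>

typedef int (*getaddrinfo_fn)(void);

/* Stand-in for the real libc getaddrinfo(). */
static int real_getaddrinfo(void) { return 0; }

/* Simulated symbol-pointer slots, one per image that imports getaddrinfo. */
static getaddrinfo_fn app_slot;        /* the app's import */
static getaddrinfo_fn sanitizer_slot;  /* the sanitizer dylib's import */
static getaddrinfo_fn saved_original;  /* what the hook calls as "the original" */

static int depth;                      /* guard so the demo terminates */
#define MAX_DEPTH 8

/* Sanitizer wrapper: instrument, then forward through its own import slot. */
static int wrap_getaddrinfo(void) { return sanitizer_slot(); }

/* App hook (MAM_getaddrinfo in the thread): forward to the saved pointer. */
static int hook_getaddrinfo(void) {
    if (++depth > MAX_DEPTH) return -1;  /* cycle detected */
    return saved_original();
}

/* Returns 0 if the call reaches the real function, -1 if it cycles. */
int simulate(int rebind_sanitizer_image) {
    depth = 0;
    /* Interposition: the app resolves to the wrapper, the wrapper to libc. */
    app_slot = wrap_getaddrinfo;
    sanitizer_slot = real_getaddrinfo;
    /* Rebinding the app image: remember the old pointer, install the hook. */
    saved_original = app_slot;
    app_slot = hook_getaddrinfo;
    /* Rebinding the sanitizer image too is what closes the loop. */
    if (rebind_sanitizer_image) sanitizer_slot = hook_getaddrinfo;
    return app_slot();
}
```

In this model, simulate(0) reaches real_getaddrinfo() through hook -> wrapper -> libc, while simulate(1) bounces between the hook and the wrapper until the guard trips. Leaving the sanitizer image's slot untouched is enough to break the cycle.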

@dlow-yahoo-inc

A dirty hack that seems to work is:

static void _rebind_symbols_for_image(const struct mach_header *header,
                                      intptr_t slide) {
    uint32_t c = _dyld_image_count();
    for (uint32_t i = 0; i < c; i++) {
        // HACK: Get file name of the mach header
        if (_dyld_get_image_header(i) == header) {
            const char *path = _dyld_get_image_name(i);
            const char *base = basename((char *)path);

            // Only rebind libraries that are not the special generated sanitizer ones
            if (strcmp(base, "libclang_rt.tsan_iossim_dynamic.dylib") &&
                strcmp(base, "libclang_rt.asan_iossim_dynamic.dylib"))
            {
                _rebind_symbols_for_image(_dyld_get_image_header(i), _dyld_get_image_vmaddr_slide(i));
            }
        }
    }
}
int rebind_symbols(struct rebinding rebindings[], size_t rebindings_nel) {
  int retval = prepend_rebindings(&_rebindings_head, rebindings, rebindings_nel);
  if (retval < 0) {
    return retval;
  }
  // If this was the first call, register callback for image additions (which is also invoked for
  // existing images); otherwise, just run on existing images
  if (!_rebindings_head->next) {
    _dyld_register_func_for_add_image(_rebind_symbols_for_image);
  } else {
    uint32_t c = _dyld_image_count();
    for (uint32_t i = 0; i < c; i++) {
      const char *path = _dyld_get_image_name(i);
      const char *base = basename((char *)path);

      // Only rebind libraries that are not the special generated sanitizer ones
      if (strcmp(base, "libclang_rt.tsan_iossim_dynamic.dylib") &&
          strcmp(base, "libclang_rt.asan_iossim_dynamic.dylib"))
      {
          _rebind_symbols_for_image(_dyld_get_image_header(i), _dyld_get_image_vmaddr_slide(i));
      }
    }
  }
  return retval;
}

A better fix would be to identify these special sanitizer dylibs more reliably instead of hardcoding the file names.
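
One possible direction, sketched below, is to match on a name prefix rather than on full file names, so that variants for other platforms are also skipped. This is an assumption rather than a verified rule: it relies on the sanitizer runtimes keeping the libclang_rt.asan_ / libclang_rt.tsan_ / libclang_rt.ubsan_ naming seen above, and is_sanitizer_runtime() and path_basename() are hypothetical helpers, not part of fishhook. It uses strrchr() instead of basename(), which avoids casting away const and needs no extra header beyond <string.h>.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Return a pointer to the last path component without modifying the input. */
static const char *path_basename(const char *path) {
    const char *slash = strrchr(path, '/');
    return slash ? slash + 1 : path;
}

/* True if the image path looks like a clang sanitizer runtime dylib.
 * Assumption: the runtimes are named libclang_rt.<asan|tsan|ubsan>_*. */
bool is_sanitizer_runtime(const char *path) {
    static const char *const prefixes[] = {
        "libclang_rt.asan_",
        "libclang_rt.tsan_",
        "libclang_rt.ubsan_",
    };
    if (path == NULL) return false;

    const char *base = path_basename(path);
    for (size_t i = 0; i < sizeof prefixes / sizeof prefixes[0]; i++) {
        if (strncmp(base, prefixes[i], strlen(prefixes[i])) == 0) {
            return true;
        }
    }
    return false;
}
```

In the hack above, the two strcmp() calls could then be replaced by a single !is_sanitizer_runtime(path) check on the string returned by _dyld_get_image_name(i).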

@tirodkar
Contributor

@grp Is there any plan for adding any of the suggested fixes? It would be incredibly advantageous for users trying to debug with sanitizers and fishhook.

@tirodkar
Contributor

tirodkar commented Nov 10, 2020

The hack posted above by @dlow-yahoo-inc causes a compilation issue with basename. Do you have a branch?

@LeoNatan

LeoNatan commented Feb 9, 2021

@tirodkar For me, the following change worked:

Add

#include <libgen.h>

at the top of the file, then make the following change:

static void _rebind_symbols_for_image(const struct mach_header *header, intptr_t slide) {
	uint32_t c = _dyld_image_count();
	for (uint32_t i = 0; i < c; i++) {
		// HACK: Get file name of the mach header
		if (_dyld_get_image_header(i) == header) {
			const char *path = _dyld_get_image_name(i);
			const char *base = basename((char *)path);
			
			// Only rebind libraries that are not the special generated sanitizer ones
			if (strcmp(base, "libclang_rt.tsan_iossim_dynamic.dylib") &&
				strcmp(base, "libclang_rt.asan_iossim_dynamic.dylib"))
			{
				rebind_symbols_for_image(_rebindings_head, _dyld_get_image_header(i), _dyld_get_image_vmaddr_slide(i));
			}
			
			return;
		}
	}
}

(There was a bug in @dlow-yahoo-inc's code above: _rebind_symbols_for_image() called itself instead of rebind_symbols_for_image(). It was easy to fix.)
