Cannot kill child process #97

Open
yoava333 opened this issue Mar 17, 2018 · 6 comments

Comments

@yoava333
Contributor

While fuzzing on a 24-core machine, the afl-fuzz process aborts every couple of hours with the following message:

[-] PROGRAM ABORT : Cannot kill child process

         Location : destroy_target_process(), C:\work\fuzzing\winafl\afl-fuzz.c:2385

I have a WinDbg window open on the target process, showing the following crash:


Microsoft (R) Windows Debugger Version 10.0.10586.567 X86
Copyright (c) Microsoft Corporation. All rights reserved.

*** wait with pending attach

************* Symbol Path validation summary **************
Response                         Time (ms)     Location
Deferred                                       symsrv*symsrv.dll*C:\WINDOWS\Symbols*http://msdl.microsoft.com/download/symbols
Symbol search path is: symsrv*symsrv.dll*C:\WINDOWS\Symbols*http://msdl.microsoft.com/download/symbols
Executable search path is: 
ModLoad: 013c0000 013c7000   R:\crash_test.exe
ModLoad: 770c0000 7724d000   C:\WINDOWS\SYSTEM32\ntdll.dll
ModLoad: 76cd0000 76da0000   C:\WINDOWS\System32\KERNEL32.DLL
ModLoad: 758f0000 75ac7000   C:\WINDOWS\System32\KERNELBASE.dll
ModLoad: 75b80000 75c97000   C:\WINDOWS\System32\ucrtbase.dll
ModLoad: 73ab0000 73ac5000   C:\WINDOWS\SYSTEM32\VCRUNTIME140.dll
Break-in sent, waiting 30 seconds...
WARNING: Break-in timed out, suspending.
         This is usually caused by another thread holding the loader lock
(1f7c.17c8): Wake debugger - code 80000007 (first chance)
eax=00000000 ebx=00c90000 ecx=00000000 edx=00000000 esi=0055f534 edi=00c90000
eip=73808c66 esp=0055f4d4 ebp=00000002 iopl=0         nv up ei pl nz na pe nc
cs=0023  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00000206
73808c66 8b0c24          mov     ecx,dword ptr [esp]  ss:002b:0055f4d4=73821c69
0:000> kb
 # ChildEBP RetAddr  Args to Child              
WARNING: Frame IP not in any known module. Following frames may be wrong.
00 0055f4d0 73821c69 ffffffff 00c97000 00000000 0x73808c66
01 0055f4f0 73822db8 ffffffff 00c97000 0055f534 0x73821c69
02 0055f50c 7381133d 00c97000 0055f534 0000001c 0x73822db8
03 0055f54c 7380bf25 00c9b000 0055f568 00000001 0x7381133d
04 00000000 00000000 00000000 00000000 00000000 0x7380bf25
0:000> u eip
73808c66 8b0c24          mov     ecx,dword ptr [esp]
73808c69 894c24fc        mov     dword ptr [esp-4],ecx
73808c6d 8d6424fc        lea     esp,[esp-4]
73808c71 c3              ret
73808c72 8da424e8feffff  lea     esp,[esp-118h]
73808c79 6a00            push    0
73808c7b 9c              pushfd
73808c7c 60              pushad

From what I can tell, the debugger is having a hard time attaching ("waiting 30 seconds..."), which suggests the process is holding the loader lock, so the debugger cannot inject its break-in thread into the process. I'm not sure why this is happening.

I'm using DynamoRIO 7.0.17595-0 to fuzz a 32-bit process on Windows 10 1709 (build 16299).

@ivanfratric
Contributor

I've seen this happen before but not with such frequency (it was a matter of days and not hours for me). Possibly it depends on the target, but I don't really know the cause.

@ivanfratric
Contributor

Can you see if it's any better with DynamoRIO 6.2.0-2?

@yoava333
Contributor Author

TL;DR - upgrading to cronbuild-7.0.17605 seems to solve the issue.

I tested DynamoRIO 6.2.0-2: the problem reproduced, along with a second one that is clearly a bug in the instrumentation:

Copyright (c) Microsoft Corporation. All rights reserved.

*** wait with pending attach

************* Path validation summary **************
Response                         Time (ms)     Location
Deferred                                       srv*C:\Symbols*http://msdl.microsoft.com/download/symbols
Symbol search path is: srv*C:\Symbols*http://msdl.microsoft.com/download/symbols
Executable search path is: 
ModLoad: 00c10000 00c17000   R:\crash_test.exe
ModLoad: 77030000 771bd000   C:\WINDOWS\SYSTEM32\ntdll.dll
ModLoad: 75b00000 75bd0000   C:\WINDOWS\System32\KERNEL32.DLL
ModLoad: 74470000 74647000   C:\WINDOWS\System32\KERNELBASE.dll
ModLoad: 74890000 749a7000   C:\WINDOWS\System32\ucrtbase.dll
ModLoad: 6f100000 6f115000   C:\WINDOWS\SYSTEM32\VCRUNTIME140.dll
(4fc.850): Access violation - code c0000005 (first chance)
First chance exceptions are reported before any exception handling.
This exception may be expected and handled.
eax=770da650 ebx=00000000 ecx=00000000 edx=00000000 esi=00000000 edi=00000000
eip=6efdd835 esp=0307f704 ebp=00000000 iopl=0         nv up ei pl nz na po nc
cs=0023  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00010202
6efdd835 54              push    esp
0:003> u eip
6efdd835 54              push    esp
6efdd836 6803000000      push    3
6efdd83b 8da424e8feffff  lea     esp,[esp-118h]
6efdd842 c5fe7f442418    vmovdqu ymmword ptr [esp+18h],ymm0
6efdd848 c5fe7f4c2438    vmovdqu ymmword ptr [esp+38h],ymm1
6efdd84e c5fe7f542458    vmovdqu ymmword ptr [esp+58h],ymm2
6efdd854 c5fe7f5c2478    vmovdqu ymmword ptr [esp+78h],ymm3
6efdd85a c5fe7fa42498000000 vmovdqu ymmword ptr [esp+98h],ymm4
0:003> !address eip

Building memory map: 00000000
Mapping file section regions...
Mapping module regions...
Mapping PEB regions...
Mapping TEB and stack regions...
Mapping heap regions...
Mapping page heap regions...
Mapping other regions...
Mapping stack trace database regions...
Mapping activation context regions...

Usage:                  <unknown>
Base Address:           6efad000
End Address:            6efea000
Region Size:            0003d000 ( 244.000 kB)
State:                  00001000          MEM_COMMIT
Protect:                00000004          PAGE_READWRITE
Type:                   01000000          MEM_IMAGE
Allocation Base:        6eea0000
Allocation Protect:     00000080          PAGE_EXECUTE_WRITECOPY


Content source: 1 (target), length: c7cb

As you can see, the page we jump to (which contains instrumentation code) doesn't have execute permission: its current protection is PAGE_READWRITE.
I tried running WinAFL with the latest DynamoRIO build, cronbuild-7.0.17605, and it seems to solve the problem. There are 695 commits between release_6_2_0 and origin/master; I skimmed the commit messages but couldn't find anything that clearly accounts for fixing both crashes.
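For anyone reading the `!address` output above, here is a minimal sketch of how the `Protect` value can be checked for execute rights. The constant values are the documented Windows memory protection values, redefined locally so the snippet builds without `<windows.h>`. The `PROT_` names and the helper `page_is_executable` are hypothetical and are not part of WinAFL or DynamoRIO.

```c
#include <stdbool.h>
#include <stdint.h>

/* Documented Windows memory protection values, redefined here under
   hypothetical PROT_ names so the sketch compiles on any platform. */
#define PROT_PAGE_READWRITE         0x04u
#define PROT_PAGE_EXECUTE           0x10u
#define PROT_PAGE_EXECUTE_READ      0x20u
#define PROT_PAGE_EXECUTE_READWRITE 0x40u
#define PROT_PAGE_EXECUTE_WRITECOPY 0x80u

/* Returns true if the protection value grants execute access.
   Hypothetical helper, for illustration only. */
static bool page_is_executable(uint32_t protect) {
  const uint32_t exec_mask = PROT_PAGE_EXECUTE | PROT_PAGE_EXECUTE_READ |
                             PROT_PAGE_EXECUTE_READWRITE |
                             PROT_PAGE_EXECUTE_WRITECOPY;
  return (protect & exec_mask) != 0;
}
```

With the values from the dump, the current protection `0x04` (PAGE_READWRITE) is not executable, while the allocation protection `0x80` (PAGE_EXECUTE_WRITECOPY) is, which matches the access violation on instruction fetch.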

@ivanfratric
Contributor

Thanks for the info, that's good to know!

@hatRiot

hatRiot commented Apr 8, 2019

Not sure if related, but I ran into this as well. Turned out get_test_case was missing a FD close:

$ git diff afl-fuzz.c
diff --git a/afl-fuzz.c b/afl-fuzz.c
index 28ec379..7b4e195 100644
--- a/afl-fuzz.c
+++ b/afl-fuzz.c
@@ -2539,6 +2539,9 @@ char *get_test_case(long *fsize)
   char *buf = malloc(*fsize);
   ck_read(fd, buf, *fsize, "input file");

+  if(out_file != NULL)
+    close(fd);
+
   return buf;
 }

This fixed the issue for me.
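To illustrate the general pattern behind the patch, here is a small sketch of a file reader that closes its descriptor on every exit path. It assumes POSIX `open`/`read`/`close` rather than the Windows CRT calls WinAFL uses, and `read_whole_file` is a hypothetical stand-in, not the actual `get_test_case`. A reader that skips the `close` leaks one descriptor per call and will eventually hit the per-process limit in a long fuzzing run.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Reads the whole file at `path` into a malloc'd buffer and stores its
   size in *fsize. Returns NULL on failure. Hypothetical helper: the
   point is that fd is closed on every path, including errors. */
char *read_whole_file(const char *path, long *fsize) {
  int fd = open(path, O_RDONLY);
  if (fd < 0) return NULL;

  struct stat st;
  if (fstat(fd, &st) != 0) {
    close(fd);
    return NULL;
  }

  size_t size = (size_t)st.st_size;
  char *buf = malloc(size ? size : 1); /* avoid malloc(0) ambiguity */
  if (buf == NULL) {
    close(fd);
    return NULL;
  }

  size_t done = 0;
  while (done < size) {
    ssize_t n = read(fd, buf + done, size - done);
    if (n <= 0) { /* read error or unexpected end of file */
      free(buf);
      close(fd);
      return NULL;
    }
    done += (size_t)n;
  }

  close(fd); /* the step that was missing in the leaking version */
  *fsize = (long)size;
  return buf;
}
```

The caller owns the returned buffer and should `free` it once the test case has been processed.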

@ifratric
Collaborator

ifratric commented Apr 9, 2019

Hi @hatRiot, thank you very much for the heads-up. That indeed looks like a bug in get_test_case. I applied your patch.

However, note that get_test_case is only called from process_test_case_into_dll, which is only used with a custom sample-processing DLL. So this can only be the root cause if you are using both a custom DLL (-l flag) and a custom output file (-f flag).
