diff --git a/afl-fuzz.c b/afl-fuzz.c
index 0feb000f..4a863020 100644
--- a/afl-fuzz.c
+++ b/afl-fuzz.c
@@ -7961,7 +7961,7 @@ int main(int argc, char** argv) {
 stop_fuzzing:
 
   SAYF(CURSOR_SHOW cLRD "\n\n+++ Testing aborted %s +++\n" cRST,
-       stop_soon == 2 ? "programatically" : "by user");
+       stop_soon == 2 ? "programmatically" : "by user");
 
   /* Running for more than 30 minutes but still doing first cycle? */
 
diff --git a/config.h b/config.h
index 39313450..c24627ff 100644
--- a/config.h
+++ b/config.h
@@ -21,7 +21,7 @@
 
 /* Version string: */
 
-#define VERSION             "2.28b"
+#define VERSION             "2.29b"
 
 /******************************************************
  *                                                    *
diff --git a/docs/ChangeLog b/docs/ChangeLog
index 110c864b..f38b19a8 100644
--- a/docs/ChangeLog
+++ b/docs/ChangeLog
@@ -16,6 +16,14 @@ Not sure if you should upgrade? The lowest currently recommended version
 is 2.23b. If you're stuck on an earlier release, it's strongly advisable
 to get on with the times.
 
+--------------
+Version 2.29b:
+--------------
+
+  - Made a minor #include fix to llvm_mode. Suggested by Jonathan Metzman.
+
+  - Made cosmetic updates to the docs.
+
 --------------
 Version 2.28b:
 --------------
diff --git a/docs/QuickStartGuide.txt b/docs/QuickStartGuide.txt
index 61541466..abe7032f 100644
--- a/docs/QuickStartGuide.txt
+++ b/docs/QuickStartGuide.txt
@@ -27,7 +27,7 @@ how to hit the ground running:
 
 4) Get a small but valid input file that makes sense to the program. When
    fuzzing verbose syntax (SQL, HTTP, etc), create a dictionary as described in
-   testcases/README.testcases, too.
+   dictionaries/README.dictionaries, too.
 
 5) If the program reads from stdin, run 'afl-fuzz' like so:
diff --git a/docs/life_pro_tips.txt b/docs/life_pro_tips.txt
index e97d32dc..df053a8e 100644
--- a/docs/life_pro_tips.txt
+++ b/docs/life_pro_tips.txt
@@ -33,7 +33,7 @@ Run the bundled afl-plot utility to generate browser-friendly graphs.
 
 %
 
-Need to monitor AFL jobs programatically? Check out the fuzzer_stats file
+Need to monitor AFL jobs programmatically? Check out the fuzzer_stats file
 in the AFL output dir or try afl-whatsup.
 
 %
@@ -62,11 +62,6 @@ Try the bundled afl-tmin tool - and get small repro files fast!
 
 %
 
-Need to fix a checksum? It's easy to do with an output postprocessor!
-See experimental/post_library to learn more.
-
-%
-
 Not sure if a crash is exploitable? AFL can help you figure it out. Specify
 -C to enable the peruvian were-rabbit mode. See section #10 in README for
 more.
@@ -122,7 +117,7 @@ You can find a simple solution in experimental/argv_fuzzing.
 
 %
 
-Attacking a format that uses checksums? Remove the checksum code or
+Attacking a format that uses checksums? Remove the checksum-checking code or
 use a postprocessor! See experimental/post_library/ for more.
 
 %
diff --git a/docs/technical_details.txt b/docs/technical_details.txt
index af037355..3ec48741 100644
--- a/docs/technical_details.txt
+++ b/docs/technical_details.txt
@@ -85,8 +85,8 @@ of dword- or qword-wide instructions and a simple loop.
 When a mutated input produces an execution trace containing new tuples, the
 corresponding input file is preserved and routed for additional processing
 later on (see section #3). Inputs that do not trigger new local-scale state
-transitions in the execution trace are discarded, even if their overall
-instrumentation output pattern is unique.
+transitions in the execution trace (i.e., produce no new tuples) are discarded,
+even if their overall control flow sequence is unique.
 
 This approach allows for a very fine-grained and long-term exploration of
 program state while not having to perform any computationally intensive and
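The tuple-novelty test referenced in the hunk above is cheap enough to run on
every execution. The following is a minimal sketch of the idea, not the actual
afl-fuzz implementation; the trace_bits / virgin_bits names and the MAP_SIZE
constant merely follow the conventions of the AFL source, and the real code
additionally tracks coarse hit-count buckets per tuple:

  #include <stddef.h>
  #include <stdint.h>

  #define MAP_SIZE (1 << 16)   /* Shared-memory trace bitmap size. */

  /* Returns nonzero if the current trace exercises tuples not seen in any
     earlier run. virgin_bits starts out as all-ones and has bits cleared
     as behaviors are observed. */
  static int has_new_coverage(const uint8_t* trace_bits, uint8_t* virgin_bits) {

    int ret = 0;
    size_t i;

    for (i = 0; i < MAP_SIZE; i++)
      if (trace_bits[i] & virgin_bits[i]) {
        virgin_bits[i] &= ~trace_bits[i];   /* Mark these bits as seen. */
        ret = 1;
      }

    return ret;

  }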
@@ -101,7 +101,7 @@ new tuples (CA, AE):
 
   #2: A -> B -> C -> A -> E
 
 At the same time, with #2 processed, the following pattern will not be seen
-as unique, despite having a markedly different execution path:
+as unique, despite having a markedly different overall execution path:
 
   #3: A -> B -> C -> A -> B -> C -> A -> B -> C -> D -> E
 
@@ -142,9 +142,9 @@ Mutated test cases that produced new state transitions within the program are
 added to the input queue and used as a starting point for future rounds of
 fuzzing. They supplement, but do not automatically replace, existing finds.
 
-This approach allows the tool to progressively explore various disjoint and
-possibly mutually incompatible features of the underlying data format, as
-shown in this image:
+In contrast to more greedy genetic algorithms, this approach allows the tool
+to progressively explore various disjoint and possibly mutually incompatible
+features of the underlying data format, as shown in this image:
 
   http://lcamtuf.coredump.cx/afl/afl_gzip.png
 
@@ -201,10 +201,10 @@ the sessions were seeded with a valid unified diff:
 
     Edge coverage     | 1,259      | 1,734       | 1.72        | 0
     AFL model         | 1,452      | 2,040       | 3.16        | 1
 
-Some of the earlier work on evolutionary fuzzing suggested maintaining just a
-single test case and selecting for mutations that improve coverage. At least
-in the tests described above, this "greedy" method appeared to offer no
-substantial benefits over blind fuzzing.
+As noted earlier on, some of the prior work on genetic fuzzing relied on
+maintaining a single test case and evolving it to maximize coverage. At least
+in the tests described above, this "greedy" approach appears to confer no
+substantial benefits over blind fuzzing strategies.
 
 4) Culling the corpus
 ---------------------
@@ -263,9 +263,9 @@ files make the target binary slower, and because they reduce the likelihood
 that a mutation would touch important format control structures, rather than
 redundant data blocks. This is discussed in more detail in perf_tips.txt.
 
-The possibility of a bad starting corpus provided by the user aside, some
-types of mutations can have the effect of iteratively increasing the size of
-the generated files, so it is important to counter this trend.
+The possibility that the user will provide a low-quality starting corpus aside,
+some types of mutations can have the effect of iteratively increasing the size
+of the generated files, so it is important to counter this trend.
 
 Luckily, the instrumentation feedback provides a simple way to automatically
 trim down input files while ensuring that the changes made to the files have no
@@ -275,8 +275,8 @@ The built-in trimmer in afl-fuzz attempts to sequentially remove blocks of data
 with variable length and stepover; any deletion that doesn't affect the checksum
 of the trace map is committed to disk. The trimmer is not designed to be
 particularly thorough; instead, it tries to strike a balance between precision
-and the number of execve() calls spent on the process. The average per-file
-gains are around 5-20%.
+and the number of execve() calls spent on the process, selecting the block size
+and stepover to match. The average per-file gains are around 5-20%.
 
 The standalone afl-tmin tool uses a more exhaustive, iterative algorithm, and
 also attempts to perform alphabet normalization on the trimmed files.
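The trimming loop described in the hunk above is simple to picture. This is a
simplified illustration rather than the real trim_case() logic; the helper
run_without_block() is a hypothetical stand-in for re-executing the target on
the input minus one block, and the 16-to-1024 step progression loosely mirrors
the TRIM_START_STEPS / TRIM_END_STEPS defaults in config.h:

  #include <stddef.h>
  #include <stdint.h>
  #include <string.h>

  /* Hypothetical helper: runs the target on buf with the given block
     removed and returns the checksum of the resulting trace map. */
  uint32_t run_without_block(const uint8_t* buf, size_t len,
                             size_t pos, size_t remove_len);

  /* Simplified trimmer: try progressively smaller blocks, committing any
     deletion that leaves the trace checksum (i.e., behavior) unchanged. */
  static void trim_sketch(uint8_t* buf, size_t* len, uint32_t orig_cksum) {

    size_t steps;

    for (steps = 16; steps <= 1024; steps *= 2) {

      size_t remove_len = *len / steps;
      size_t pos = 0;

      if (!remove_len) break;

      while (pos + remove_len <= *len) {

        if (run_without_block(buf, *len, pos, remove_len) == orig_cksum) {

          /* Same behavior without this block - delete it for good. */
          memmove(buf + pos, buf + pos + remove_len,
                  *len - pos - remove_len);
          *len -= remove_len;

        } else pos += remove_len;   /* Block matters; step over it. */

      }

    }

  }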
@@ -302,11 +302,16 @@ strategies include:
 
   - Sequential insertion of known interesting integers (0, 1, INT_MAX, etc),
 
-The non-deterministic steps include stacked bit flips, insertions, deletions,
-arithmetics, and splicing of different test cases.
+The purpose of opening with deterministic steps is related to their tendency to
+produce compact test cases and small diffs between the non-crashing and crashing
+inputs.
 
-Their relative yields and execve() costs have been investigated and are
-discussed in the aforementioned blog post.
+With deterministic fuzzing out of the way, the non-deterministic steps include
+stacked bit flips, insertions, deletions, arithmetics, and splicing of different
+test cases.
+
+The relative yields and execve() costs of all these strategies have been
+investigated and are discussed in the aforementioned blog post.
 
 For the reasons discussed in historical_notes.txt (chiefly, performance,
 simplicity, and reliability), AFL generally does not try to reason about the
@@ -315,13 +320,14 @@ are nominally blind, and are guided only by the evolutionary design of the
 input queue.
 
 That said, there is one (trivial) exception to this rule: when a new queue
-entry goes through the initial set of deterministic fuzzing steps, and some
-regions in the file are observed to have no effect on the checksum of the
+entry goes through the initial set of deterministic fuzzing steps, and tweaks to
+some regions in the file are observed to have no effect on the checksum of the
 execution path, they may be excluded from the remaining phases of
-deterministic fuzzing - and proceed straight to random tweaks. Especially for
-verbose, human-readable data formats, this can reduce the number of execs by
-10-40% or so without an appreciable drop in coverage. In extreme cases, such
-as normally block-aligned tar archives, the gains can be as high as 90%.
+deterministic fuzzing - and the fuzzer may proceed straight to random tweaks.
+Especially for verbose, human-readable data formats, this can reduce the number
+of execs by 10-40% or so without an appreciable drop in coverage. In extreme
+cases, such as normally block-aligned tar archives, the gains can be as high as
+90%.
 
 Because the underlying "effector maps" are local to every queue entry and remain
 in force only during deterministic stages that do not alter the size or the
@@ -353,6 +359,14 @@ the grammar of highly verbose and complex languages such as JavaScript, SQL,
 or XML; several examples of generated SQL statements are given in the blog
 post mentioned above.
 
+Interestingly, the AFL instrumentation also allows the fuzzer to automatically
+isolate syntax tokens already present in an input file. It can do so by looking
+for runs of bytes that, when flipped, produce a consistent change to the
+program's execution path; this is suggestive of an underlying atomic comparison
+to a predefined value baked into the code. The fuzzer relies on this signal
+to build compact "auto dictionaries" that are then used in conjunction with
+other fuzzing strategies.
+
 8) De-duping crashes
 --------------------
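The effector-map optimization mentioned above also fits in a few lines. In this
hypothetical sketch, run_with_byte_xor() stands in for executing the target
with a single input byte inverted; bytes whose inversion never changes the path
checksum are flagged as non-effectors so that later deterministic stages can
skip them:

  #include <stddef.h>
  #include <stdint.h>

  /* Hypothetical helper: runs the target on a copy of buf with byte i
     XORed with 0xFF, returning the checksum of the execution path. */
  uint32_t run_with_byte_xor(const uint8_t* buf, size_t len, size_t i);

  /* eff_map[i] ends up nonzero only for "effector" bytes - those whose
     modification actually influences the observed execution path. */
  static void build_effector_map(const uint8_t* buf, size_t len,
                                 uint32_t orig_cksum, uint8_t* eff_map) {

    size_t i;

    for (i = 0; i < len; i++)
      eff_map[i] = (run_with_byte_xor(buf, len, i) != orig_cksum);

  }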
@@ -416,22 +430,25 @@ With fast targets, the fork server can offer considerable performance gains,
 usually between 1.5x and 2x. It is also possible to:
 
   - Use the fork server in manual ("deferred") mode, skipping over larger,
-    user-selected chunks of initialization code. With some targets, this can
+    user-selected chunks of initialization code. It requires very modest
+    code changes to the targeted program, and with some targets, can
     produce 10x+ performance gains.
 
   - Enable "persistent" mode, where a single process is used to try out
     multiple inputs, greatly limiting the overhead of repetitive fork()
-    calls. As with the previous mode, this requires custom modifications,
+    calls. This generally requires some code changes to the targeted program,
     but can improve the performance of fast targets by a factor of 5 or more
-    - approximating the benefits of in-process fuzzing jobs.
+    - approximating the benefits of in-process fuzzing jobs while still
+    maintaining very robust isolation between the fuzzer process and the
+    targeted binary.
 
 11) Parallelization
 -------------------
 
 The parallelization mechanism relies on periodically examining the queues
 produced by independently-running instances on other CPU cores or on remote
-machines, and then selectively pulling in the test cases that produce behaviors
-not yet seen by the fuzzer at hand.
+machines, and then selectively pulling in the test cases that, when tried
+out locally, produce behaviors not yet seen by the fuzzer at hand.
 
 This allows for extreme flexibility in fuzzer setup, including running synced
 instances against different parsers of a common data format, often with
diff --git a/llvm_mode/afl-llvm-rt.o.c b/llvm_mode/afl-llvm-rt.o.c
index 62768d6f..5ac8861b 100644
--- a/llvm_mode/afl-llvm-rt.o.c
+++ b/llvm_mode/afl-llvm-rt.o.c
@@ -26,6 +26,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
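To make the persistent mode discussed in the fork-server hunk concrete, here is
a minimal harness of the sort documented in llvm_mode/README.llvm. This is only
a sketch: fuzz_one_input() is a hypothetical stand-in for the library code
under test, and the __AFL_LOOP() macro is assumed to be available when the
harness is compiled with afl-clang-fast:

  #include <stddef.h>
  #include <unistd.h>

  /* Hypothetical entry point into the library being fuzzed. */
  void fuzz_one_input(const unsigned char* data, size_t len);

  int main(void) {

    static unsigned char buf[4096];

    /* Let one process handle up to 1000 inputs before being respawned,
       capping the impact of accumulated leaks or stale global state. */
    while (__AFL_LOOP(1000)) {

      ssize_t len = read(0, buf, sizeof(buf));   /* Fresh input on stdin. */

      if (len > 0) fuzz_one_input(buf, (size_t) len);

      /* Reset any global state touched by the library before looping. */

    }

    return 0;

  }

When afl-fuzz sees a binary built this way, it switches to the persistent
logic automatically; no extra command-line flags should be needed.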