Skip to content

Commit

Permalink
2.29b
Browse files Browse the repository at this point in the history
  • Loading branch information
thomas-huet committed Aug 14, 2016
1 parent b7a4a5f commit 323560e
Show file tree
Hide file tree
Showing 7 changed files with 61 additions and 40 deletions.
2 changes: 1 addition & 1 deletion afl-fuzz.c
Original file line number Diff line number Diff line change
Expand Up @@ -7961,7 +7961,7 @@ int main(int argc, char** argv) {
stop_fuzzing:

SAYF(CURSOR_SHOW cLRD "\n\n+++ Testing aborted %s +++\n" cRST,
stop_soon == 2 ? "programatically" : "by user");
stop_soon == 2 ? "programmatically" : "by user");

/* Running for more than 30 minutes but still doing first cycle? */

Expand Down
2 changes: 1 addition & 1 deletion config.h
Original file line number Diff line number Diff line change
Expand Up @@ -21,7 +21,7 @@

/* Version string: */

#define VERSION "2.28b"
#define VERSION "2.29b"

/******************************************************
* *
Expand Down
8 changes: 8 additions & 0 deletions docs/ChangeLog
Original file line number Diff line number Diff line change
Expand Up @@ -16,6 +16,14 @@ Not sure if you should upgrade? The lowest currently recommended version
is 2.23b. If you're stuck on an earlier release, it's strongly advisable
to get on with the times.

--------------
Version 2.29b:
--------------

- Made a minor #include fix to llvm_mode. Suggested by Jonathan Metzman.

- Made cosmetic updates to the docs.

--------------
Version 2.28b:
--------------
Expand Down
2 changes: 1 addition & 1 deletion docs/QuickStartGuide.txt
Original file line number Diff line number Diff line change
Expand Up @@ -27,7 +27,7 @@ how to hit the ground running:

4) Get a small but valid input file that makes sense to the program. When
fuzzing verbose syntax (SQL, HTTP, etc), create a dictionary as described in
testcases/README.testcases, too.
dictionaries/README.dictionaries, too.

5) If the program reads from stdin, run 'afl-fuzz' like so:

Expand Down
9 changes: 2 additions & 7 deletions docs/life_pro_tips.txt
Original file line number Diff line number Diff line change
Expand Up @@ -33,7 +33,7 @@ Run the bundled afl-plot utility to generate browser-friendly graphs.

%

Need to monitor AFL jobs programatically? Check out the fuzzer_stats file
Need to monitor AFL jobs programmatically? Check out the fuzzer_stats file
in the AFL output dir or try afl-whatsup.

%
Expand Down Expand Up @@ -62,11 +62,6 @@ Try the bundled afl-tmin tool - and get small repro files fast!

%

Need to fix a checksum? It's easy to do with an output postprocessor!
See experimental/post_library to learn more.

%

Not sure if a crash is exploitable? AFL can help you figure it out. Specify
-C to enable the peruvian were-rabbit mode. See section #10 in README for more.

Expand Down Expand Up @@ -122,7 +117,7 @@ You can find a simple solution in experimental/argv_fuzzing.

%

Attacking a format that uses checksums? Remove the checksum code or
Attacking a format that uses checksums? Remove the checksum-checking code or
use a postprocessor! See experimental/post_library/ for more.

%
77 changes: 47 additions & 30 deletions docs/technical_details.txt
Original file line number Diff line number Diff line change
Expand Up @@ -85,8 +85,8 @@ of dword- or qword-wide instructions and a simple loop.
When a mutated input produces an execution trace containing new tuples, the
corresponding input file is preserved and routed for additional processing
later on (see section #3). Inputs that do not trigger new local-scale state
transitions in the execution trace are discarded, even if their overall
instrumentation output pattern is unique.
transitions in the execution trace (i.e., produce no new tuples) are discarded,
even if their overall control flow sequence is unique.

This approach allows for a very fine-grained and long-term exploration of
program state while not having to perform any computationally intensive and
Expand All @@ -101,7 +101,7 @@ new tuples (CA, AE):
#2: A -> B -> C -> A -> E

At the same time, with #2 processed, the following pattern will not be seen
as unique, despite having a markedly different execution path:
as unique, despite having a markedly different overall execution path:

#3: A -> B -> C -> A -> B -> C -> A -> B -> C -> D -> E

Expand Down Expand Up @@ -142,9 +142,9 @@ Mutated test cases that produced new state transitions within the program are
added to the input queue and used as a starting point for future rounds of
fuzzing. They supplement, but do not automatically replace, existing finds.

This approach allows the tool to progressively explore various disjoint and
possibly mutually incompatible features of the underlying data format, as
shown in this image:
In contrast to more greedy genetic algorithms, this approach allows the tool
to progressively explore various disjoint and possibly mutually incompatible
features of the underlying data format, as shown in this image:

http://lcamtuf.coredump.cx/afl/afl_gzip.png

Expand Down Expand Up @@ -201,10 +201,10 @@ the sessions were seeded with a valid unified diff:
Edge coverage | 1,259 | 1,734 | 1.72 | 0
AFL model | 1,452 | 2,040 | 3.16 | 1

Some of the earlier work on evolutionary fuzzing suggested maintaining just a
single test case and selecting for mutations that improve coverage. At least
in the tests described above, this "greedy" method appeared to offer no
substantial benefits over blind fuzzing.
At noted earlier on, some of the prior work on genetic fuzzing relied on
maintaining a single test case and evolving it to maximize coverage. At least
in the tests described above, this "greedy" approach appears to confer no
substantial benefits over blind fuzzing strategies.

4) Culling the corpus
---------------------
Expand Down Expand Up @@ -263,9 +263,9 @@ files make the target binary slower, and because they reduce the likelihood
that a mutation would touch important format control structures, rather than
redundant data blocks. This is discussed in more detail in perf_tips.txt.

The possibility of a bad starting corpus provided by the user aside, some
types of mutations can have the effect of iteratively increasing the size of
the generated files, so it is important to counter this trend.
The possibility that the user will provide a low-quality starting corpus aside,
some types of mutations can have the effect of iteratively increasing the size
of the generated files, so it is important to counter this trend.

Luckily, the instrumentation feedback provides a simple way to automatically
trim down input files while ensuring that the changes made to the files have no
Expand All @@ -275,8 +275,8 @@ The built-in trimmer in afl-fuzz attempts to sequentially remove blocks of data
with variable length and stepover; any deletion that doesn't affect the checksum
of the trace map is committed to disk. The trimmer is not designed to be
particularly thorough; instead, it tries to strike a balance between precision
and the number of execve() calls spent on the process. The average per-file
gains are around 5-20%.
and the number of execve() calls spent on the process, selecting the block size
and stepover to match. The average per-file gains are around 5-20%.

The standalone afl-tmin tool uses a more exhaustive, iterative algorithm, and
also attempts to perform alphabet normalization on the trimmed files.
Expand All @@ -302,11 +302,16 @@ strategies include:

- Sequential insertion of known interesting integers (0, 1, INT_MAX, etc),

The non-deterministic steps include stacked bit flips, insertions, deletions,
arithmetics, and splicing of different test cases.
The purpose of opening with deterministic steps is related to their tendency to
produce compact test cases and small diffs between the non-crashing and crashing
inputs.

Their relative yields and execve() costs have been investigated and are
discussed in the aforementioned blog post.
With deterministic fuzzing out of the way, the non-deterministic steps include
stacked bit flips, insertions, deletions, arithmetics, and splicing of different
test cases.

The relative yields and execve() costs of all these strategies have been
investigated and are discussed in the aforementioned blog post.

For the reasons discussed in historical_notes.txt (chiefly, performance,
simplicity, and reliability), AFL generally does not try to reason about the
Expand All @@ -315,13 +320,14 @@ are nominally blind, and are guided only by the evolutionary design of the
input queue.

That said, there is one (trivial) exception to this rule: when a new queue
entry goes through the initial set of deterministic fuzzing steps, and some
regions in the file are observed to have no effect on the checksum of the
entry goes through the initial set of deterministic fuzzing steps, and tweaks to
some regions in the file are observed to have no effect on the checksum of the
execution path, they may be excluded from the remaining phases of
deterministic fuzzing - and proceed straight to random tweaks. Especially for
verbose, human-readable data formats, this can reduce the number of execs by
10-40% or so without an appreciable drop in coverage. In extreme cases, such
as normally block-aligned tar archives, the gains can be as high as 90%.
deterministic fuzzing - and the fuzzer may proceed straight to random tweaks.
Especially for verbose, human-readable data formats, this can reduce the number
of execs by 10-40% or so without an appreciable drop in coverage. In extreme
cases, such as normally block-aligned tar archives, the gains can be as high as
90%.

Because the underlying "effector maps" are local every queue entry and remain
in force only during deterministic stages that do not alter the size or the
Expand Down Expand Up @@ -353,6 +359,14 @@ the grammar of highly verbose and complex languages such as JavaScript, SQL,
or XML; several examples of generated SQL statements are given in the blog
post mentioned above.

Interestingly, the AFL instrumentation also allows the fuzzer to automatically
isolate syntax tokens already present in an input file. It can do so by looking
for run of bytes that, when flipped, produce a consistent change to the
program's execution path; this is suggestive of an underlying atomic comparison
to a predefined value baked into the code. The fuzzer relies on this signal
to build compact "auto dictionaries" that are then used in conjunction with
other fuzzing strategies.

8) De-duping crashes
--------------------

Expand Down Expand Up @@ -416,22 +430,25 @@ With fast targets, the fork server can offer considerable performance gains,
usually between 1.5x and 2x. It is also possible to:

- Use the fork server in manual ("deferred") mode, skipping over larger,
user-selected chunks of initialization code. With some targets, this can
user-selected chunks of initialization code. It requires very modest
code changes to the targeted program, and With some targets, can
produce 10x+ performance gains.

- Enable "persistent" mode, where a single process is used to try out
multiple inputs, greatly limiting the overhead of repetitive fork()
calls. As with the previous mode, this requires custom modifications,
calls. This generally requires some code changes to the targeted program,
but can improve the performance of fast targets by a factor of 5 or more
- approximating the benefits of in-process fuzzing jobs.
- approximating the benefits of in-process fuzzing jobs while still
maintaining very robust isolation between the fuzzer process and the
targeted binary.

11) Parallelization
-------------------

The parallelization mechanism relies on periodically examining the queues
produced by independently-running instances on other CPU cores or on remote
machines, and then selectively pulling in the test cases that produce behaviors
not yet seen by the fuzzer at hand.
machines, and then selectively pulling in the test cases that, when tried
out locally, produce behaviors not yet seen by the fuzzer at hand.

This allows for extreme flexibility in fuzzer setup, including running synced
instances against different parsers of a common data format, often with
Expand Down
1 change: 1 addition & 0 deletions llvm_mode/afl-llvm-rt.o.c
Original file line number Diff line number Diff line change
Expand Up @@ -26,6 +26,7 @@
#include <stdlib.h>
#include <signal.h>
#include <unistd.h>
#include <string.h>
#include <assert.h>

#include <sys/mman.h>
Expand Down

0 comments on commit 323560e

Please sign in to comment.