Commit 2.28b

thomas-huet committed Aug 7, 2016
1 parent 84bb23e commit b7a4a5f
Showing 24 changed files with 233 additions and 80 deletions.
1 change: 1 addition & 0 deletions Makefile
@@ -131,6 +131,7 @@ install: all
ln -sf afl-as $${DESTDIR}$(HELPER_PATH)/as
install -m 644 docs/README docs/ChangeLog docs/*.txt $${DESTDIR}$(DOC_PATH)
cp -r testcases/ $${DESTDIR}$(MISC_PATH)
cp -r dictionaries/ $${DESTDIR}$(MISC_PATH)

publish: clean
test "`basename $$PWD`" = "afl" || exit 1
2 changes: 1 addition & 1 deletion config.h
@@ -21,7 +21,7 @@

/* Version string: */

#define VERSION "2.27b"
#define VERSION "2.28b"

/******************************************************
* *
43 changes: 43 additions & 0 deletions dictionaries/README.dictionaries
@@ -0,0 +1,43 @@
================
AFL dictionaries
================

(See ../docs/README for the general instruction manual.)

This subdirectory contains a set of dictionaries that can be used in
conjunction with the -x option to allow the fuzzer to effortlessly explore the
grammar of some of the more verbose data formats or languages. The basic
principle behind the operation of fuzzer dictionaries is outlined in section 9
of the "main" README for the project.

Custom dictionaries can be added at will. They should consist of a
reasonably-sized set of rudimentary syntax units that the fuzzer will then try
to clobber together in various ways. Snippets between 2 and 16 bytes are usually
the sweet spot.

Custom dictionaries can be created in two ways:

- By creating a new directory and placing each token in a separate file, in
which case, there is no need to escape or otherwise format the data.

- By creating a flat text file where tokens are listed one per line in the
format of name="value". The alphanumeric name is ignored and can be omitted,
although it is a convenient way to document the meaning of a particular
token. The value must appear in quotes, with hex escaping (\xNN) applied to
all non-printable, high-bit, or otherwise problematic characters (\\ and \"
shorthands are recognized, too).

The fuzzer auto-selects the appropriate mode depending on whether the -x
parameter is a file or a directory.

In the file mode, every name field can be optionally followed by @<num>, e.g.:

keyword_foo@1 = "foo"

Such entries will be loaded only if the requested dictionary level is equal to
or higher than this number. The default level is zero; a higher value can be
set by appending @<num> to the dictionary file name, like so:

-x path/to/dictionary.dct@2

Good examples of dictionaries can be found in xml.dict and png.dict.
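
As a purely illustrative sketch (the token names and values below are made up
rather than copied from any bundled dictionary), a flat-file dictionary for a
PNG-like format could look like this:

  header_png="\x89PNG\x0d\x0a\x1a\x0a"
  section_ihdr="IHDR"
  section_iend@1="IEND"

The first two tokens are always loaded; the last one is used only when the
dictionary level is raised to 1 or higher (e.g., -x sample.dict@1).
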
10 files renamed without changes.
14 changes: 13 additions & 1 deletion docs/ChangeLog
@@ -13,9 +13,21 @@ Want to stay in the loop on major new features? Join our mailing list by
sending a mail to <[email protected]>.

Not sure if you should upgrade? The lowest currently recommended version
is 2.21b. If you're stuck on an earlier release, it's strongly advisable
is 2.23b. If you're stuck on an earlier release, it's strongly advisable
to get on with the times.

--------------
Version 2.28b:
--------------

- Added "life pro tips" to docs/.

- Moved testcases/_extras/ to dictionaries/ for visibility.

- Made minor improvements to install scripts.

- Added an important safety tip.

--------------
Version 2.27b:
--------------
11 changes: 8 additions & 3 deletions docs/README
@@ -277,8 +277,10 @@ magic headers, or other special tokens associated with the targeted data type
http://lcamtuf.blogspot.com/2015/01/afl-fuzz-making-up-grammar-with.html

To use this feature, you first need to create a dictionary in one of the two
formats discussed in testcases/README.testcases; and then point the fuzzer to
it via the -x option in the command line.
formats discussed in dictionaries/README.dictionaries; and then point the fuzzer
to it via the -x option in the command line.

(Several common dictionaries are already provided in that subdirectory, too.)
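
For example (paths and the target binary are placeholders), a run that uses the
bundled PNG dictionary could be started with:

  ./afl-fuzz -i testcase_dir -o findings_dir -x dictionaries/png.dict \
    -- /path/to/program @@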

There is no way to provide more structured descriptions of the underlying
syntax, but the fuzzer will likely figure out some of this based on the
@@ -429,6 +431,9 @@ Here are some of the most important caveats for AFL:
- AFL doesn't output human-readable coverage data. If you want to monitor
coverage, use afl-cov from Michael Rash: https://github.com/mrash/afl-cov

- Occasionally, sentient machines rise against their creators. If this
happens to you, please consult http://lcamtuf.coredump.cx/prep/.

Beyond this, see INSTALL for platform-specific tips.

14) Special thanks
@@ -474,7 +479,7 @@ bug reports, or patches from:

Thank you!

14) Contact
15) Contact
-----------

Questions? Concerns? Bug reports? The author can be usually reached at
11 changes: 10 additions & 1 deletion docs/env_variables.txt
@@ -52,6 +52,9 @@ tools make fairly broad use of environmental variables:
Setting AFL_INST_RATIO to 0 is a valid choice. This will instrument only
the transitions between function entry points, but not individual branches.

- AFL_NO_BUILTIN causes the compiler to generate code suitable for use with
libtokencap.so (but perhaps running a bit slower than without the flag).

- TMPDIR is used by afl-as for temporary files; if this variable is not set,
the tool defaults to /tmp.

@@ -200,7 +203,13 @@ The library honors three environmental variables:
- AFL_LD_VERBOSE causes the library to output some diagnostic messages
that may be useful for pinpointing the cause of any observed issues.

8) Third-party variables set by afl-fuzz & other tools
8) Settings for libtokencap.so
------------------------------

This library accepts AFL_TOKEN_FILE to indicate the location to which the
discovered tokens should be written.
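
As a minimal sketch of how the libtokencap-related settings fit together (all
paths below are placeholders; see libtokencap/README.tokencap for the full
procedure):

  AFL_NO_BUILTIN=1 CC=/path/to/afl-gcc ./configure && make
  export AFL_TOKEN_FILE=$PWD/tokens.txt
  LD_PRELOAD=/path/to/libtokencap.so /path/to/program [...params...]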

9) Third-party variables set by afl-fuzz & other tools
------------------------------------------------------

Several variables are not directly interpreted by afl-fuzz, but are set to
128 changes: 128 additions & 0 deletions docs/life_pro_tips.txt
@@ -0,0 +1,128 @@
# ===================
# AFL "Life Pro Tips"
# ===================
#
# Bite-sized advice for those who understand the basics, but can't be bothered
# to read or memorize every other piece of documentation for AFL.
#

%

Get more bang for your buck by using fuzzing dictionaries.
See dictionaries/README.dictionaries to learn how.

%

You can get the most out of your hardware by parallelizing AFL jobs.
See docs/parallel_fuzzing.txt for step-by-step tips.

%

Improve the odds of spotting memory corruption bugs with libdislocator.so!
It's easy. Consult libdislocator/README.dislocator for usage tips.

%

Want to understand how your target parses a particular input file?
Try the bundled afl-analyze tool; it's got colors and all!

%

You can visually monitor the progress of your fuzzing jobs.
Run the bundled afl-plot utility to generate browser-friendly graphs.

%

Need to monitor AFL jobs programmatically? Check out the fuzzer_stats file
in the AFL output dir or try afl-whatsup.

%

Puzzled by something showing up in red or purple in the AFL UI?
It could be important - consult docs/status_screen.txt right away!

%

Know your target? Convert it to persistent mode for a huge performance gain!
Consult section #5 in llvm_mode/README.llvm for tips.

%

Using clang? Check out llvm_mode/ for a faster alternative to afl-gcc!

%

Did you know that AFL can fuzz closed-source or cross-platform binaries?
Check out qemu_mode/README.qemu for more.

%

Did you know that afl-fuzz can minimize any test case for you?
Try the bundled afl-tmin tool - and get small repro files fast!
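
A typical call looks like this (file names are placeholders):
  ./afl-tmin -i big_repro_file -o small_repro_file -- /path/to/program [...]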

%

Need to fix a checksum? It's easy to do with an output postprocessor!
See experimental/post_library to learn more.

%

Not sure if a crash is exploitable? AFL can help you figure it out. Specify
-C to enable the peruvian were-rabbit mode. See section #10 in README for more.

%

Trouble dealing with a machine uprising? Relax, we've all been there.
Find essential survival tips at http://lcamtuf.coredump.cx/prep/.

%

AFL-generated corpora can be used to power other testing processes.
See section #2 in README for inspiration - it tends to pay off!

%

Want to automatically spot non-crashing memory handling bugs?
Try running an AFL-generated corpus through ASAN, MSAN, or Valgrind.
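
One rough way to do it (paths are placeholders; the target is assumed to read
the file named on its command line):
  for i in findings_dir/queue/id*; do /path/to/asan_build/program "$i"; done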

%

Good selection of input files is critical to a successful fuzzing job.
See section #5 in README (or docs/perf_tips.txt) for pro tips.

%

You can improve the odds of automatically spotting stack corruption issues.
Specify AFL_HARDEN=1 in the environment to enable hardening flags.

%

Bumping into problems with non-reproducible crashes? It happens, but usually
isn't hard to diagnose. See section #7 in README for tips.

%

Fuzzing is not just about memory corruption issues in the codebase. Add some
sanity-checking assert() / abort() statements to effortlessly catch logic bugs.

%

Hey kid... pssst... want to figure out how AFL really works?
Check out docs/technical_details.txt for all the gory details in one place!

%

There's a ton of third-party helper tools designed to work with AFL!
Be sure to check out docs/sister_projects.txt before writing your own.

%

Need to fuzz the command-line arguments of a particular program?
You can find a simple solution in experimental/argv_fuzzing.

%

Attacking a format that uses checksums? Remove the checksum code or
use a postprocessor! See experimental/post_library/ for more.

%
4 changes: 3 additions & 1 deletion docs/parallel_fuzzing.txt
@@ -67,7 +67,9 @@ $ ./afl-fuzz -i testcase_dir -o sync_dir -M masterC:3/3 [...]

...where the first value after ':' is the sequential ID of a particular master
instance (starting at 1), and the second value is the total number of fuzzers to
distribute the deterministic fuzzing across.
distribute the deterministic fuzzing across. Note that if you boot up fewer
fuzzers than indicated by the second number passed to -M, you may end up with
poor coverage.
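
For instance, the three-way split above is fully utilized only when all three
master instances are actually running:

$ ./afl-fuzz -i testcase_dir -o sync_dir -M masterA:1/3 [...]
$ ./afl-fuzz -i testcase_dir -o sync_dir -M masterB:2/3 [...]
$ ./afl-fuzz -i testcase_dir -o sync_dir -M masterC:3/3 [...]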

You can also monitor the progress of your jobs from the command line with the
provided afl-whatsup tool. When the instances are no longer finding new paths,
2 changes: 1 addition & 1 deletion docs/technical_details.txt
@@ -279,7 +279,7 @@ and the number of execve() calls spent on the process. The average per-file
gains are around 5-20%.

The standalone afl-tmin tool uses a more exhaustive, iterative algorithm, and
also attempts to perform alphabet normalization on the trimmed files.
also attempts to perform alphabet normalization on the trimmed files.

6) Fuzzing strategies
---------------------
1 change: 1 addition & 0 deletions libdislocator/Makefile
@@ -34,4 +34,5 @@ clean:

install: all
install -m 755 libdislocator.so $${DESTDIR}$(HELPER_PATH)
install -m 644 README.dislocator $${DESTDIR}$(HELPER_PATH)

2 changes: 1 addition & 1 deletion libdislocator/libdislocator.so.c
@@ -185,7 +185,7 @@ void* malloc(size_t len) {


/* The wrapper for free(). This simply marks the entire region as PROT_NONE.
If the region is already freed, the code segfault during the attempt to
If the region is already freed, the code will segfault during the attempt to
read the canary. Not very graceful, but works, right? */

void free(void* ptr) {
1 change: 1 addition & 0 deletions libtokencap/Makefile
@@ -34,4 +34,5 @@ clean:

install: all
install -m 755 libtokencap.so $${DESTDIR}$(HELPER_PATH)
install -m 644 README.tokencap $${DESTDIR}$(HELPER_PATH)

29 changes: 16 additions & 13 deletions libtokencap/README.tokencap
@@ -5,21 +5,22 @@ strcmp() / memcmp() token capture library
(See ../docs/README for the general instruction manual.)

This Linux-only companion library allows you to instrument strcmp(), memcmp(),
and related functions to automatically extract syntax tokens that happen to be
scanned for using these functions. The resulting list may be then passed as a
dictionary to afl-fuzz (the -x option) to improve coverage on subsequent
and related functions to automatically extract syntax tokens passed to any of
these libcalls. The resulting list of tokens may be then given as a starting
dictionary to afl-fuzz (the -x option) to improve coverage on subsequent
fuzzing runs.

This may help improving coverage in some targets, and do precisely nothing in
others. In some cases, it may even make things worse if the library picks up
syntax tokens that are not used to process the input data, but that showed up
as a result of parsing a config file or other unrelated stuff; a dictionary
with junk tokens will simply waste a ton of CPU time. In other words, use this
with care.
others. In some cases, it may even make things worse: if libtokencap picks up
syntax tokens that are not used to process the input data, but that are a part
of - say - parsing a config file... well, you're going to end up wasting a lot
of CPU time on trying them out in the input stream. In other words, use this
feature with care. Manually screening the resulting dictionary is almost
always a necessity.

The library prints tokens, without any deduping, by appending them to a file
specified via AFL_TOKEN_FILE. If the variable is not set, the tool uses stderr
(which is probably not what you want).
As for the actual operation: the library stores tokens, without any deduping,
by appending them to a file specified via AFL_TOKEN_FILE. If the variable is not
set, the tool uses stderr (which is probably not what you want).

Similarly to afl-tmin, the library is not "proprietary" and can be used with
other fuzzers or testing tools without the need for any code tweaks. It does not
@@ -36,7 +37,7 @@ when using afl-gcc. This setting specifically adds the following flags:
The next step is simply loading this library via LD_PRELOAD. The optimal usage
pattern is to allow afl-fuzz to fuzz normally for a while and build up a corpus,
and then fire off the target binary, with libtokencap.so loaded, on every file
found by AFL in that earlier run. Here's the basic idea:
found by AFL in that earlier run. This demonstrates the basic principle:

export AFL_TOKEN_FILE=$PWD/temp_output.txt

Expand All @@ -54,4 +55,6 @@ the whole thing isn't dynamically linked, and LD_PRELOAD is having no effect.
PS. The library is Linux-only because there is probably no particularly portable
and non-invasive way to distinguish between read-only and read-write memory
mappings. The __tokencap_load_mappings() function is the only thing that would
need to be changed for other OSes.
need to be changed for other OSes. Porting to platforms with /proc/<pid>/maps
(e.g., FreeBSD) should be trivial.
