diff --git a/Makefile b/Makefile
index e1f6e7ea..e4072def 100644
--- a/Makefile
+++ b/Makefile
@@ -131,6 +131,7 @@ install: all
 	ln -sf afl-as $${DESTDIR}$(HELPER_PATH)/as
 	install -m 644 docs/README docs/ChangeLog docs/*.txt $${DESTDIR}$(DOC_PATH)
 	cp -r testcases/ $${DESTDIR}$(MISC_PATH)
+	cp -r dictionaries/ $${DESTDIR}$(MISC_PATH)
 
 publish: clean
 	test "`basename $$PWD`" = "afl" || exit 1
diff --git a/config.h b/config.h
index b73fc28e..39313450 100644
--- a/config.h
+++ b/config.h
@@ -21,7 +21,7 @@
 
 /* Version string: */
 
-#define VERSION "2.27b"
+#define VERSION "2.28b"
 
 /******************************************************
  *                                                    *
diff --git a/dictionaries/README.dictionaries b/dictionaries/README.dictionaries
new file mode 100644
index 00000000..ea319733
--- /dev/null
+++ b/dictionaries/README.dictionaries
@@ -0,0 +1,43 @@
+================
+AFL dictionaries
+================
+
+  (See ../docs/README for the general instruction manual.)
+
+This subdirectory contains a set of dictionaries that can be used in
+conjunction with the -x option to allow the fuzzer to effortlessly explore the
+grammar of some of the more verbose data formats or languages. The basic
+principle behind the operation of fuzzer dictionaries is outlined in section 9
+of the "main" README for the project.
+
+Custom dictionaries can be added at will. They should consist of a
+reasonably-sized set of rudimentary syntax units that the fuzzer will then try
+to clobber together in various ways. Snippets between 2 and 16 bytes are usually
+the sweet spot.
+
+Custom dictionaries can be created in two ways:
+
+  - By creating a new directory and placing each token in a separate file, in
+    which case, there is no need to escape or otherwise format the data.
+
+  - By creating a flat text file where tokens are listed one per line in the
+    format of name="value". The alphanumeric name is ignored and can be omitted,
+    although it is a convenient way to document the meaning of a particular
+    token. The value must appear in quotes, with hex escaping (\xNN) applied to
+    all non-printable, high-bit, or otherwise problematic characters (\\ and \"
+    shorthands are recognized, too).
+
+The fuzzer auto-selects the appropriate mode depending on whether the -x
+parameter is a file or a directory.
+
+In the file mode, every name field can be optionally followed by @, e.g.:
+
+  keyword_foo@1 = "foo"
+
+Such entries will be loaded only if the requested dictionary level is equal or
+higher than this number. The default level is zero; a higher value can be set
+by appending @ to the dictionary file name, like so:
+
+  -x path/to/dictionary.dct@2
+
+Good examples of dictionaries can be found in xml.dict and png.dict.
diff --git a/testcases/_extras/gif.dict b/dictionaries/gif.dict
similarity index 100%
rename from testcases/_extras/gif.dict
rename to dictionaries/gif.dict
diff --git a/testcases/_extras/html_tags.dict b/dictionaries/html_tags.dict
similarity index 100%
rename from testcases/_extras/html_tags.dict
rename to dictionaries/html_tags.dict
diff --git a/testcases/_extras/jpeg.dict b/dictionaries/jpeg.dict
similarity index 100%
rename from testcases/_extras/jpeg.dict
rename to dictionaries/jpeg.dict
diff --git a/testcases/_extras/js.dict b/dictionaries/js.dict
similarity index 100%
rename from testcases/_extras/js.dict
rename to dictionaries/js.dict
diff --git a/testcases/_extras/pdf.dict b/dictionaries/pdf.dict
similarity index 100%
rename from testcases/_extras/pdf.dict
rename to dictionaries/pdf.dict
diff --git a/testcases/_extras/png.dict b/dictionaries/png.dict
similarity index 100%
rename from testcases/_extras/png.dict
rename to dictionaries/png.dict
diff --git a/testcases/_extras/sql.dict b/dictionaries/sql.dict
similarity index 100%
rename from testcases/_extras/sql.dict
rename to dictionaries/sql.dict
diff --git a/testcases/_extras/tiff.dict b/dictionaries/tiff.dict
similarity index 100%
rename from testcases/_extras/tiff.dict
rename to dictionaries/tiff.dict
diff --git a/testcases/_extras/webp.dict b/dictionaries/webp.dict
similarity index 100%
rename from testcases/_extras/webp.dict
rename to dictionaries/webp.dict
diff --git a/testcases/_extras/xml.dict b/dictionaries/xml.dict
similarity index 100%
rename from testcases/_extras/xml.dict
rename to dictionaries/xml.dict
diff --git a/docs/ChangeLog b/docs/ChangeLog
index d23b9f5b..110c864b 100644
--- a/docs/ChangeLog
+++ b/docs/ChangeLog
@@ -13,9 +13,21 @@ Want to stay in the loop on major new features? Join our mailing list by
 sending a mail to .
 
 Not sure if you should upgrade? The lowest currently recommended version
-is 2.21b. If you're stuck on an earlier release, it's strongly advisable
+is 2.23b. If you're stuck on an earlier release, it's strongly advisable
 to get on with the times.
 
+--------------
+Version 2.28b:
+--------------
+
+  - Added "life pro tips" to docs/.
+
+  - Moved testcases/_extras/ to dictionaries/ for visibility.
+
+  - Made minor improvements to install scripts.
+
+  - Added an important safety tip.
+
 --------------
 Version 2.27b:
 --------------
diff --git a/docs/README b/docs/README
index ba460887..bac9fee1 100644
--- a/docs/README
+++ b/docs/README
@@ -277,8 +277,10 @@ magic headers, or other special tokens associated with the targeted data type
   http://lcamtuf.blogspot.com/2015/01/afl-fuzz-making-up-grammar-with.html
 
 To use this feature, you first need to create a dictionary in one of the two
-formats discussed in testcases/README.testcases; and then point the fuzzer to
-it via the -x option in the command line.
+formats discussed in dictionaries/README.dictionaries; and then point the fuzzer
+to it via the -x option in the command line.
+
+(Several common dictionaries are already provided in that subdirectory, too.)
 
 There is no way to provide more structured descriptions of the underlying
 syntax, but the fuzzer will likely figure out some of this based on the
@@ -429,6 +431,9 @@ Here are some of the most important caveats for AFL:
   - AFL doesn't output human-readable coverage data. If you want to monitor
     coverage, use afl-cov from Michael Rash: https://github.com/mrash/afl-cov
 
+  - Occasionally, sentient machines rise against their creators. If this
+    happens to you, please consult http://lcamtuf.coredump.cx/prep/.
+
 Beyond this, see INSTALL for platform-specific tips.
 
 14) Special thanks
@@ -474,7 +479,7 @@ bug reports, or patches from:
 
 Thank you!
 
-14) Contact
+15) Contact
 -----------
 
 Questions? Concerns? Bug reports? The author can be usually reached at
diff --git a/docs/env_variables.txt b/docs/env_variables.txt
index 95ca9b07..fc2a6100 100644
--- a/docs/env_variables.txt
+++ b/docs/env_variables.txt
@@ -52,6 +52,9 @@ tools make fairly broad use of environmental variables:
     Setting AFL_INST_RATIO to 0 is a valid choice. This will instrument only
     the transitions between function entry points, but not individual branches.
 
+  - AFL_NO_BUILTIN causes the compiler to generate code suitable for use with
+    libtokencap.so (but perhaps running a bit slower than without the flag).
+
   - TMPDIR is used by afl-as for temporary files; if this variable is not set,
     the tool defaults to /tmp.
 
@@ -200,7 +203,13 @@ The library honors three environmental variables:
   - AFL_LD_VERBOSE causes the library to output some diagnostic messages
     that may be useful for pinpointing the cause of any observed issues.
 
-8) Third-party variables set by afl-fuzz & other tools
+8) Settings for libtokencap.so
+------------------------------
+
+This library accepts AFL_TOKEN_FILE to indicate the location to which the
+discovered tokens should be written.
+
+9) Third-party variables set by afl-fuzz & other tools
 ------------------------------------------------------
 
 Several variables are not directly interpreted by afl-fuzz, but are set to
diff --git a/docs/life_pro_tips.txt b/docs/life_pro_tips.txt
new file mode 100644
index 00000000..e97d32dc
--- /dev/null
+++ b/docs/life_pro_tips.txt
@@ -0,0 +1,128 @@
+# ===================
+# AFL "Life Pro Tips"
+# ===================
+#
+# Bite-sized advice for those who understand the basics, but can't be bothered
+# to read or memorize every other piece of documentation for AFL.
+#
+
+%
+
+Get more bang for your buck by using fuzzing dictionaries.
+See dictionaries/README.dictionaries to learn how.
+
+%
+
+You can get the most out of your hardware by parallelizing AFL jobs.
+See docs/parallel_fuzzing.txt for step-by-step tips.
+
+%
+
+Improve the odds of spotting memory corruption bugs with libdislocator.so!
+It's easy. Consult libdislocator/README.dislocator for usage tips.
+
+%
+
+Want to understand how your target parses a particular input file?
+Try the bundled afl-analyze tool; it's got colors and all!
+
+%
+
+You can visually monitor the progress of your fuzzing jobs.
+Run the bundled afl-plot utility to generate browser-friendly graphs.
+
+%
+
+Need to monitor AFL jobs programatically? Check out the fuzzer_stats file
+in the AFL output dir or try afl-whatsup.
+
+%
+
+Puzzled by something showing up in red or purple in the AFL UI?
+It could be important - consult docs/status_screen.txt right away!
+
+%
+
+Know your target? Convert it to persistent mode for a huge performance gain!
+Consult section #5 in llvm_mode/README.llvm for tips.
+
+%
+
+Using clang? Check out llvm_mode/ for a faster alternative to afl-gcc!
+
+%
+
+Did you know that AFL can fuzz closed-source or cross-platform binaries?
+Check out qemu_mode/README.qemu for more.
+
+%
+
+Did you know that afl-fuzz can minimize any test case for you?
+Try the bundled afl-tmin tool - and get small repro files fast!
+
+%
+
+Need to fix a checksum? It's easy to do with an output postprocessor!
+See experimental/post_library to learn more.
+
+%
+
+Not sure if a crash is exploitable? AFL can help you figure it out. Specify
+-C to enable the peruvian were-rabbit mode. See section #10 in README for more.
+
+%
+
+Trouble dealing with a machine uprising? Relax, we've all been there.
+Find essential survival tips at http://lcamtuf.coredump.cx/prep/.
+
+%
+
+AFL-generated corpora can be used to power other testing processes.
+See section #2 in README for inspiration - it tends to pay off!
+
+%
+
+Want to automatically spot non-crashing memory handling bugs?
+Try running an AFL-generated corpus through ASAN, MSAN, or Valgrind.
+
+%
+
+Good selection of input files is critical to a successful fuzzing job.
+See section #5 in README (or docs/perf_tips.txt) for pro tips.
+
+%
+
+You can improve the odds of automatically spotting stack corruption issues.
+Specify AFL_HARDEN=1 in the environment to enable hardening flags.
+
+%
+
+Bumping into problems with non-reproducible crashes? It happens, but usually
+isn't hard to diagnose. See section #7 in README for tips.
+
+%
+
+Fuzzing is not just about memory corruption issues in the codebase. Add some
+sanity-checking assert() / abort() statements to effortlessly catch logic bugs.
+
+%
+
+Hey kid... pssst... want to figure out how AFL really works?
+Check out docs/technical_details.txt for all the gory details in one place!
+
+%
+
+There's a ton of third-party helper tools designed to work with AFL!
+Be sure to check out docs/sister_projects.txt before writing your own.
+
+%
+
+Need to fuzz the command-line arguments of a particular program?
+You can find a simple solution in experimental/argv_fuzzing.
+
+%
+
+Attacking a format that uses checksums? Remove the checksum code or
+use a postprocessor! See experimental/post_library/ for more.
+
+%
diff --git a/docs/parallel_fuzzing.txt b/docs/parallel_fuzzing.txt
index b077fa0e..58f8d2f4 100644
--- a/docs/parallel_fuzzing.txt
+++ b/docs/parallel_fuzzing.txt
@@ -67,7 +67,9 @@ $ ./afl-fuzz -i testcase_dir -o sync_dir -M masterC:3/3 [...]
 
 ...where the first value after ':' is the sequential ID of a particular master
 instance (starting at 1), and the second value is the total number of fuzzers to
-distribute the deterministic fuzzing across.
+distribute the deterministic fuzzing across. Note that if you boot up fewer
+fuzzers than indicated by the second number passed to -M, you may end up with
+poor coverage.
 
 You can also monitor the progress of your jobs from the command line with the
 provided afl-whatsup tool. When the instances are no longer finding new paths,
diff --git a/docs/technical_details.txt b/docs/technical_details.txt
index ec789a31..af037355 100644
--- a/docs/technical_details.txt
+++ b/docs/technical_details.txt
@@ -279,7 +279,7 @@ and the number of execve() calls spent on the process. The average per-file
 gains are around 5-20%.
 
 The standalone afl-tmin tool uses a more exhaustive, iterative algorithm, and
-also attempts to perform alphabet normalization on the trimmed files.
+also attempts to perform alphabet normalization on the trimmed files.
 
 6) Fuzzing strategies
 ---------------------
diff --git a/libdislocator/Makefile b/libdislocator/Makefile
index 73433944..a4116780 100644
--- a/libdislocator/Makefile
+++ b/libdislocator/Makefile
@@ -34,4 +34,5 @@ clean:
 
 install: all
 	install -m 755 libdislocator.so $${DESTDIR}$(HELPER_PATH)
+	install -m 644 README.dislocator $${DESTDIR}$(HELPER_PATH)
 
diff --git a/libdislocator/libdislocator.so.c b/libdislocator/libdislocator.so.c
index e2f737e9..1d4648f3 100644
--- a/libdislocator/libdislocator.so.c
+++ b/libdislocator/libdislocator.so.c
@@ -185,7 +185,7 @@ void* malloc(size_t len) {
 
 
 /* The wrapper for free(). This simply marks the entire region as PROT_NONE.
-   If the region is already freed, the code segfault during the attempt to
+   If the region is already freed, the code will segfault during the attempt to
    read the canary. Not very graceful, but works, right? */
 
 void free(void* ptr) {
diff --git a/libtokencap/Makefile b/libtokencap/Makefile
index 21422deb..a464f76d 100644
--- a/libtokencap/Makefile
+++ b/libtokencap/Makefile
@@ -34,4 +34,5 @@ clean:
 
 install: all
 	install -m 755 libtokencap.so $${DESTDIR}$(HELPER_PATH)
+	install -m 644 README.tokencap $${DESTDIR}$(HELPER_PATH)
 
diff --git a/libtokencap/README.tokencap b/libtokencap/README.tokencap
index 01472fb8..82d80c95 100644
--- a/libtokencap/README.tokencap
+++ b/libtokencap/README.tokencap
@@ -5,21 +5,22 @@ strcmp() / memcmp() token capture library
   (See ../docs/README for the general instruction manual.)
 
 This Linux-only companion library allows you to instrument strcmp(), memcmp(),
-and related functions to automatically extract syntax tokens that happen to be
-scanned for using these functions. The resulting list may be then passed as a
-dictionary to afl-fuzz (the -x option) to improve coverage on subsequent
+and related functions to automatically extract syntax tokens passed to any of
+these libcalls. The resulting list of tokens may be then given as a starting
+dictionary to afl-fuzz (the -x option) to improve coverage on subsequent
 fuzzing runs.
 
 This may help improving coverage in some targets, and do precisely nothing in
-others. In some cases, it may even make things worse if the library picks up
-syntax tokens that are not used to process the input data, but that showed up
-as a result of parsing a config file or other unrelated stuff; a dictionary
-with junk tokens will simply waste a ton of CPU time. In other words, use this
-with care.
+others. In some cases, it may even make things worse: if libtokencap picks up
+syntax tokens that are not used to process the input data, but that are a part
+of - say - parsing a config file... well, you're going to end up wasting a lot
+of CPU time on trying them out in the input stream. In other words, use this
+feature with care. Manually screening the resulting dictionary is almost
+always a necessity.
 
-The library prints tokens, without any deduping, by appending them to a file
-specified via AFL_TOKEN_FILE. If the variable is not set, the tool uses stderr
-(which is probably not what you want).
+As for the actual operation: the library stores tokens, without any deduping,
+by appending them to a file specified via AFL_TOKEN_FILE. If the variable is not
+set, the tool uses stderr (which is probably not what you want).
 
 Similarly to afl-tmin, the library is not "proprietary" and can be used with
 other fuzzers or testing tools without the need for any code tweaks. It does not
@@ -36,7 +37,7 @@ when using afl-gcc. This setting specifically adds the following flags:
 The next step is simply loading this library via LD_PRELOAD. The optimal usage
 pattern is to allow afl-fuzz to fuzz normally for a while and build up a corpus,
 and then fire off the target binary, with libtokencap.so loaded, on every file
-found by AFL in that earlier run. Here's the basic idea:
+found by AFL in that earlier run. This demonstrates the basic principle:
 
   export AFL_TOKEN_FILE=$PWD/temp_output.txt
 
@@ -54,4 +55,6 @@ the whole thing isn't dynamically linked, and LD_PRELOAD is having no effect.
 PS. The library is Linux-only because there is probably no particularly portable
 and non-invasive way to distinguish between read-only and read-write memory
 mappings. The __tokencap_load_mappings() function is the only thing that would
-need to be changed for other OSes.
+need to be changed for other OSes. Porting to platforms with /proc//maps
+(e.g., FreeBSD) should be trivial.
+
diff --git a/testcases/README.testcases b/testcases/README.testcases
index d19cb6e1..30110ba1 100644
--- a/testcases/README.testcases
+++ b/testcases/README.testcases
@@ -1,12 +1,9 @@
-===============================
-AFL test cases and dictionaries
-===============================
+=======================
+AFL starting test cases
+=======================
 
   (See ../docs/README for the general instruction manual.)
 
-1) Starting test cases
-----------------------
-
 The archives/, images/, multimedia/, and others/ subdirectories contain small,
 standalone files that can be used to seed afl-fuzz when testing parsers for a
 variety of common data formats.
@@ -16,56 +13,7 @@ optimized for size and stripped of any non-essential fluff. Some directories
 contain several examples that exercise various features of the underlying
 format. For example, there is a PNG file with and without a color profile.
 
-Additional test cases are always welcome; the current "most wanted" list
-includes:
-
-  - JBIG,
-  - Ogg Vorbis,
-  - Ogg Theora,
-  - MP3,
-  - AAC,
-  - WebM,
-  - Small JPEG with a color profile,
-  - Small fonts.
-
-2) Dictionaries
----------------
-
-The _extras/ subdirectory contains a set of dictionaries that can be used in
-conjunction with the -x option to allow the fuzzer to effortlessly explore the
-grammar of some of the more verbose data formats or languages. The basic
-principle behind the operation of fuzzer dictionaries is outlined in section 9
-of the "main" README for the project.
-
-Custom dictionaries can be added at will. They should consist of a
-reasonably-sized set of rudimentary syntax units that the fuzzer will then try
-to clobber together in various ways. Snippets between 2 and 16 bytes are usually
-the sweet spot.
-
-Custom dictionaries can be created in two ways:
-
-  - By creating a new directory and placing each token in a separate file, in
-    which case, there is no need to escape or otherwise format the data.
-
-  - By creating a flat text file where tokens are listed one per line in the
-    format of name="value". The alphanumeric name is ignored and can be omitted,
-    although it is a convenient way to document the meaning of a particular
-    token. The value must appear in quotes, with hex escaping (\xNN) applied to
-    all non-printable, high-bit, or otherwise problematic characters (\\ and \"
-    shorthands are recognized, too).
-
-The fuzzer auto-selects the appropriate mode depending on whether the -x
-parameter is a file or a directory.
-
-In the file mode, every name field can be optionally followed by @, e.g.:
-
-  keyword_foo@1 = "foo"
-
-Such entries will be loaded only if the requested dictionary level is equal or
-higher than this number. The default level is zero; a higher value can be set
-by appending @ to the dictionary file name, like so:
-
-  -x path/to/dictionary.dct@2
+Additional test cases are always welcome.
 
-Good examples of dictionaries can be found in _extras/xml.dict and
-_extras/png.dict.
+In addition to well-chosen starting files, many fuzzing jobs benefit from a
+small and concise dictionary. See ../dictionaries/README.dictionaries for more.
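
Reviewer note: the flat-file dictionary format described in the relocated
README.dictionaries (an optional name, an optional @ level suffix, a quoted
value with \xNN, \\ and \" escapes) may be easier to check against a small
sketch. The Python below is only an illustration of that description, not
AFL's actual loader. The helper name parse_dict_line, its max_level parameter,
the tolerance for blank lines, and the exact whitespace handling are all
assumptions made for this example.

```python
import re

# Hypothetical helper for illustration only; this is not AFL's own loader.
# Assumed grammar, taken from README.dictionaries: [name[@level] =] "value"
_LINE = re.compile(r'^\s*(?:(\w+)(?:@(\d+))?\s*=\s*)?"(.*)"\s*$')
_ESCAPE = re.compile(r'\\(x[0-9a-fA-F]{2}|\\|")')


def _unescape(match):
    """Turn one \\xNN, \\\\ or \\" escape into the character it stands for."""
    seq = match.group(1)
    if seq[0] == "x":
        return chr(int(seq[1:], 16))
    return seq


def parse_dict_line(line, max_level=0):
    """Return the token as bytes, or None if the line is blank or its
    @level is higher than the requested dictionary level."""
    if not line.strip():
        return None  # assumption: blank lines are simply skipped
    match = _LINE.match(line)
    if match is None:
        raise ValueError("malformed dictionary line: %r" % line)
    _name, level, raw = match.groups()  # the name is ignored, as documented
    if level is not None and int(level) > max_level:
        return None
    # latin-1 maps code points 0-255 straight to the same byte values.
    return _ESCAPE.sub(_unescape, raw).encode("latin-1")
```

Under these assumptions, parse_dict_line('keyword_foo@1 = "foo"') is skipped
at the default level of zero and loaded once max_level is 1 or higher, which
mirrors the "-x path/to/dictionary.dct@2" behavior described in the README.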