diff --git a/afl-fuzz.c b/afl-fuzz.c
index 0feb000f..4a863020 100644
--- a/afl-fuzz.c
+++ b/afl-fuzz.c
@@ -7961,7 +7961,7 @@ int main(int argc, char** argv) {
 stop_fuzzing:
 
   SAYF(CURSOR_SHOW cLRD "\n\n+++ Testing aborted %s +++\n" cRST,
-       stop_soon == 2 ? "programatically" : "by user");
+       stop_soon == 2 ? "programmatically" : "by user");
 
   /* Running for more than 30 minutes but still doing first cycle? */
 
diff --git a/config.h b/config.h
index 39313450..c24627ff 100644
--- a/config.h
+++ b/config.h
@@ -21,7 +21,7 @@
 
 /* Version string: */
 
-#define VERSION             "2.28b"
+#define VERSION             "2.29b"
 
 /******************************************************
  *                                                    *
diff --git a/docs/ChangeLog b/docs/ChangeLog
index 110c864b..f38b19a8 100644
--- a/docs/ChangeLog
+++ b/docs/ChangeLog
@@ -16,6 +16,14 @@ Not sure if you should upgrade? The lowest currently recommended version
 is 2.23b. If you're stuck on an earlier release, it's strongly advisable
 to get on with the times.
 
+--------------
+Version 2.29b:
+--------------
+
+  - Made a minor #include fix to llvm_mode. Suggested by Jonathan Metzman.
+
+  - Made cosmetic updates to the docs.
+
 --------------
 Version 2.28b:
 --------------
diff --git a/docs/QuickStartGuide.txt b/docs/QuickStartGuide.txt
index 61541466..abe7032f 100644
--- a/docs/QuickStartGuide.txt
+++ b/docs/QuickStartGuide.txt
@@ -27,7 +27,7 @@ how to hit the ground running:
 
 4) Get a small but valid input file that makes sense to the program. When
    fuzzing verbose syntax (SQL, HTTP, etc), create a dictionary as described in
-   testcases/README.testcases, too.
+   dictionaries/README.dictionaries, too.
 
 5) If the program reads from stdin, run 'afl-fuzz' like so:
diff --git a/docs/life_pro_tips.txt b/docs/life_pro_tips.txt
index e97d32dc..df053a8e 100644
--- a/docs/life_pro_tips.txt
+++ b/docs/life_pro_tips.txt
@@ -33,7 +33,7 @@ Run the bundled afl-plot utility to generate browser-friendly graphs.
 
 %
 
-Need to monitor AFL jobs programatically? Check out the fuzzer_stats file
+Need to monitor AFL jobs programmatically? Check out the fuzzer_stats file
 in the AFL output dir or try afl-whatsup.
 
 %
@@ -62,11 +62,6 @@ Try the bundled afl-tmin tool - and get small repro files fast!
 
 %
 
-Need to fix a checksum? It's easy to do with an output postprocessor!
-See experimental/post_library to learn more.
-
-%
-
 Not sure if a crash is exploitable? AFL can help you figure it out. Specify
 -C to enable the peruvian were-rabbit mode. See section #10 in README for
 more.
@@ -122,7 +117,7 @@ You can find a simple solution in experimental/argv_fuzzing.
 
 %
 
-Attacking a format that uses checksums? Remove the checksum code or
+Attacking a format that uses checksums? Remove the checksum-checking code or
 use a postprocessor! See experimental/post_library/ for more.
 
 %
diff --git a/docs/technical_details.txt b/docs/technical_details.txt
index af037355..3ec48741 100644
--- a/docs/technical_details.txt
+++ b/docs/technical_details.txt
@@ -85,8 +85,8 @@ of dword- or qword-wide instructions and a simple loop.
 When a mutated input produces an execution trace containing new tuples, the
 corresponding input file is preserved and routed for additional processing
 later on (see section #3). Inputs that do not trigger new local-scale state
-transitions in the execution trace are discarded, even if their overall
-instrumentation output pattern is unique.
+transitions in the execution trace (i.e., produce no new tuples) are discarded,
+even if their overall control flow sequence is unique.
 
 This approach allows for a very fine-grained and long-term exploration of
 program state while not having to perform any computationally intensive and
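The tuple-novelty test referenced in the hunk above is cheap enough to run on
every execution. The following is a minimal sketch of the idea, not the actual
afl-fuzz implementation; the trace_bits / virgin_bits names and the MAP_SIZE
constant merely follow the conventions of the AFL source, and the real code
additionally tracks coarse hit-count buckets per tuple:

  #include <stddef.h>
  #include <stdint.h>

  #define MAP_SIZE (1 << 16)   /* Shared-memory trace bitmap size. */

  /* Returns nonzero if the current trace exercises tuples not seen in any
     earlier run. virgin_bits starts out as all-ones and has bits cleared
     as behaviors are observed. */
  static int has_new_coverage(const uint8_t* trace_bits, uint8_t* virgin_bits) {

    int ret = 0;
    size_t i;

    for (i = 0; i < MAP_SIZE; i++)
      if (trace_bits[i] & virgin_bits[i]) {
        virgin_bits[i] &= ~trace_bits[i];   /* Mark these bits as seen. */
        ret = 1;
      }

    return ret;

  }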
@@ -101,7 +101,7 @@ new tuples (CA, AE):
 
   #2: A -> B -> C -> A -> E
 
 At the same time, with #2 processed, the following pattern will not be seen
-as unique, despite having a markedly different execution path:
+as unique, despite having a markedly different overall execution path:
 
   #3: A -> B -> C -> A -> B -> C -> A -> B -> C -> D -> E
 
@@ -142,9 +142,9 @@ Mutated test cases that produced new state transitions within the program are
 added to the input queue and used as a starting point for future rounds of
 fuzzing. They supplement, but do not automatically replace, existing finds.
 
-This approach allows the tool to progressively explore various disjoint and
-possibly mutually incompatible features of the underlying data format, as
-shown in this image:
+In contrast to more greedy genetic algorithms, this approach allows the tool
+to progressively explore various disjoint and possibly mutually incompatible
+features of the underlying data format, as shown in this image:
 
   http://lcamtuf.coredump.cx/afl/afl_gzip.png
 
@@ -201,10 +201,10 @@ the sessions were seeded with a valid unified diff:
 
     Edge coverage     | 1,259      | 1,734       | 1.72        | 0
     AFL model         | 1,452      | 2,040       | 3.16        | 1
 
-Some of the earlier work on evolutionary fuzzing suggested maintaining just a
-single test case and selecting for mutations that improve coverage. At least
-in the tests described above, this "greedy" method appeared to offer no
-substantial benefits over blind fuzzing.
+As noted earlier on, some of the prior work on genetic fuzzing relied on
+maintaining a single test case and evolving it to maximize coverage. At least
+in the tests described above, this "greedy" approach appears to confer no
+substantial benefits over blind fuzzing strategies.
 
 4) Culling the corpus
 ---------------------
@@ -263,9 +263,9 @@ files make the target binary slower, and because they reduce the likelihood
 that a mutation would touch important format control structures, rather than
 redundant data blocks. This is discussed in more detail in perf_tips.txt.
 
-The possibility of a bad starting corpus provided by the user aside, some
-types of mutations can have the effect of iteratively increasing the size of
-the generated files, so it is important to counter this trend.
+The possibility that the user will provide a low-quality starting corpus aside,
+some types of mutations can have the effect of iteratively increasing the size
+of the generated files, so it is important to counter this trend.
 
 Luckily, the instrumentation feedback provides a simple way to automatically
 trim down input files while ensuring that the changes made to the files have no
@@ -275,8 +275,8 @@ The built-in trimmer in afl-fuzz attempts to sequentially remove blocks of data
 with variable length and stepover; any deletion that doesn't affect the checksum
 of the trace map is committed to disk. The trimmer is not designed to be
 particularly thorough; instead, it tries to strike a balance between precision
-and the number of execve() calls spent on the process. The average per-file
-gains are around 5-20%.
+and the number of execve() calls spent on the process, selecting the block size
+and stepover to match. The average per-file gains are around 5-20%.
 
 The standalone afl-tmin tool uses a more exhaustive, iterative algorithm, and
 also attempts to perform alphabet normalization on the trimmed files.
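The trimming loop described in the hunk above is simple to picture. This is a
simplified illustration rather than the real trim_case() logic; the helper
run_without_block() is a hypothetical stand-in for re-executing the target on
the input minus one block, and the 16-to-1024 step progression loosely mirrors
the TRIM_START_STEPS / TRIM_END_STEPS defaults in config.h:

  #include <stddef.h>
  #include <stdint.h>
  #include <string.h>

  /* Hypothetical helper: runs the target on buf with the given block
     removed and returns the checksum of the resulting trace map. */
  uint32_t run_without_block(const uint8_t* buf, size_t len,
                             size_t pos, size_t remove_len);

  /* Simplified trimmer: try progressively smaller blocks, committing any
     deletion that leaves the trace checksum (i.e., behavior) unchanged. */
  static void trim_sketch(uint8_t* buf, size_t* len, uint32_t orig_cksum) {

    size_t steps;

    for (steps = 16; steps <= 1024; steps *= 2) {

      size_t remove_len = *len / steps;
      size_t pos = 0;

      if (!remove_len) break;

      while (pos + remove_len <= *len) {

        if (run_without_block(buf, *len, pos, remove_len) == orig_cksum) {

          /* Same behavior without this block - delete it for good. */
          memmove(buf + pos, buf + pos + remove_len,
                  *len - pos - remove_len);
          *len -= remove_len;

        } else pos += remove_len;   /* Block matters; step over it. */

      }

    }

  }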
@@ -302,11 +302,16 @@ strategies include:
 
   - Sequential insertion of known interesting integers (0, 1, INT_MAX, etc),
 
-The non-deterministic steps include stacked bit flips, insertions, deletions,
-arithmetics, and splicing of different test cases.
+The purpose of opening with deterministic steps is related to their tendency to
+produce compact test cases and small diffs between the non-crashing and crashing
+inputs.
 
-Their relative yields and execve() costs have been investigated and are
-discussed in the aforementioned blog post.
+With deterministic fuzzing out of the way, the non-deterministic steps include
+stacked bit flips, insertions, deletions, arithmetics, and splicing of different
+test cases.
+
+The relative yields and execve() costs of all these strategies have been
+investigated and are discussed in the aforementioned blog post.
 
 For the reasons discussed in historical_notes.txt (chiefly, performance,
 simplicity, and reliability), AFL generally does not try to reason about the
@@ -315,13 +320,14 @@ are nominally blind, and are guided only by the evolutionary design of the
 input queue.
 
 That said, there is one (trivial) exception to this rule: when a new queue
-entry goes through the initial set of deterministic fuzzing steps, and some
-regions in the file are observed to have no effect on the checksum of the
+entry goes through the initial set of deterministic fuzzing steps, and tweaks to
+some regions in the file are observed to have no effect on the checksum of the
 execution path, they may be excluded from the remaining phases of
-deterministic fuzzing - and proceed straight to random tweaks. Especially for
-verbose, human-readable data formats, this can reduce the number of execs by
-10-40% or so without an appreciable drop in coverage. In extreme cases, such
-as normally block-aligned tar archives, the gains can be as high as 90%.
+deterministic fuzzing - and the fuzzer may proceed straight to random tweaks.
+Especially for verbose, human-readable data formats, this can reduce the number
+of execs by 10-40% or so without an appreciable drop in coverage. In extreme
+cases, such as normally block-aligned tar archives, the gains can be as high as
+90%.
 
 Because the underlying "effector maps" are local to every queue entry and remain
 in force only during deterministic stages that do not alter the size or the
@@ -353,6 +359,14 @@ the grammar of highly verbose and complex languages such as JavaScript, SQL,
 or XML; several examples of generated SQL statements are given in the blog
 post mentioned above.
 
+Interestingly, the AFL instrumentation also allows the fuzzer to automatically
+isolate syntax tokens already present in an input file. It can do so by looking
+for runs of bytes that, when flipped, produce a consistent change to the
+program's execution path; this is suggestive of an underlying atomic comparison
+to a predefined value baked into the code. The fuzzer relies on this signal
+to build compact "auto dictionaries" that are then used in conjunction with
+other fuzzing strategies.
+
 8) De-duping crashes
 --------------------
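The effector-map optimization mentioned above also fits in a few lines. In this
hypothetical sketch, run_with_byte_xor() stands in for executing the target
with a single input byte inverted; bytes whose inversion never changes the path
checksum are flagged as non-effectors so that later deterministic stages can
skip them:

  #include <stddef.h>
  #include <stdint.h>

  /* Hypothetical helper: runs the target on a copy of buf with byte i
     XORed with 0xFF, returning the checksum of the execution path. */
  uint32_t run_with_byte_xor(const uint8_t* buf, size_t len, size_t i);

  /* eff_map[i] ends up nonzero only for "effector" bytes - those whose
     modification actually influences the observed execution path. */
  static void build_effector_map(const uint8_t* buf, size_t len,
                                 uint32_t orig_cksum, uint8_t* eff_map) {

    size_t i;

    for (i = 0; i < len; i++)
      eff_map[i] = (run_with_byte_xor(buf, len, i) != orig_cksum);

  }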
@@ -416,22 +430,25 @@ With fast targets, the fork server can offer considerable performance gains,
 usually between 1.5x and 2x. It is also possible to:
 
   - Use the fork server in manual ("deferred") mode, skipping over larger,
-    user-selected chunks of initialization code. With some targets, this can
+    user-selected chunks of initialization code. It requires very modest
+    code changes to the targeted program, and with some targets, can
     produce 10x+ performance gains.
 
   - Enable "persistent" mode, where a single process is used to try out
     multiple inputs, greatly limiting the overhead of repetitive fork()
-    calls. As with the previous mode, this requires custom modifications,
+    calls. This generally requires some code changes to the targeted program,
     but can improve the performance of fast targets by a factor of 5 or more
-    - approximating the benefits of in-process fuzzing jobs.
+    - approximating the benefits of in-process fuzzing jobs while still
+    maintaining very robust isolation between the fuzzer process and the
+    targeted binary.
 
 11) Parallelization
 -------------------
 
 The parallelization mechanism relies on periodically examining the queues
 produced by independently-running instances on other CPU cores or on remote
-machines, and then selectively pulling in the test cases that produce behaviors
-not yet seen by the fuzzer at hand.
+machines, and then selectively pulling in the test cases that, when tried
+out locally, produce behaviors not yet seen by the fuzzer at hand.
 
 This allows for extreme flexibility in fuzzer setup, including running synced
 instances against different parsers of a common data format, often with
diff --git a/llvm_mode/afl-llvm-rt.o.c b/llvm_mode/afl-llvm-rt.o.c
index 62768d6f..5ac8861b 100644
--- a/llvm_mode/afl-llvm-rt.o.c
+++ b/llvm_mode/afl-llvm-rt.o.c
@@ -26,6 +26,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
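To make the persistent mode discussed in the fork-server hunk concrete, here is
a minimal harness of the sort documented in llvm_mode/README.llvm. This is only
a sketch: fuzz_one_input() is a hypothetical stand-in for the library code
under test, and the __AFL_LOOP() macro is assumed to be available when the
harness is compiled with afl-clang-fast:

  #include <stddef.h>
  #include <unistd.h>

  /* Hypothetical entry point into the library being fuzzed. */
  void fuzz_one_input(const unsigned char* data, size_t len);

  int main(void) {

    static unsigned char buf[4096];

    /* Let one process handle up to 1000 inputs before being respawned,
       capping the impact of accumulated leaks or stale global state. */
    while (__AFL_LOOP(1000)) {

      ssize_t len = read(0, buf, sizeof(buf));   /* Fresh input on stdin. */

      if (len > 0) fuzz_one_input(buf, (size_t) len);

      /* Reset any global state touched by the library before looping. */

    }

    return 0;

  }

When afl-fuzz sees a binary built this way, it switches to the persistent
logic automatically; no extra command-line flags should be needed.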