Skip to content

Commit

Permalink
Add Annotating Docs (#8470)
Browse files Browse the repository at this point in the history
* `quick_start.md`: Add quick start guide

* `annotation.md`: Add section on annotating flatbuffers
  • Loading branch information
dbaileychess authored Dec 27, 2024
1 parent 0f90dc8 commit 2d86857
Show file tree
Hide file tree
Showing 2 changed files with 154 additions and 0 deletions.
2 changes: 2 additions & 0 deletions docs/mkdocs.yml
Original file line number Diff line number Diff line change
Expand Up @@ -65,4 +65,6 @@ nav:
- Overview: "schema.md"
- Evolution: "evolution.md"
- Grammar: "grammar.md"
- Advanced:
- Annotating Buffers (.afb): "annotation.md"
- Contributing: "contributing.md"
152 changes: 152 additions & 0 deletions docs/source/annotation.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,152 @@
# Annotating FlatBuffers

This provides a way to annotate flatbuffer binary data, byte-by-byte, with a
schema. It is useful for development purposes and understanding the details of
the internal format.

## Annotating

Given a `schema`, as either a plain-text (`.fbs`) or a binary schema (`.bfbs`),
and `binary` file(s) that were created by the `schema`. You can annotate them
using:

```sh
flatc --annotate SCHEMA -- BINARY_FILES...
```

This will produce a set of annotated files (`.afb` Annotated FlatBuffer)
corresponding to the input binary files.

### Example

Taken from the [tests/annotated_binary](https://github.com/google/flatbuffers/tree/master/tests/annotated_binary).

```sh
cd tests/annotated_binary
../../flatc --annotate annotated_binary.fbs -- annotated_binary.bin
```

Which will produce a `annotated_binary.afb` file in the current directory.


!!! Tip

The `annotated_binary.bin` is the flatbufer binary of the data contained
within `annotated_binary.json`, which was made by the following command:

```sh
..\..\flatc -b annotated_binary.fbs annotated_binary.json
```

## .afb Text Format

Currently there is a built-in text-based format for outputting the annotations.
A full example is shown here:
[`annotated_binary.afb`](https://github.com/google/flatbuffers/blob/master/tests/annotated_binary/annotated_binary.afb)

The data is organized as a table with fixed [columns](#columns) grouped into
Binary [sections](#binary-sections) and [regions](#binary-regions), starting
from the beginning of the binary (offset `0`).

### Columns

The columns are as follows:

1. The offset from the start of the binary, expressed in hexadecimal format
(e.g. `+0x003c`).

The prefix `+` is added to make searching for the offset (compared to some
random value) a bit easier.

2. The raw binary data, expressed in hexadecimal format.

This is in the little endian format the buffer uses internally and what you
would see with a normal binary text viewer.

3. The type of the data.

This may be the type specified in the schema or some internally defined
types:


| Internal Type | Purpose |
|---------------|----------------------------------------------------|
| `VOffset16` | Virtual table offset, relative to the table offset |
| `UOffset32` | Unsigned offset, relative to the current offset |
| `SOffset32` | Signed offset, relative to the current offset |


4. The value of the data.

This is shown in big endian format that is generally written for humans to
consume (e.g. `0x0013`). As well as the "casted" value (e.g. `0x0013 `is
`19` in decimal) in parentheses.

5. Notes about the particular data.

This describes what the data is about, either some internal usage, or tied
to the schema.

### Binary Sections

The file is broken up into Binary Sections, which are comprised of contiguous
[binary regions](#binary-regions) that are logically grouped together. For
example, a binary section may be a single instance of a flatbuffer `Table` or
its `vtable`. The sections may be labelled with the name of the associated type,
as defined in the input schema.

An example of a `vtable` Binary Section that is associated with the user-defined
`AnnotateBinary.Bar` table.

```
vtable (AnnotatedBinary.Bar):
+0x00A0 | 08 00 | uint16_t | 0x0008 (8) | size of this vtable
+0x00A2 | 13 00 | uint16_t | 0x0013 (19) | size of referring table
+0x00A4 | 08 00 | VOffset16 | 0x0008 (8) | offset to field `a` (id: 0)
+0x00A6 | 04 00 | VOffset16 | 0x0004 (4) | offset to field `b` (id: 1)
```

These are purely annotative, there is no embedded information about these
regions in the flatbuffer itself.

### Binary Regions

Binary regions are contiguous bytes regions that are grouped together to form
some sort of value, e.g. a `scalar` or an array of scalars. A binary region may
be split up over multiple text lines, if the size of the region is large.

#### Annotation Example

Looking at an example binary region:

```
vtable (AnnotatedBinary.Bar):
+0x00A0 | 08 00 | uint16_t | 0x0008 (8) | size of this vtable
```

The first column (`+0x00A0`) is the offset to this region from the beginning of
the buffer.

The second column are the raw bytes (hexadecimal) that make up this region.
These are expressed in the little-endian format that flatbuffers uses for the
wire format.

The third column is the type to interpret the bytes as. For the above example,
the type is `uint16_t` which is a 16-bit unsigned integer type.

The fourth column shows the raw bytes as a compacted, big-endian value. The raw
bytes are duplicated in this fashion since it is more intuitive to read the data
in the big-endian format (e.g., `0x0008`). This value is followed by the decimal
representation of the value (e.g., `(8)`). For strings, the raw string value is
shown instead.

The fifth column is a textual comment on what the value is. As much metadata as
known is provided.

### Offsets

If the type in the 3rd column is of an absolute offset (`SOffet32` or
`Offset32`), the fourth column also shows an `Loc: +0x025A` value which shows
where in the binary this region is pointing to. These values are absolute from
the beginning of the file, their calculation from the raw value in the 4th
column depends on the context.

0 comments on commit 2d86857

Please sign in to comment.