* `quick_start.md`: Add quick start guide * `annotation.md`: Add section on annotating flatbuffers
1 parent
0f90dc8
commit 2d86857
Showing 2 changed files with 154 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,152 @@ | ||
# Annotating FlatBuffers | ||
|
||
This provides a way to annotate flatbuffer binary data, byte-by-byte, with a | ||
schema. It is useful for development purposes and understanding the details of | ||
the internal format. | ||
|
||
## Annotating | ||
|
||
Given a `schema`, as either a plain-text (`.fbs`) or a binary schema (`.bfbs`), | ||
and `binary` file(s) that were created from that `schema`, you can annotate them | ||
using: | ||
|
||
```sh | ||
flatc --annotate SCHEMA -- BINARY_FILES... | ||
``` | ||
|
||
This will produce a set of annotated files (`.afb` Annotated FlatBuffer) | ||
corresponding to the input binary files. | ||
|
||
### Example | ||
|
||
Taken from [tests/annotated_binary](https://github.com/google/flatbuffers/tree/master/tests/annotated_binary). | ||
|
||
```sh | ||
cd tests/annotated_binary | ||
../../flatc --annotate annotated_binary.fbs -- annotated_binary.bin | ||
``` | ||
|
||
This produces an `annotated_binary.afb` file in the current directory. | ||
|
||
|
||
!!! Tip | ||
|
||
The `annotated_binary.bin` file is the flatbuffer binary of the data contained | ||
within `annotated_binary.json`, which was made by the following command: | ||
|
||
```sh | ||
../../flatc -b annotated_binary.fbs annotated_binary.json | ||
``` | ||
|
||
## .afb Text Format | ||
|
||
Currently there is a built-in text-based format for outputting the annotations. | ||
A full example is shown here: | ||
[`annotated_binary.afb`](https://github.com/google/flatbuffers/blob/master/tests/annotated_binary/annotated_binary.afb) | ||
|
||
The data is organized as a table with fixed [columns](#columns) grouped into | ||
Binary [sections](#binary-sections) and [regions](#binary-regions), starting | ||
from the beginning of the binary (offset `0`). | ||
|
||
### Columns | ||
|
||
The columns are as follows: | ||
|
||
1. The offset from the start of the binary, expressed in hexadecimal format | ||
(e.g. `+0x003c`). | ||
|
||
The prefix `+` is added to make searching for the offset (compared to some | ||
random value) a bit easier. | ||
|
||
2. The raw binary data, expressed in hexadecimal format. | ||
|
||
This is in the little-endian format the buffer uses internally, which is what | ||
you would see in an ordinary hex viewer. | ||
|
||
3. The type of the data. | ||
|
||
This may be the type specified in the schema or some internally defined | ||
types: | ||
|
||
|
||
| Internal Type | Purpose | | ||
|---------------|----------------------------------------------------| | ||
| `VOffset16` | Virtual table offset, relative to the table offset | | ||
| `UOffset32` | Unsigned offset, relative to the current offset | | ||
| `SOffset32` | Signed offset, relative to the current offset | | ||
|
||
|
||
4. The value of the data. | ||
|
||
This is shown in the big-endian form that people conventionally read and | ||
write (e.g. `0x0013`), followed by the converted value in parentheses (e.g. | ||
`0x0013` is `19` in decimal). | ||
|
||
5. Notes about the particular data. | ||
|
||
This describes what the data is about, either some internal usage, or tied | ||
to the schema. | ||
|
||
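To make the column layout concrete, here is a minimal sketch that splits one annotated line into the five columns described above. `AfbRow` and `parse_afb_row` are hypothetical names introduced for illustration. The sketch assumes every row uses `|` as the column separator, as in the examples in this document, and it does not handle section headers, regions that span multiple lines, or string values that themselves contain `|`.

```python
from typing import NamedTuple


class AfbRow(NamedTuple):
    offset: int      # column 1: offset from the start of the binary
    raw: bytes       # column 2: raw bytes as stored (little-endian)
    type_name: str   # column 3: schema type or internal type
    value: str       # column 4: human-readable value, kept as text
    notes: str       # column 5: free-form description


def parse_afb_row(line: str) -> AfbRow:
    """Split one annotated line into its five columns (illustrative only)."""
    # maxsplit=4 keeps any '|' inside the notes column intact.
    cols = [c.strip() for c in line.split("|", 4)]
    if len(cols) != 5:
        raise ValueError(f"expected 5 columns, got {len(cols)}: {line!r}")
    offset = int(cols[0].lstrip("+"), 16)   # "+0x00A2" -> 162
    raw = bytes.fromhex(cols[1])            # "13 00" -> b"\x13\x00"
    return AfbRow(offset, raw, cols[2], cols[3], cols[4])


row = parse_afb_row(
    "+0x00A2 | 13 00 | uint16_t | 0x0013 (19) | size of referring table"
)
print(row)
```

Keeping the value column as text is deliberate: its contents vary by type (numbers, strings, offsets with a location), so a sketch like this leaves interpretation to the caller.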
### Binary Sections | ||
|
||
The file is broken up into Binary Sections, which are composed of contiguous | ||
[binary regions](#binary-regions) that are logically grouped together. For | ||
example, a binary section may be a single instance of a flatbuffer `Table` or | ||
its `vtable`. The sections may be labelled with the name of the associated type, | ||
as defined in the input schema. | ||
|
||
The following is an example of a `vtable` Binary Section that is associated with | ||
the user-defined `AnnotatedBinary.Bar` table: | ||
|
||
``` | ||
vtable (AnnotatedBinary.Bar): | ||
+0x00A0 | 08 00 | uint16_t | 0x0008 (8) | size of this vtable | ||
+0x00A2 | 13 00 | uint16_t | 0x0013 (19) | size of referring table | ||
+0x00A4 | 08 00 | VOffset16 | 0x0008 (8) | offset to field `a` (id: 0) | ||
+0x00A6 | 04 00 | VOffset16 | 0x0004 (4) | offset to field `b` (id: 1) | ||
``` | ||
|
||
These are purely annotative; there is no embedded information about these | ||
regions in the flatbuffer itself. | ||
|
||
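As a cross-check on the section above, the same eight raw bytes can be decoded by hand. The sketch below assumes the layout visible in the example (two `uint16_t` sizes followed by one `VOffset16` per field, all little-endian); `decode_vtable` is a hypothetical helper, not part of `flatc`.

```python
import struct

# Raw bytes of the four regions in the vtable example, in file order.
vtable_bytes = bytes.fromhex("08 00 13 00 08 00 04 00")


def decode_vtable(data: bytes):
    """Decode a vtable laid out as in the example (illustrative only)."""
    # First two little-endian uint16 values: vtable size and table size.
    vtable_size, table_size = struct.unpack_from("<HH", data, 0)
    # Every remaining 2-byte slot is assumed to be one field offset.
    n_fields = (vtable_size - 4) // 2
    field_offsets = list(struct.unpack_from(f"<{n_fields}H", data, 4))
    return vtable_size, table_size, field_offsets


vtable_size, table_size, field_offsets = decode_vtable(vtable_bytes)
print(vtable_size, table_size, field_offsets)
```

The decoded numbers should line up with the value column of the annotated section: `8`, `19`, and the field offsets `8` and `4`.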
### Binary Regions | ||
|
||
Binary regions are contiguous byte ranges that are grouped together to form | ||
some sort of value, e.g. a `scalar` or an array of scalars. A binary region may | ||
be split up over multiple text lines, if the size of the region is large. | ||
|
||
#### Annotation Example | ||
|
||
Looking at an example binary region: | ||
|
||
``` | ||
vtable (AnnotatedBinary.Bar): | ||
+0x00A0 | 08 00 | uint16_t | 0x0008 (8) | size of this vtable | ||
``` | ||
|
||
The first column (`+0x00A0`) is the offset to this region from the beginning of | ||
the buffer. | ||
|
||
The second column contains the raw bytes (hexadecimal) that make up this region. | ||
These are expressed in the little-endian format that flatbuffers uses for the | ||
wire format. | ||
|
||
The third column is the type to interpret the bytes as. For the above example, | ||
the type is `uint16_t`, a 16-bit unsigned integer type. | ||
|
||
The fourth column shows the raw bytes as a compacted, big-endian value. The raw | ||
bytes are duplicated in this fashion since it is more intuitive to read the data | ||
in the big-endian format (e.g., `0x0008`). This value is followed by the decimal | ||
representation of the value (e.g., `(8)`). For strings, the raw string value is | ||
shown instead. | ||
|
||
The fifth column is a textual comment on what the value is. As much metadata as | ||
known is provided. | ||
|
||
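The relationship between the second and fourth columns can be reproduced in a few lines. This is a sketch for unsigned integer regions only; `format_value` is a hypothetical helper, and the uppercase hex digits are an assumption rather than something the format description specifies.

```python
def format_value(raw: bytes) -> str:
    """Render little-endian raw bytes the way the value column reads."""
    # Interpret the stored bytes as an unsigned little-endian integer.
    value = int.from_bytes(raw, byteorder="little", signed=False)
    # Two hex digits per byte, most significant digit first.
    width = 2 * len(raw)
    return f"0x{value:0{width}X} ({value})"


size_of_vtable = format_value(bytes.fromhex("08 00"))
size_of_table = format_value(bytes.fromhex("13 00"))
print(size_of_vtable)
print(size_of_table)
```

For the region shown above, the raw bytes `08 00` become `0x0008 (8)`: the byte order is reversed for display, and the decimal value follows in parentheses.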
### Offsets | ||
|
||
If the type in the third column is an offset type (`SOffset32` or | ||
`UOffset32`), the fourth column also shows a `Loc: +0x025A` value, which shows | ||
where in the binary this region points to. These values are absolute from | ||
the beginning of the file; their calculation from the raw value in the fourth | ||
column depends on the context. |
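
As a rough illustration of how the calculation depends on context, the sketch below covers two common cases in the FlatBuffers format: a `UOffset32` points forward from the position where the offset itself is stored, while the `SOffset32` at the start of a table is subtracted from the table's position to find its vtable. The helper names and the sample numbers are hypothetical and are not taken from `annotated_binary.afb`.

```python
def uoffset_target(region_offset: int, value: int) -> int:
    """Absolute location referenced by a UOffset32 stored at region_offset."""
    # Unsigned offsets point forward from where the offset is stored.
    return region_offset + value


def soffset_vtable_target(table_offset: int, value: int) -> int:
    """Absolute vtable location for a table whose first field is value."""
    # The signed offset at the start of a table is subtracted from the
    # table's own position to reach its vtable.
    return table_offset - value


def format_loc(loc: int) -> str:
    """Format an absolute location in the style of the offset column."""
    return f"+0x{loc:04X}"


# Hypothetical sample values, chosen only to show the arithmetic.
root_loc = format_loc(uoffset_target(0x0000, 0x0044))
vtable_loc = format_loc(soffset_vtable_target(0x00B0, 0x0010))
print(root_loc, vtable_loc)
```

Other contexts may apply different rules, so treat these two functions as examples of the idea rather than a complete account.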