
Psi Store Format

Nick Saw edited this page Mar 14, 2024 · 1 revision

Here we describe the byte-level format of \psi stores so that they may be potentially consumed outside of \psi and outside of .NET. The same format is used for remoting with an additional handshake protocol documented here.

Reference implementations are available for Python and for F# (though, being a .NET language, \psi itself may be used from F#).

Store Usage within \psi

As described in the Brief Introduction and expounded on in the section covering Datasets, a \psi store is an extremely efficient and easy way to persist \psi streams to disk. For example:

var store = PsiStore.Create(pipeline, "MyStore", "~/Some/Path");
myStream.Write("MyStream", store);
myOtherStream.Write("OtherStream", store);

Stores may be explored and visualized in PsiStudio or may be opened for playback within a pipeline. For example:

var store = PsiStore.Open(pipeline, "MyStore", "~/Some/Path");
var myStream = store.OpenStream<double>("MyStream");
var myOtherStream = store.OpenStream<MyType>("OtherStream");

Many streams may be written to a single store and streams may be of any .NET type, including user-provided types. When reading back streams, the types must be known and given as the type parameter (T) to the generic OpenStream<T> API. This also implies that the types are available to the consuming application (i.e., the correct assembly references have been made).

Having the types is not strictly necessary in order to recover the data. The store contains type-schema information defining the shape of the data, decomposing to a small list of primitives. To read a stream for which the .NET types are unknown or unavailable, the OpenDynamicStream(...) API may be used to open a stream of dynamic objects having members with names and leaf-node types matching the original type (e.g. MyType):

var myOtherStream = store.OpenDynamicStream("OtherStream");

This same type-schema metadata may also be used to drive parsing and reconstruction of message data from persisted \psi stores within ecosystems outside of \psi and outside of .NET (e.g. from Python), which we will demonstrate as we go.

Byte-level Store Format

A \psi store is a collection of files on disk, containing serialized data from all of the streams that have been written to it, as well as a catalog of metadata information and indexes to facilitate random access:

MyStore.Catalog_000000.psi
MyStore.Data_000000.psi
MyStore.Data_000001.psi
MyStore.Index_000000.psi
MyStore.LargeData_000000.psi
MyStore.LargeData_000001.psi
MyStore.LargeData_000002.psi
MyStore.Live

A \psi store comprises three main types of files:

  • Data / LargeData containing serialized message data.
  • Catalog containing metadata describing the available streams; statistics and type information.
  • Index containing index entries to facilitate random access to message data.

Each is broken into numbered extents and given names in the form <StoreName>.<Type>_<extent>.psi. Each type of file represents a contiguous set, but is broken into extents to limit individual file sizes. Data, for example, is broken into 256MB files, numbered *.Data_000000.psi, *.Data_000001.psi, ...

Additionally, *.Live is a marker file indicating that the store is actively being written to. For example, it may be quite useful to view a store in PsiStudio while the writing application is running.

Blocks and "Infinite" Files

Data within each file is broken into "blocks" of opaque packets of bytes. Blocks are written sequentially until each "extent" is filled and a new extent is created. These chains of extents are sometimes referred to as "infinite files" because, in a live application, they represent data from or about streams that may accumulate indefinitely. The only limit is disk space.

All the types of *.psi files have the same foundational structure. Starting with the first (*.*_000000.psi) file, opaque blocks of data within may be read. Each is a length-prefixed block of bytes:

Length Block ... Length Block ... ...
Int32 n-Bytes Int32 n-Bytes

The length is a 4-byte little-endian signed integer. Given a positive length, this is followed by n-bytes comprising the block. Rinse and repeat.

A length of zero signals the end of the chain of blocks. However, if the *.Live marker file is present, this could be temporary while waiting for more data to be written.

A negative length indicates the end of an extent. The absolute value of the length is then the extent number with which to continue. For example, a -1 length in extent *_000000 means to continue with extent *_000001.
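The block-chaining rules above can be sketched in Python. This is a minimal illustration, not the reference implementation: the function name `read_blocks` and the idea of passing extents as an in-memory dictionary (extent number mapped to that file's raw bytes) are assumptions made here to keep the example self-contained.

```python
import struct
from typing import Dict, Iterator


def read_blocks(extents: Dict[int, bytes], start_extent: int = 0) -> Iterator[bytes]:
    """Yield each block payload, following the chain of extents.

    `extents` is a hypothetical mapping of extent number to the raw bytes of
    that extent file (e.g. key 0 holds the contents of *_000000.psi).
    """
    extent, pos = start_extent, 0
    while extent in extents:
        data = extents[extent]
        if pos + 4 > len(data):
            return  # ran out of bytes: treat as end of chain
        (length,) = struct.unpack_from("<i", data, pos)  # 4-byte LE signed
        pos += 4
        if length == 0:
            return  # end of chain (possibly temporary while *.Live exists)
        if length < 0:
            extent, pos = -length, 0  # absolute value names the next extent
            continue
        yield data[pos:pos + length]
        pos += length
```

A caller would load each `*.psi` extent of one file type into the dictionary and iterate; a live reader would additionally need to retry after a zero length while the `*.Live` marker is present.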

Booleans are encoded as single-byte values where 1 signifies True. Floating point values are 32-/64-bit IEEE. Fixnum integer values are little-endian. DateTimes and strings are discussed further in the Parsing Primitives section below.

Next we'll talk about the meaning of the payload of the opaque blocks of bytes within each type of file.

Index Files

Tools such as PsiStudio may want to quickly seek to a particular time within the data store. To facilitate this, the Index stores information about where to look for messages corresponding to a particular time.

Each entry represents a byte-position within an extent, along with the largest creation- and originating-time seen up to that point.

Extent Position CreationTime OriginatingTime
Int32 Int32 DateTime DateTime

Entries are not written for every message recorded, but instead are triggered by a threshold of data (i.e., 4KB) having been written. This potentially sparse indexing is enough to seek to the vicinity of a desired time stamp and begin reading.

The Extent and Position fields are 4-byte little-endian signed integers. As we'll discuss below, message data may be stored in the Data extents or in the LargeData extents. A non-negative Extent number in an index entry indicates a (non-large) Data file. For example, an Extent of 1 means to look in *.Data_000001.psi. A negative Extent number is encoded such that adding 2^31 (2,147,483,648) to it yields the LargeData extent number. For example, an Extent of -2147483647 means to look in *.LargeData_000001.psi.

The Position is the byte-position within the extent that corresponds to the Length field of a block. For example, a Position of 12345 means to skip to byte offset 12345 and begin reading blocks (4-byte length, followed by n-bytes as usual).

The CreationTime and OriginatingTime fields represent the largest time value seen up to the Position in the given Extent. There may be messages between indexed points. They are encoded into 8-bytes, the lower 62-bits of which represent the 100-nanosecond ticks since 1/1/0001 12:00AM in UTC (see the Parsing Primitives/DateTime section below).
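As a sketch of the index entry layout just described, the following Python decodes one 24-byte entry, including the Extent convention for LargeData. The names `IndexEntry` and `parse_index_entry` are hypothetical, and times are left as raw tick counts.

```python
import struct
from typing import NamedTuple

TICKS_MASK = 0x3FFFFFFFFFFFFFFF  # lower 62 bits of an encoded DateTime


class IndexEntry(NamedTuple):
    extent: int             # decoded extent number (never negative)
    is_large: bool          # True: LargeData file set, False: Data file set
    position: int           # byte offset of a block's Length field
    creation_ticks: int     # 100 ns ticks since 1/1/0001
    originating_ticks: int  # 100 ns ticks since 1/1/0001


def parse_index_entry(block: bytes, offset: int = 0) -> IndexEntry:
    """Decode Extent, Position, CreationTime, OriginatingTime (in that order)."""
    extent, position, creation, originating = struct.unpack_from("<iiQQ", block, offset)
    is_large = extent < 0
    if is_large:
        extent += 2 ** 31  # e.g. -2147483647 becomes LargeData extent 1
    return IndexEntry(extent, is_large, position,
                      creation & TICKS_MASK, originating & TICKS_MASK)
```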

Catalog Files

Catalog files contain metadata information about the runtime, the streams contained in the store and schema information covering the message types. Metadata blocks begin with:

Name Id TypeName Version SerializerTypeName SerializerVersion CustomFlags Kind
String Int32 String Int32 String Int32 UInt16 UInt16

The meaning of the fields depends on the Kind of metadata, as described below. String fields are length-prefixed, UTF-8 bytes as described in the Parsing Primitives section below. The Int32 fields are 4-byte little-endian signed integers while the UInt16 fields are 2-byte unsigned little-endian.

The Kind field indicates the kind of metadata in the record, of which there are three:

  • StreamMetadata (Kind = 0) describes a stream contained in the store.
  • RuntimeInfo (Kind = 1) describes the \psi runtime and hosting application that persisted the store.
  • TypeSchema (Kind = 2) describes a datatype used in a message stream.
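The common metadata header can be read with a few lines of Python. This is an illustrative sketch only: `MetadataHeader`, `read_string`, and `parse_metadata_header` are names invented here, and the function returns the position after the header so that kind-specific fields can be parsed next.

```python
import struct
from typing import NamedTuple, Optional, Tuple

KINDS = {0: "StreamMetadata", 1: "RuntimeInfo", 2: "TypeSchema"}


class MetadataHeader(NamedTuple):
    name: Optional[str]
    id: int
    type_name: Optional[str]
    version: int
    serializer_type_name: Optional[str]
    serializer_version: int
    custom_flags: int
    kind: str


def read_string(buf: bytes, pos: int) -> Tuple[Optional[str], int]:
    """Length-prefixed UTF-8 string; a length of -1 encodes null."""
    (length,) = struct.unpack_from("<i", buf, pos)
    pos += 4
    if length < 0:
        return None, pos
    return buf[pos:pos + length].decode("utf-8"), pos + length


def parse_metadata_header(buf: bytes, pos: int = 0) -> Tuple[MetadataHeader, int]:
    name, pos = read_string(buf, pos)
    (ident,) = struct.unpack_from("<i", buf, pos)
    type_name, pos = read_string(buf, pos + 4)
    (version,) = struct.unpack_from("<i", buf, pos)
    serializer_type_name, pos = read_string(buf, pos + 4)
    serializer_version, custom_flags, kind = struct.unpack_from("<iHH", buf, pos)
    pos += 8  # Int32 + UInt16 + UInt16
    header = MetadataHeader(name, ident, type_name, version, serializer_type_name,
                            serializer_version, custom_flags, KINDS[kind])
    return header, pos
```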

RuntimeInfo

RuntimeInfo fields have the following contents. A single such record is written at the start of the catalog.

  • Name - the full name of the Microsoft.Psi assembly used to persist the store (e.g. Microsoft.Psi, Version=..., Culture=neutral, PublicKeyToken=...).
  • Id - Unused (0).
  • TypeName - Full name of the Microsoft.Psi assembly.
  • Version - The Microsoft.Psi assembly version encoded as 16-bit major followed by 16-bit minor.
  • SerializerTypeName - Unused (null).
  • SerializerVersion - The runtime version (e.g. 2, not the assembly version, but a small number incremented as breaking changes are made).
  • CustomFlags - Unused (0).

StreamMetadata

StreamMetadata main fields have the following contents. A record for each stream is written once on pipeline start and again with updated information when streams are closed.

  • Name - Stream name (given to the Write(<name>, store) API).
  • Id - An ID assigned by the \psi runtime.
  • TypeName - The full assembly-qualified .NET type name of the type of messages on the stream.
  • Version - The \psi metadata version (e.g. 2, incremented as breaking changes are made).
  • SerializerTypeName - Unused (null).
  • SerializerVersion - Unused (0)
  • CustomFlags - indicate whether the stream is persisted, closed, indexed, and/or polymorphic.

StreamMetadata contains additional fields describing the wall-clock time extents of the stream, the number of messages written to the stream, and statistics around the message size and latency:

OpenedTime ClosedTime MessageCount MessageSizeCumulativeSum LatencyCumulativeSum
DateTime DateTime Int64 Int64 Int64

The parsing of 8-byte DateTime values has been described briefly above and in detail in the Parsing Primitives section below. The Int64 values are 8-byte little-endian signed integers.

  • OpenedTime - the wall-clock time that the stream was opened.
  • ClosedTime - the wall-clock time that the stream was closed (updated upon close).
  • MessageCount - the number of messages on the stream (updated upon close).
  • MessageSizeCumulativeSum - the total bytes of all messages on the stream (updated upon close).
  • LatencyCumulativeSum - the total 100-nanosecond ticks of accumulated latency (difference between creation- and originating-times, updated upon close).

Some of these fields (and some below) may be updated in subsequent records for a given stream. For example, ClosedTime, MessageCount, etc. may contain dummy (e.g. 0) values until the stream is closed. Once a stream has been closed, the final record can be expected to be accurate. The CustomFlags field (described below) may be used to determine whether a record represents a closed stream.

Following these fields are additional fields indicating the first and last message creation- and originating-times:

FirstMessageCreationTime LastMessageCreationTime FirstMessageOriginatingTime LastMessageOriginatingTime
DateTime DateTime DateTime DateTime

The CustomFlags field contains bits indicating several non-mutually exclusive attributes about a stream:

  • 0x01 - Not persisted to the store.
  • 0x02 - The stream has closed.
  • 0x04 - The stream is indexed, meaning that index entries appear in Data while message payload appears in LargeData, as discussed below in the Data Files section.
  • 0x08 - The stream contains a polymorphic message type, as discussed next.
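Since these bits may be combined, a flags enumeration is a convenient way to test them. The sketch below uses Python's `enum.IntFlag`; the member names and the helper `describe_flags` are illustrative choices, not identifiers from \psi.

```python
from enum import IntFlag
from typing import List


class StreamFlags(IntFlag):
    NOT_PERSISTED = 0x01
    CLOSED = 0x02
    INDEXED = 0x04
    POLYMORPHIC = 0x08


def describe_flags(custom_flags: int) -> List[str]:
    """Return the names of all bits set in a StreamMetadata CustomFlags value."""
    return [flag.name for flag in StreamFlags if custom_flags & flag]
```

For example, a CustomFlags value of 0x06 describes a stream that is both closed and indexed.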

Polymorphic streams may be of various concrete types. In this case, the TypeName indicates the polymorphic type name but, for the purpose of parsing such streams, the possible concrete types must be known. Thus, StreamMetadata marked with CustomFlags indicating a polymorphic message type (mask 0x08) are followed by a list of the concrete types (and updated StreamMetadata records are emitted as new concrete types are encountered on the stream):

TypeCount TypeId TypeName TypeId TypeName ...
Int32 Int32 String Int32 String ...

The TypeCount indicates the number of concrete types (TypeId/TypeName pairs) to follow. Each TypeName is a full assembly-qualified .NET type name while the TypeId is an index value assigned by the \psi runtime.

Following this are several fields describing supplemental stream metadata, which is a value of a user-specified type. This is represented as opaque serialized bytes along with the type name. Deserialization is identical to message payload deserialization as described below in the Data Files section.

SupplementalTypeName SupplementalLength SupplementalBytes
String Int32 n-Bytes

The SupplementalTypeName is a full assembly-qualified .NET type name. The SupplementalLength indicates the number of bytes comprising the SupplementalBytes.
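Putting the StreamMetadata fields together, the portion of a record that follows the common header might be parsed as sketched below. This assumes the version 2 layout in the order presented above (statistics, first/last message times, optional concrete type list, supplemental metadata); the function and dictionary key names are hypothetical.

```python
import struct
from typing import Dict, Optional, Tuple

TICKS_MASK = 0x3FFFFFFFFFFFFFFF
POLYMORPHIC = 0x08


def read_string(buf: bytes, pos: int) -> Tuple[Optional[str], int]:
    (n,) = struct.unpack_from("<i", buf, pos)
    pos += 4
    if n < 0:
        return None, pos  # -1 encodes null
    return buf[pos:pos + n].decode("utf-8"), pos + n


def parse_stream_tail(buf: bytes, pos: int, custom_flags: int) -> Tuple[dict, int]:
    """Parse the StreamMetadata fields after the common header (version 2 assumed)."""
    opened, closed, count, size_sum, latency_sum = struct.unpack_from("<QQqqq", buf, pos)
    pos += 40
    first_c, last_c, first_o, last_o = struct.unpack_from("<4Q", buf, pos)
    pos += 32
    concrete_types: Dict[int, Optional[str]] = {}
    if custom_flags & POLYMORPHIC:  # list is present only for polymorphic streams
        (type_count,) = struct.unpack_from("<i", buf, pos)
        pos += 4
        for _ in range(type_count):
            (type_id,) = struct.unpack_from("<i", buf, pos)
            type_name, pos = read_string(buf, pos + 4)
            concrete_types[type_id] = type_name
    supplemental_type, pos = read_string(buf, pos)
    (supplemental_len,) = struct.unpack_from("<i", buf, pos)
    pos += 4
    supplemental = buf[pos:pos + max(supplemental_len, 0)]
    pos += len(supplemental)
    info = {
        "opened_ticks": opened & TICKS_MASK, "closed_ticks": closed & TICKS_MASK,
        "message_count": count, "size_sum": size_sum, "latency_sum": latency_sum,
        "first_creation": first_c & TICKS_MASK, "last_creation": last_c & TICKS_MASK,
        "first_originating": first_o & TICKS_MASK, "last_originating": last_o & TICKS_MASK,
        "concrete_types": concrete_types,
        "supplemental_type": supplemental_type, "supplemental": supplemental,
    }
    return info, pos
```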

It should be noted that the above format applies to the current version (version 2, 23 APR 2021) of the StreamMetadata format. The Version field indicates the version actually being read. In prior versions, the MessageCount field was an Int32 rather than an Int64 and the MessageSizeCumulativeSum and LatencyCumulativeSum were not present. Instead, following the LastMessageOriginatingTime field, an AverageMessageSize (Int32) and AverageMessageLatency (Int32, in microseconds) field gave similar statistics. Finally, the idea of supplemental stream metadata, and so the Supplemental* fields, were added in version 1 (16 JUN 2020).

TypeSchema

In order to parse message data (and supplemental stream metadata), the schema of the data must be known. The TypeSchema catalog records serve this purpose. The main fields have the following contents:

  • Name - The contract name within a composite type.
  • Id - A unique ID assigned by the \psi runtime.
  • TypeName - The full assembly-qualified .NET type name.
  • Version - The version of the serializer that generated the schema (used for versioning of custom serializers, as described in the Data File section).
  • SerializerTypeName - The full assembly-qualified .NET type name of the serializer that generated the schema.
  • SerializerVersion - The version of the serializer that generated the schema (used for versioning of custom serializers, as described in the Data File section)
  • CustomFlags - Unused (0).

Following this are fields describing members of the type:

MemberCount Name Type IsRequired Name Type IsRequired ... TypeFlags
Int32 String String Boolean String String Boolean ... UInt32

The MemberCount indicates the number of members (Name/Type/IsRequired triples) to follow. Each Name is the simple member name, while the Type is a full assembly-qualified .NET type name. The IsRequired flag is a single-byte boolean (0 = false, 1 = true) indicating whether the member is required (to support optional members for backwards compatibility).

The final TypeFlags field is a 4-byte little-endian unsigned integer indicating the category of the type:

  • 0x01 - The type is a class (reference type).
  • 0x02 - The type is a struct (value type).
  • 0x04 - The type is a contract (interface type).
  • 0x08 - The type is a collection (enumerable type).
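The member list and trailing TypeFlags can be read as sketched below. This assumes a version 2 (or later) TypeSchema record, in which TypeFlags is present; `parse_schema_members` and `read_string` are names chosen for this example.

```python
import struct
from typing import List, Optional, Tuple

IS_CLASS, IS_STRUCT, IS_CONTRACT, IS_COLLECTION = 0x01, 0x02, 0x04, 0x08


def read_string(buf: bytes, pos: int) -> Tuple[Optional[str], int]:
    (n,) = struct.unpack_from("<i", buf, pos)
    pos += 4
    if n < 0:
        return None, pos  # -1 encodes null
    return buf[pos:pos + n].decode("utf-8"), pos + n


def parse_schema_members(buf: bytes, pos: int) -> Tuple[List[tuple], int, int]:
    """Return ([(name, type_name, is_required), ...], type_flags, next_pos)."""
    (member_count,) = struct.unpack_from("<i", buf, pos)
    pos += 4
    members = []
    for _ in range(member_count):
        name, pos = read_string(buf, pos)
        type_name, pos = read_string(buf, pos)
        is_required = buf[pos] != 0  # single-byte boolean
        pos += 1
        members.append((name, type_name, is_required))
    (type_flags,) = struct.unpack_from("<I", buf, pos)
    return members, type_flags, pos + 4
```

The order of the returned members matters: message payloads are later parsed by walking the members in exactly this order.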

It should be noted that the TypeFlags field was added in version 2 (30 NOV 2018) of the TypeSchema format. Prior to this, a determination would need to be made from the Type name alone, which implies reflection of the underlying .NET type, or a mapping of such information. For this reason, a non-.NET \psi store reader that is backward compatible with older stores may be difficult to robustly implement for unexpected types.

Some schema names do not represent the assembly-qualified .NET type name. Instead, the name has been taken from a data contract attribute. However, the type names within schemas (and stream type names) are always assembly-qualified type names. To overcome this, we must provide a mapping of known data contract names to their equivalent type names.

Data Files

The data files represent streams of contiguous messages. The Data and LargeData files contain nearly identically formatted message data, interleaved among all of the streams written to the store. Each block begins with a message envelope:

SourceId SequenceId OriginatingTime CreationTime
Int32 Int32 DateTime DateTime

The SourceId is a stream ID assigned by the \psi runtime. It corresponds to the Id field of a StreamMetadata entry from the Catalog. The SequenceId is a monotonically increasing message number within the stream. The OriginatingTime is the time at which the message originated in the world (at the source sensor, etc.) while the CreationTime is the pipeline time at which the message was last propagated.

Warning: The OriginatingTime and CreationTime fields are (for no particular reason) ordered differently to the same fields within index entries.
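A sketch of envelope parsing in Python follows. Note the field order (OriginatingTime before CreationTime), which is the reverse of the index entry layout. `Envelope` and `parse_envelope` are hypothetical names, and times are returned as raw tick counts.

```python
import struct
from typing import NamedTuple, Tuple

TICKS_MASK = 0x3FFFFFFFFFFFFFFF
ENVELOPE = struct.Struct("<iiQQ")  # SourceId, SequenceId, OriginatingTime, CreationTime


class Envelope(NamedTuple):
    source_id: int
    sequence_id: int
    originating_ticks: int
    creation_ticks: int


def parse_envelope(block: bytes) -> Tuple[Envelope, bytes]:
    """Split a data block into its 24-byte envelope and the remaining payload."""
    source_id, sequence_id, originating, creation = ENVELOPE.unpack_from(block, 0)
    envelope = Envelope(source_id, sequence_id,
                        originating & TICKS_MASK, creation & TICKS_MASK)
    return envelope, block[ENVELOPE.size:]
```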

Remember that stream metadata may contain CustomFlags marking a stream as being "indexed" (mask 0x04). Such streams are treated specially by writing the message blocks to the LargeData file and writing an "index entry" pointing to it into the Data file. This means that in order to know how to interpret a message block within the Data file, it must be known whether the stream is indexed. For this, we must refer to StreamMetadata entries previously read from the Catalog. The SourceId of the message will correspond to the Id field of a StreamMetadata, the CustomFlags of which will tell us whether the stream is indexed.

For an indexed stream in the Data file, the envelope will be followed by an "index entry" identical to those found in the Index files:

Extent Position CreationTime OriginatingTime
Int32 Int32 DateTime DateTime

The Extent and Position will point to the LargeData file set for the actual message data. The LargeData file never contains index entries and may be read independently as a mere sequence of interleaved "large" messages or may be indexed into. It should be noted that the interleaving of messages roughly as they occurred in the pipeline is preserved in the "small" Data file set, while the LargeData files are only interleaved with other indexed "large" streams.

For streams that are not indexed (or for streams in the LargeData files), further parsing is driven by type-schema information previously gathered from the Catalog. Beginning with the TypeSchema corresponding to the SourceId, we parse the message bytes. A reference implementation can be found in the DynamicMessageDeserializer.

Primitives

The base case is when the TypeName is a known primitive and can be parsed directly. Names are assembly-qualified, but it is safe to consider up to the first ',' character (e.g. pseudocode typeName.split(',')[0]). The following are the simple primitive types:

  • "System.Single" (32-bit IEEE floating point)
  • "System.Double" (64-bit IEEE floating point)
  • "System.SByte" (signed byte)
  • "System.Byte" (unsigned byte)
  • "System.Int16" (signed 2-byte little-endian integer)
  • "System.UInt16" (unsigned 2-byte little-endian integer)
  • "System.Int32" (signed 4-byte little-endian integer)
  • "System.UInt32" (unsigned 4-byte little-endian integer)
  • "System.Int64" (signed 8-byte little-endian integer)
  • "System.UInt64" (unsigned 8-byte little-endian integer)
  • "System.Boolean" (1-byte, 0 = false, 1 = true)
  • "System.Char" (2-byte UTF-16 character)
  • "System.DateTime" (8-byte DateTime)
  • "System.String" (length-prefixed UTF-8 bytes)

Parsing of some of these types is described in more detail below in the Parsing Primitives section.
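Most of the simple primitives map directly onto `struct` format codes in Python. The table and helper below are a sketch; `PRIMITIVE_FORMATS` and `read_primitive` are names invented for this example, and DateTime and String are deliberately left out because they need the extra handling described in the Parsing Primitives section.

```python
import struct
from typing import Any, Tuple

# .NET type name -> struct format ("<" means little-endian, no padding)
PRIMITIVE_FORMATS = {
    "System.Single": "<f", "System.Double": "<d",
    "System.SByte": "<b", "System.Byte": "<B",
    "System.Int16": "<h", "System.UInt16": "<H",
    "System.Int32": "<i", "System.UInt32": "<I",
    "System.Int64": "<q", "System.UInt64": "<Q",
    "System.Boolean": "<?",
}


def read_primitive(type_name: str, buf: bytes, pos: int) -> Tuple[Any, int]:
    """Read one primitive; returns (value, next_pos)."""
    key = type_name.split(",")[0].strip()  # drop the assembly qualification
    if key == "System.Char":
        return buf[pos:pos + 2].decode("utf-16-le"), pos + 2
    fmt = PRIMITIVE_FORMATS[key]  # KeyError for anything not listed above
    (value,) = struct.unpack_from(fmt, buf, pos)
    return value, pos + struct.calcsize(fmt)
```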

Reference Strings

Strings (System.String) are somewhat special. As a reference type, strings may be preceded by a prefix allowing for reusable instances. This prefix is present only when the string is not a member of a collection (see below).

Prefix
UInt32

The low 30 bits may contain an ID, while the high 2 bits indicate the kind of prefix:

High Bits    Low Bits (mask 0x3FFFFFFF)    Meaning
b00 (0x0)    ...                           Null
b01 (0x4)    Instance Ordinal              Existing Instance
b10 (0x8)    ...                           New Instance
b11 (0xC)    TypeSchema ID                 Typed

A Prefix mask of 0x0 means that the string is null (string parsing already supports null by way of a -1 length, but this is an additional mechanism used for reference types in general). In this case, the Prefix is not followed by a string.

A Prefix mask of 0x80000000 means that this is a new string and is followed by a string to parse:

Prefix    Length    UTF-8 Encoded Bytes
0x8       Int32     n-Bytes

This new instance should be tucked away in a per-stream "cache" keyed by ordinal. This is because a Prefix mask of 0x4 means that a string is an existing instance from the cache (to intern strings for efficiency). A value mask of 0x3FFFFFFF yields the zero-based index into the cache. For example, a Prefix mask of 0x40000007 means to return the 8th cached instance.
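The prefix handling for strings can be sketched as follows. The cache is modeled as a plain Python list owned by the caller, so that an ordinal is simply a list index; `read_ref_string` is a hypothetical name, and rejecting the Typed prefix for strings is an assumption of this sketch.

```python
import struct
from typing import List, Optional, Tuple


def read_ref_string(buf: bytes, pos: int,
                    cache: List[Optional[str]]) -> Tuple[Optional[str], int]:
    """Read a Prefix followed (for new instances) by a length-prefixed string."""
    (prefix,) = struct.unpack_from("<I", buf, pos)
    pos += 4
    kind, low = prefix >> 30, prefix & 0x3FFFFFFF
    if kind == 0b00:   # Null: nothing follows
        return None, pos
    if kind == 0b01:   # Existing: low bits are the zero-based cache index
        return cache[low], pos
    if kind == 0b10:   # New: parse the string and remember it
        (n,) = struct.unpack_from("<i", buf, pos)
        pos += 4
        value = None if n < 0 else buf[pos:pos + n].decode("utf-8")
        pos += max(n, 0)
        cache.append(value)
        return value, pos
    raise ValueError("Typed prefix (0xC) is not expected for strings in this sketch")
```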

Composite Types

All other types are compositions of the primitive types. Based on the SourceId, type-schema information previously read from the Catalog should be retrieved. Remember that the TypeFlags field of the TypeSchema will determine whether we are reading a struct, a class, a contract, or a collection.

All composite types except structs (i.e., classes, contracts, and collections) are treated as reference types and have a Prefix field similar to strings.

Prefix
UInt32

Just as with strings, a Prefix mask of 0x0 means that the value is null and the Prefix is not followed by a value to parse.

Again, the New and Existing flags avoid the inefficiency of encoding the same value many times and (more importantly) make it possible to encode cyclic structures. It is up to the deserializer to maintain a cache on a per-message basis of ref instances deserialized thus far. If a given value is contained more than once, the subsequent instances will be merely encoded as Existing, with the lower 30-bits giving an ID (cached ordinal) used to recover the shared instance.

Unlike strings, other reference types may be polymorphic or concrete implementations of a contract (interface). For example, a stream may be of type T while messages on the stream are of implementing types or subtypes U or V. A Prefix mask of 0xC indicates a concrete subtype. A value mask of 0x3FFFFFFF yields the Id of the TypeSchema with which to drive parsing.

A Prefix mask of 0x8 again means that this is a new instance and is followed by a value to parse (the precise type of which depends on the type-schema -- described later):

Prefix    Schema-driven Value
0x8       ...

Once parsed, this new instance too should be tucked away in the per-stream "cache" keyed by ordinal. This is both for efficiency and to allow serialization of object graphs containing cycles.

A Prefix mask of 0x4 again means that an existing instance from the cache should be returned and the Prefix is not followed by a value to parse. A value mask of 0x3FFFFFFF yields the zero-based index into the cache. For example, a Prefix mask of 0x40000007 means to return the 8th cached instance.

Members

Composite types other than collections (i.e., structs, classes, and contracts) comprise a set of named members.

Reference types (i.e. classes and contracts, but not structs) should be added to the per-stream instance cache. This should be done (with dummy instances) before parsing members in order to support circular references.

To populate an instance, walk the TypeSchema member Name/Type information in the order that it was originally read from the Catalog and parse and assign each member value (again, consider this document to be recursive!).

Collections

Collections (indicated by the TypeFlags field of the TypeSchema) are length-prefixed sets of elements. These are composite types as well and so are preceded by a Prefix, with the semantics explained in the previous sections.

Length n-Elements ...
UInt32 ...

Collections are homogeneous and the type can be found on the "Elements" member of the TypeSchema, which happens to always be the first member (e.g. pseudocode schema.Member[0].Type).

Parsing each element, driven by the TypeSchema, produces the collection (consider this document to be recursive!).

It should be noted that strings are treated specially. Normally, string values are preceded by a Prefix field as discussed above. But when strings are members of collections they are parsed as bare values (as if they followed a Prefix indicating a new instance). They are still added to the per-stream instance cache of reference type instances.
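The recursive rules for prefixes, members, and collections can be drawn together in a compact sketch. Everything here is illustrative and rests on assumptions: schemas are plain dictionaries with "flags" and "members" keys, looked up by short type name or by Id; composite values are returned as Python dictionaries and lists; collections are added to the instance cache like other reference types; and only three primitives are wired in. It is not the DynamicMessageDeserializer.

```python
import struct

IS_STRUCT, IS_COLLECTION = 0x02, 0x08
PRIMITIVES = {"System.Int32": "<i", "System.Double": "<d", "System.Boolean": "<?"}


class Reader:
    def __init__(self, buf, schemas_by_name, schemas_by_id):
        self.buf, self.pos = buf, 0
        self.by_name, self.by_id = schemas_by_name, schemas_by_id
        self.cache = []  # reference instances, indexed by ordinal

    def unpack(self, fmt):
        (value,) = struct.unpack_from(fmt, self.buf, self.pos)
        self.pos += struct.calcsize(fmt)
        return value

    def bare_string(self):
        n = self.unpack("<i")
        text = None if n < 0 else self.buf[self.pos:self.pos + n].decode("utf-8")
        self.pos += max(n, 0)
        self.cache.append(text)  # strings are cached even when read bare
        return text

    def read(self, type_name, in_collection=False):
        key = type_name.split(",")[0].strip()
        if key in PRIMITIVES:
            return self.unpack(PRIMITIVES[key])
        is_string = key == "System.String"
        if is_string and in_collection:
            return self.bare_string()  # no Prefix inside collections
        schema = None if is_string else self.by_name[key]
        if is_string or not schema["flags"] & IS_STRUCT:
            prefix = self.unpack("<I")
            kind, low = prefix >> 30, prefix & 0x3FFFFFFF
            if kind == 0b00:
                return None              # Null
            if kind == 0b01:
                return self.cache[low]   # Existing instance
            if kind == 0b11:
                schema = self.by_id[low]  # Typed: concrete schema by Id
            if is_string:
                return self.bare_string()
        return self.read_body(schema)

    def read_body(self, schema):
        if schema["flags"] & IS_COLLECTION:
            items = []
            self.cache.append(items)
            element_type = schema["members"][0][1]  # the "Elements" member
            for _ in range(self.unpack("<I")):
                items.append(self.read(element_type, in_collection=True))
            return items
        obj = {}
        if not schema["flags"] & IS_STRUCT:
            self.cache.append(obj)  # register before members to allow cycles
        for name, member_type in schema["members"]:
            obj[name] = self.read(member_type)
        return obj
```

Registering the (still empty) instance before its members are parsed is what lets an Existing prefix inside the members refer back to the enclosing object.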

Also, a set of deserializers must provide a mapping from type names to a corresponding deserialization function. It is pre-populated with the .NET primitives along with several custom serializers (which cannot be derived purely from type-schema information). The single exception is the System.String deserializer, because it is a reference type with interactions with the internal instance cache. These deserializers are explained in detail in the Custom Deserializers section below.

Parsing Primitives

DateTime

The DateTime type is encoded in 8-bytes, representing .NET DateTime values which may be parsed as follows:

Kind Ticks
2-bits 62-bits

The Kind flags are described here, but \psi always uses UTC (high bit set). The lower 62-bits (mask 0x3FFFFFFFFFFFFFFF) represent 100-nanosecond ticks since 1/1/0001 12:00AM.
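A sketch of DateTime decoding in Python follows. It masks off the two Kind bits and treats the result as UTC, as the text above states \psi always uses. Python's `datetime` has microsecond resolution, so the last decimal digit of the 100-nanosecond tick count is dropped; `parse_datetime` is a name chosen for this example.

```python
import struct
from datetime import datetime, timedelta, timezone

TICKS_MASK = 0x3FFFFFFFFFFFFFFF
EPOCH_0001 = datetime(1, 1, 1, tzinfo=timezone.utc)


def parse_datetime(buf: bytes, pos: int = 0) -> datetime:
    """Decode an 8-byte DateTime into a timezone-aware (UTC) datetime."""
    (raw,) = struct.unpack_from("<Q", buf, pos)
    ticks = raw & TICKS_MASK  # 100 ns ticks since 1/1/0001 12:00AM
    return EPOCH_0001 + timedelta(microseconds=ticks // 10)
```

A reader that needs full precision could keep the raw tick count alongside the converted value.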

String

Strings are length-prefixed sequences of UTF-8 encoded bytes.

Length    UTF-8 Encoded Bytes
Int32     n-Bytes

When the Length field is -1 then the value is null, whereas when the Length is 0 the value is an empty string (""). For positive Length values, n-bytes follow and should be interpreted as a UTF-8 encoded character string.
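The distinction between null and the empty string is easy to get wrong, so here is a minimal Python sketch; `read_string` is a hypothetical helper that returns the value together with the position of the next field.

```python
import struct
from typing import Optional, Tuple


def read_string(buf: bytes, pos: int = 0) -> Tuple[Optional[str], int]:
    """Read a length-prefixed UTF-8 string; returns (value, next_pos)."""
    (length,) = struct.unpack_from("<i", buf, pos)
    pos += 4
    if length < 0:
        return None, pos  # -1 encodes null; no bytes follow
    # A length of 0 naturally decodes to the empty string.
    return buf[pos:pos + length].decode("utf-8"), pos + length
```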

Boolean

Booleans are single-byte values where 1 indicates True.

Store Reader

Reading a stream within a store is accomplished by reading data and filtering to the stream in which we're interested -- that is, filtering to envelopes with a SourceId matching the desired stream ID. Remember that stream data is interleaved in the store. A more convenient API may be to read a stream by name rather than by ID. For this, we need to find the stream metadata for the given name in the catalog and extract the ID.

However, this would be an inefficient way to read many streams from a store. For a single-pass implementation that reads multiple streams at once, filter to multiple streams and dispatch messages on a per-stream basis.
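A single-pass dispatch over interleaved blocks might look like the sketch below. It assumes the blocks have already been extracted from the Data extents and that a name-to-ID mapping was built from the Catalog; `dispatch_by_stream` and its parameters are names invented for this example, and payloads are left as undecoded bytes.

```python
import struct
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

ENVELOPE_SIZE = 24  # SourceId + SequenceId + two 8-byte DateTimes


def dispatch_by_stream(blocks: Iterable[bytes], ids_by_name: Dict[str, int],
                       wanted: Iterable[str]) -> Dict[str, List[Tuple[int, bytes]]]:
    """Group (sequence_id, payload) pairs per requested stream name in one pass."""
    names_by_id = {ids_by_name[name]: name for name in wanted}
    result: Dict[str, List[Tuple[int, bytes]]] = defaultdict(list)
    for block in blocks:
        source_id, sequence_id = struct.unpack_from("<ii", block, 0)
        name = names_by_id.get(source_id)
        if name is not None:  # skip streams nobody asked for
            result[name].append((sequence_id, block[ENVELOPE_SIZE:]))
    return dict(result)
```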

Random Access

If the goal is to randomly access the data, exploring by timestamps (e.g., as PsiStudio does), then the Index is used to quickly seek within the data files. Likely the whole index can be maintained in memory to direct loading, seeking within and reading of individual Data/LargeData extents on demand. Individual entries may be converted to an extent ID, position and indication of whether data can be found in the regular or large extents.
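With the index held in memory, seeking reduces to a binary search. The sketch below assumes entries are tuples of (time in ticks, extent, position) in the order they were read, which keeps the times non-decreasing because each holds the largest time seen so far. It returns the last entry whose time is strictly before the target, so no message at or after the target is skipped; `find_seek_point` is a hypothetical name.

```python
import bisect
from typing import List, Optional, Tuple

Entry = Tuple[int, int, int]  # (largest time seen in ticks, extent, position)


def find_seek_point(entries: List[Entry], target_ticks: int) -> Optional[Entry]:
    """Return where to start reading for `target_ticks`, or None for the start."""
    times = [entry[0] for entry in entries]
    i = bisect.bisect_left(times, target_ticks)  # first entry with time >= target
    return entries[i - 1] if i > 0 else None
```

From the returned position, blocks are read forward and discarded until the first message at or after the desired time.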

Custom Deserializers

Some types have hand-crafted serializers in \psi. Some of these produce type schema information that fully describes their format and so need no special treatment. Others need matching hand-crafted deserializers. The data reader handles reading the prefix, caching instances, etc. Custom deserializers need only to handle actual instance bytes.

MemoryStream

A MemoryStream is serialized as a simple length-prefixed array of bytes:

Length Buffer ...
Int32 n-Bytes
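A matching custom deserializer is short. In this Python sketch the result is exposed as an `io.BytesIO`, which is only one plausible stand-in for a .NET MemoryStream; `read_memory_stream` is a hypothetical name, and the prefix and caching are assumed to have been handled by the caller.

```python
import io
import struct
from typing import Tuple


def read_memory_stream(buf: bytes, pos: int = 0) -> Tuple[io.BytesIO, int]:
    """Read a length-prefixed byte buffer; returns (stream, next_pos)."""
    (length,) = struct.unpack_from("<i", buf, pos)
    pos += 4
    return io.BytesIO(buf[pos:pos + length]), pos + length
```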