Message Encoding

This section describes how messages are encoded (serialized) and decoded (deserialized). A bproto message is binary data with a well-defined structure.

Message Structure

Every bproto message looks like this:

[     1    ] [     N    ] [   2  ]
[Message ID] [Field Data] [CRC-16]
  • Message ID: The first byte of the message is the message ID. This is used to identify the message type for decoding. Every message is declared with a unique ID in the protocol definition. See Message
  • Field Data: The field data is the serialized data of the message fields. See Message Field Data Encoding
  • CRC-16: The last 2 bytes of the message are the CRC-16 checksum of the message data. This is used to verify the integrity of the message. See CRC-16 Check and CRC-16 Calculation
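The layout above can be sketched as a small splitting helper. This is only an illustration: `Frame` and `split_frame` are hypothetical names, not part of bproto, and the input bytes are arbitrary.

```python
from typing import NamedTuple


class Frame(NamedTuple):
    """Hypothetical container for the three parts of a bproto message."""
    message_id: int
    field_data: bytes
    crc_bytes: bytes


def split_frame(raw: bytes) -> Frame:
    """Split a raw message into [Message ID] [Field Data] [CRC-16]."""
    # 1 byte of message ID plus 2 bytes of CRC is the smallest possible message.
    if len(raw) < 3:
        raise ValueError("message too short: need at least 3 bytes")
    return Frame(message_id=raw[0], field_data=raw[1:-2], crc_bytes=raw[-2:])


# Arbitrary bytes, used only to show where the boundaries fall.
frame = split_frame(bytes.fromhex("05aabbccddee"))
print(frame.message_id, frame.field_data.hex(), frame.crc_bytes.hex())  # 5 aabbcc ddee
```

Because the field data has no length prefix, the receiver relies on the message ID to know how many bytes to expect between the ID and the CRC-16.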

Message Field Data Encoding

The order, length, and type of the fields are defined in the protocol definition. This means that the structure of a bproto message is fixed and can be decoded without any additional delimiters or markers.

The fields are concatenated in ascending order of their field ID. Field IDs are declared in the protocol definition and are independent of the declaration order (see Syntax of Fields).
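As a rough sketch of this ordering rule, the following snippet concatenates already-encoded fields by field ID rather than by declaration order. The list-of-tuples representation and the name `concat_fields` are assumptions made for illustration.

```python
# Hypothetical in-memory form of a message: (field_id, encoded_bytes) pairs,
# listed here in declaration order, which differs from field ID order.
declared_fields = [
    (2, b"\x01"),      # declared first, but has the highest ID
    (0, b"\x2a"),
    (1, b"\x30\x39"),
]


def concat_fields(fields):
    """Concatenate encoded fields in ascending field ID order."""
    return b"".join(data for _, data in sorted(fields, key=lambda f: f[0]))


field_data = concat_fields(declared_fields)
print(field_data.hex())  # 2a303901
```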

All numeric data types are serialized in big-endian byte order. Data types are serialized as follows:

  • For floating point numbers, the IEEE 754 standard is used.
  • Signed integers are serialized as two's complement.
  • Unsigned integers are serialized as plain binary values.
  • Char and string are serialized as ASCII characters.
  • Bool is serialized as 0 for false and 1 for true.
  • Enums are serialized as 8-bit unsigned integers.
  • Bitfields are serialized as one or more bytes, with each bit representing a boolean value. Bits are assigned from the least significant bit to the most significant bit, and from the first byte to the last byte.
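The rules in this list map closely onto Python's `struct` module. The snippet below is a minimal sketch: the chosen widths (16-bit integers, 32-bit float) are assumptions for illustration, and `pack_bitfield` is a hypothetical helper; the authoritative sizes are in the Datatypes section.

```python
import struct

# '>' selects big-endian byte order without padding.
print(struct.pack(">H", 12345).hex())  # unsigned 16-bit: 3039
print(struct.pack(">h", -2).hex())     # signed 16-bit, two's complement: fffe
print(struct.pack(">f", 1.0).hex())    # IEEE 754 single precision: 3f800000
print("A".encode("ascii").hex())       # char as ASCII: 41
print(struct.pack(">?", True).hex())   # bool: 01
print(struct.pack(">B", 3).hex())      # enum value 3 as 8-bit unsigned: 03


def pack_bitfield(flags):
    """Pack booleans into bytes, least significant bit first, first byte first."""
    out = bytearray((len(flags) + 7) // 8)
    for i, flag in enumerate(flags):
        if flag:
            out[i // 8] |= 1 << (i % 8)
    return bytes(out)


print(pack_bitfield([True, False, True]).hex())  # bits 0 and 2 set: 05
```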

Datatype sizes are fixed and are documented in the datatypes table in the Datatypes section.

For arrays, the elements are serialized in order, starting with the first element.
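For instance, an array of 16-bit unsigned integers could be serialized element by element as follows. The function name and the 2-byte element width are assumptions made for this sketch.

```python
import struct


def encode_uint16_array(values):
    """Serialize an array of uint16 values in order, each one big-endian."""
    return b"".join(struct.pack(">H", v) for v in values)


print(encode_uint16_array([1, 2, 3]).hex())  # 000100020003
```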

Example Message Encoding

An example of a message with 3 fields and its serialization:

message [1] ExampleMessage {
    [0] a : uint8,
    [1] b : uint16,
    [2] c : bool
}

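with the values a=42, b=12345, c=true, the message ID and field data would be serialized as: 012a303901 (12345 is 0x3039 in big-endian byte order), followed by the 2-byte CRC-16, whose value depends on the protocol definition (see CRC-16 Calculation).

[Message ID] [Field a] [Field b] [Field c] [CRC16]
[    01    ] [  2a   ] [ 3039  ] [  01   ] [31 06]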
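The message ID and field bytes of this example can be reproduced with Python's `struct` module, using the big-endian rule from Message Field Data Encoding. This is a sketch, not bproto's generated code; `LAYOUT` is a hypothetical name, and the CRC-16 is left out because it depends on the protocol definition hash.

```python
import struct

MESSAGE_ID = 1
# '>' big-endian, B = message ID, B = a (uint8), H = b (uint16), ? = c (bool)
LAYOUT = struct.Struct(">BBH?")

payload = LAYOUT.pack(MESSAGE_ID, 42, 12345, True)
print(payload.hex())  # 012a303901 (12345 == 0x3039)

# Decoding reverses the process; the CRC-16 bytes are handled separately.
msg_id, a, b, c = LAYOUT.unpack(payload)
print(msg_id, a, b, c)  # 1 42 12345 True
```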

CRC-16 Calculation

The CRC-16 has the purpose of verifying the integrity of the message and preventing incompatible messages/protocol versions from being processed.

Before the CRC-16 is calculated, a SHA-256 hash of a text form of the protocol definition is prepended to the message data. This hash is a fingerprint of the protocol definition, so a change to the definition changes the resulting CRC-16. See Compatibility Validation Process in Detail

The receiver calculates the CRC-16 in the same way over the received message data and compares it to the last 2 bytes of the message. If the values do not match, the message should be discarded.
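The procedure can be sketched as follows. Several details here are assumptions, since this section does not specify them: the CRC-16 variant (CRC-16/CCITT-FALSE via `binascii.crc_hqx` with initial value 0xFFFF), the big-endian byte order of the CRC in the message, that "message data" means message ID plus field data, and the placeholder `PROTOCOL_TEXT`. The names `crc16`, `seal`, and `verify` are hypothetical.

```python
import binascii
import hashlib

# Placeholder text form of a protocol definition (assumption); the real text
# form is described in "Compatibility Validation Process in Detail".
PROTOCOL_TEXT = "message [1] ExampleMessage { [0] a : uint8 }"
PROTOCOL_HASH = hashlib.sha256(PROTOCOL_TEXT.encode("ascii")).digest()


def crc16(message_data: bytes) -> int:
    """CRC-16 over the protocol hash followed by the message data.

    Assumption: CRC-16/CCITT-FALSE (polynomial 0x1021, initial value 0xFFFF).
    """
    return binascii.crc_hqx(PROTOCOL_HASH + message_data, 0xFFFF)


def seal(message_data: bytes) -> bytes:
    """Append the CRC-16 (assumed big-endian) to message ID + field data."""
    return message_data + crc16(message_data).to_bytes(2, "big")


def verify(raw: bytes) -> bool:
    """Return True if the last 2 bytes match the recomputed CRC-16."""
    if len(raw) < 3:
        return False
    return crc16(raw[:-2]).to_bytes(2, "big") == raw[-2:]


frame = seal(bytes.fromhex("012a303901"))
print(verify(frame))  # True

# Corrupting a single byte is always caught by a 16-bit CRC.
corrupted = bytes([frame[0] ^ 0xFF]) + frame[1:]
print(verify(corrupted))  # False
```

Because the hash is mixed into the checksum, a receiver built from a different protocol definition will, with high probability, compute a different CRC-16 and discard the message.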