Init Commit: Moved bproto to separate repo
66
docs/serialization.md
Normal file
@@ -0,0 +1,66 @@
# Message Encoding

This section describes how messages are encoded/serialized and decoded/deserialized.
A bproto message is binary data with a well-defined structure.

## Message Structure

Every bproto message has the following layout (sizes in bytes):

```
[    1     ] [    N     ] [   2  ]
[Message ID] [Field Data] [CRC-16]
```

- **Message ID:** The first byte of the message is the message ID. It identifies the message type for decoding. Every message is declared with a unique ID in the protocol definition. See [Message](/docs/syntax.md#message).
- **Field Data:** The serialized data of the message fields. See [Message Field Data Encoding](#message-field-data-encoding).
- **CRC-16:** The last 2 bytes of the message are the CRC-16 checksum of the message data. It is used to verify the integrity of the message. See [CRC-16 Check](/docs/compatibility.md#step-2---crc-16-check) and [CRC-16 Calculation](/docs/serialization.md#crc-16-calculation).
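
The layout above can be illustrated with a minimal Python sketch that splits a raw message into its three parts. The function name `split_message` is hypothetical and not part of bproto, and the sample bytes are only illustrative; the CRC-16 is not verified here.

```python
def split_message(raw: bytes) -> tuple[int, bytes, bytes]:
    """Split a raw bproto message into (message ID, field data, CRC-16 bytes)."""
    if len(raw) < 3:
        # 1 byte of message ID plus 2 bytes of CRC-16 is the minimum size.
        raise ValueError("message too short")
    message_id = raw[0]      # first byte
    field_data = raw[1:-2]   # everything between the ID and the CRC-16
    crc_bytes = raw[-2:]     # last 2 bytes
    return message_id, field_data, crc_bytes


message_id, field_data, crc_bytes = split_message(bytes.fromhex("012a3039013106"))
print(message_id, field_data.hex(), crc_bytes.hex())  # 1 2a303901 3106
```

A real decoder would look up the message type by its ID and check the CRC-16 before interpreting the field data.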

### Message Field Data Encoding

The order, length, and type of the fields are defined in the protocol definition.
This means that the structure of a bproto message is fixed, so it can be decoded without any additional delimiters or markers.

The fields are simply concatenated in the order of their field IDs.
Field IDs are declared in the protocol definition and are independent of declaration order (see [Syntax of Fields](/docs/syntax.md#fields)).
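
As a rough illustration of ordering by field ID rather than by declaration order, the following Python sketch joins already-encoded fields. The list-of-tuples representation and the name `concat_fields` are hypothetical, and ascending ID order is an assumption.

```python
# Hypothetical in-memory form of a message: (field ID, already-encoded bytes),
# listed in declaration order, which need not match the ID order.
declared_fields = [
    (2, b"\x01"),      # declared first, but has the highest ID
    (0, b"\x2a"),
    (1, b"\x30\x39"),
]


def concat_fields(fields):
    """Join encoded fields in ascending field-ID order (assumed ordering)."""
    return b"".join(data for _, data in sorted(fields, key=lambda f: f[0]))


print(concat_fields(declared_fields).hex())  # 2a303901
```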

All numeric data types are serialized in big-endian byte order.
Data types are serialized as follows:
- Floating point numbers use the IEEE 754 format.
- Signed integers are serialized as two's complement.
- Unsigned integers are serialized as plain binary values.
- Char and string are serialized as ASCII characters.
- Bool is serialized as 0 for false and 1 for true.
- Enums are serialized as 8-bit unsigned integers.
- Bitfields are serialized as one or more bytes, with each bit representing a boolean value. Bits are assigned from the least significant bit to the most significant bit within a byte, and from the first byte to the last byte.
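
The rules in this list can be tried out with Python's standard `struct` module. This is only a sketch: the widths chosen here (32-bit float, 16-bit integers) are assumptions for illustration, the helper `pack_bitfield` is hypothetical, and leaving unused bits of the last bitfield byte at 0 is also an assumption.

```python
import struct

# ">" selects big-endian byte order with standard sizes and no padding.
print(struct.pack(">f", 1.0).hex())     # IEEE 754 single precision: 3f800000
print(struct.pack(">h", -2).hex())      # 16-bit two's complement: fffe
print(struct.pack(">H", 12345).hex())   # 16-bit unsigned: 3039
print("A".encode("ascii").hex())        # ASCII char: 41
print(struct.pack(">?", True).hex())    # bool: 01
print(struct.pack(">B", 3).hex())       # enum value 3 as 8-bit unsigned: 03


def pack_bitfield(flags):
    """Pack booleans LSB-first into bytes, filling the first byte first."""
    out = bytearray((len(flags) + 7) // 8)
    for i, flag in enumerate(flags):
        if flag:
            out[i // 8] |= 1 << (i % 8)
    return bytes(out)


# Flags 0 and 2 land in the first byte, flag 8 is bit 0 of the second byte.
flags = [True, False, True, False, False, False, False, False, True]
print(pack_bitfield(flags).hex())  # 0501
```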

Datatype sizes are fixed and are documented in the datatypes table in the [Datatypes](/docs/syntax.md#datatypes) section.

For arrays, the elements are serialized in index order: the first element first, then the second, and so on.
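
A short Python sketch of array serialization, assuming an array of `uint16` elements. The function name is hypothetical, and no length prefix is written, on the assumption that the array length is fixed by the protocol definition.

```python
import struct


def encode_uint16_array(values):
    """Encode each element as big-endian uint16 and join them in index order."""
    return b"".join(struct.pack(">H", v) for v in values)


print(encode_uint16_array([1, 2, 258]).hex())  # 000100020102
```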

### Example Message Encoding

An example of a message with 3 fields and its serialization:
```
message [1] ExampleMessage {
    [0] a : uint8,
    [1] b : uint16,
    [2] c : bool
}
```
with the values a=42, b=12345, c=true would be serialized as `012a303901`, followed by the 2 CRC-16 bytes:
```
[Message ID] [Field a] [Field b] [Field c] [CRC16]
[    01    ] [   2a  ] [  3039 ] [   01  ] [31 06]
```
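
The message ID and field bytes of this example can be reproduced with Python's `struct` module. This is a sketch, not the bproto implementation: the function names are hypothetical and the CRC-16 is left out. Note that 12345 is 0x3039, so field b occupies the bytes `30 39` in big-endian order.

```python
import struct

MESSAGE_ID = 1  # ID declared for ExampleMessage

# B = uint8 (message ID), B = uint8 (a), H = uint16 (b), ? = bool (c)
LAYOUT = ">BBH?"


def encode_example_body(a: int, b: int, c: bool) -> bytes:
    """Return message ID plus field data for ExampleMessage, without the CRC-16."""
    return struct.pack(LAYOUT, MESSAGE_ID, a, b, c)


def decode_example_body(data: bytes):
    """Inverse of encode_example_body; returns (a, b, c)."""
    message_id, a, b, c = struct.unpack(LAYOUT, data)
    if message_id != MESSAGE_ID:
        raise ValueError("unexpected message ID")
    return a, b, c


body = encode_example_body(42, 12345, True)
print(body.hex())                 # 012a303901
print(decode_example_body(body))  # (42, 12345, True)
```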

### CRC-16 Calculation

The CRC-16 verifies the integrity of the message and prevents messages from incompatible protocol versions from being processed.

The CRC-16 is not calculated over the message data alone: a SHA-256 hash of a text form of the protocol definition is first prepended to the message data.
This hash is a fingerprint of the protocol definition and is used to detect changes in the protocol definition. See [Compatibility Validation Process in Detail](/docs/compatibility.md#compatibility-validation-process-in-detail).

The CRC-16 is then calculated over the hash followed by the message data and compared to the last 2 bytes of the message.
If the two values do not match, the message should be discarded.
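
The procedure can be sketched in Python as follows. Several details are assumptions, because this section does not specify them: the CRC-16 variant (here the CCITT polynomial 0x1021 with initial value 0xFFFF, via `binascii.crc_hqx`), the byte order of the appended checksum, the exact text form of the protocol definition that is hashed, and that "message data" means the message ID plus the field data. All function names are hypothetical.

```python
import binascii
import hashlib

# Hypothetical text form of the protocol definition; the real canonical
# text that bproto hashes is not shown in this section.
PROTOCOL_TEXT = "message [1] ExampleMessage { [0] a : uint8, [1] b : uint16, [2] c : bool }"


def message_crc(protocol_text: str, message_data: bytes) -> int:
    """CRC-16 over SHA-256(protocol text) followed by the message data."""
    fingerprint = hashlib.sha256(protocol_text.encode("utf-8")).digest()
    # binascii.crc_hqx uses the CCITT polynomial 0x1021; the polynomial and
    # the initial value 0xFFFF are assumptions, not taken from bproto.
    return binascii.crc_hqx(fingerprint + message_data, 0xFFFF)


def append_crc(protocol_text: str, message_data: bytes) -> bytes:
    """Return the message with its CRC-16 appended (big-endian, assumed)."""
    crc = message_crc(protocol_text, message_data)
    return message_data + crc.to_bytes(2, "big")


def check_message(protocol_text: str, raw: bytes) -> bool:
    """Recompute the CRC-16 and compare it to the last 2 bytes of the message."""
    data, received = raw[:-2], int.from_bytes(raw[-2:], "big")
    return message_crc(protocol_text, data) == received


frame = append_crc(PROTOCOL_TEXT, bytes.fromhex("012a303901"))
print(check_message(PROTOCOL_TEXT, frame))  # True

# A receiver built from a different protocol definition hashes a different
# fingerprint, so the check is expected to fail (a 16-bit collision is
# possible but unlikely).
print(check_message(PROTOCOL_TEXT + " changed", frame))
```

Because the fingerprint is part of the checksummed data, a sender and a receiver built from different protocol definitions will normally disagree on the CRC-16 even when the message bytes arrive intact.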