Init Commit: Moved bproto to separate repo

2025-04-14 14:43:03 +02:00
commit 45bfc724fc
125 changed files with 10822 additions and 0 deletions
--- a/docs/serialization.md
+++ b/docs/serialization.md
@@ -0,0 +1,66 @@
+# Message Encoding
+
+This section describes how messages are encoded (serialized) and decoded (deserialized).
+A bproto message is binary data with a well-defined structure.
+
+## Message Structure
+
+Every bproto message looks like this:
+
+```
+[     1    ] [     N    ] [   2  ]
+[Message ID] [Field Data] [CRC-16]
+```
+
+- **Message ID:** The first byte of the message is the message ID, which identifies the message type for decoding. Every message is declared with a unique ID in the protocol definition. See [Message](/docs/syntax.md#message).
+- **Field Data:** The serialized data of the message fields. See [Message Field Data Encoding](#message-field-data-encoding).
+- **CRC-16:** The last 2 bytes of the message are the CRC-16 checksum of the message data, which is used to verify the integrity of the message. See [CRC-16 Check](/docs/compatibility.md#step-2---crc-16-check) and [CRC-16 Calculation](/docs/serialization.md#crc-16-calculation).
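
The layout above can be sketched in a few lines of Python. This is only an illustration of the framing, not the bproto implementation; the function name `split_frame` and the sample bytes are hypothetical.

```python
def split_frame(frame: bytes) -> tuple[int, bytes, bytes]:
    """Split a raw message into (message ID, field data, CRC-16 bytes).

    Layout taken from the diagram: 1 byte ID, N bytes field data, 2 bytes CRC-16.
    """
    if len(frame) < 3:
        # 1 byte message ID + 2 bytes CRC-16 is the smallest possible message
        raise ValueError("message too short: need at least 3 bytes")
    message_id = frame[0]      # first byte
    field_data = frame[1:-2]   # everything between ID and checksum
    crc_bytes = frame[-2:]     # last two bytes
    return message_id, field_data, crc_bytes


# Hypothetical sample bytes, chosen only to show the split.
message_id, field_data, crc_bytes = split_frame(bytes.fromhex("01aabbcc1234"))
```

Because the field data has no length prefix, the decoder has to look up the expected field layout by message ID before it can interpret `field_data`.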
+
+### Message Field Data Encoding
+
+The order, length, and type of the fields are defined in the protocol definition.
+This means that the structure of a bproto message is fixed and can be decoded without any additional delimiters or markers.
+
+The fields are concatenated in the order of their field ID.
+Field IDs are declared in the protocol definition and are independent of the declaration order (see [Syntax of Fields](/docs/syntax.md#fields)).
+
+All numeric data types are serialized in big-endian byte order.
+Data types are serialized as follows:
+- For floating point numbers, the IEEE 754 standard is used.
+- Signed integers are serialized as two's complement.
+- Unsigned integers are serialized as is.
+- Char and string are serialized as ASCII characters.
+- Bool is serialized as 0 for false and 1 for true.
+- Enums are serialized as 8-bit unsigned integers.
+- Bitfields are serialized as one or more bytes, with each bit representing a boolean value. Bits are assigned from the least significant bit to the most significant bit, and from the first byte to the last byte.
+
+Datatype sizes are fixed and are documented in the datatypes table in the [Datatypes](/docs/syntax.md#datatypes) section.
+
+Array elements are serialized in order, starting with the first element.
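
The encoding rules above map closely onto Python's `struct` module with the `>` (big-endian) prefix. The following is a rough sketch, not the bproto serializer: the chosen widths (16-bit integers, 32-bit float) and the helper `pack_bitfield` are assumptions made for illustration, and the actual sizes are the ones in the datatypes table.

```python
import struct

# ">" selects big-endian byte order with no padding.
u16 = struct.pack(">H", 12345)          # unsigned 16-bit      -> 30 39
i16 = struct.pack(">h", -2)             # two's complement     -> ff fe
f32 = struct.pack(">f", 1.0)            # IEEE 754 single      -> 3f 80 00 00
flag = struct.pack(">?", True)          # bool                 -> 01
text = "Hi".encode("ascii")             # ASCII characters     -> 48 69
array = struct.pack(">3H", 1, 2, 3)     # elements in order    -> 00 01 00 02 00 03


def pack_bitfield(flags: list[bool]) -> bytes:
    """Pack booleans LSB first, filling the first byte before the next one."""
    out = bytearray((len(flags) + 7) // 8)
    for index, value in enumerate(flags):
        if value:
            out[index // 8] |= 1 << (index % 8)
    return bytes(out)


bits = pack_bitfield([True, False, True])   # bits 0 and 2 set -> 05
```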
+
+### Example Message Encoding
+
+An example message with 3 fields and its serialization:
+```
+message [1] ExampleMessage {
+    [0] a : uint8,
+    [1] b : uint16,
+    [2] c : bool
+}
+```
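+With the values a=42, b=12345, c=true, the message ID and field data are serialized as `012a303901` (12345 is `0x3039` in big-endian byte order), followed by the 2-byte CRC-16:
+```
+[Message ID] [Field a] [Field b] [Field c] [CRC16]
+[    01    ] [  2a   ] [ 3039  ] [  01   ] [31 06]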
+```
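
The message ID and field bytes of `ExampleMessage` can be reproduced with `struct`. This sketch covers only the part before the checksum; the function name is hypothetical, and the CRC-16 is left out because it depends on the protocol definition hash.

```python
import struct


def encode_example_without_crc(a: int, b: int, c: bool) -> bytes:
    """Encode message ID 1 followed by a (uint8), b (uint16), c (bool)."""
    # ">" = big-endian, B = uint8, H = uint16, ? = bool
    return struct.pack(">BBH?", 1, a, b, c)


encoded = encode_example_without_crc(42, 12345, True)
print(encoded.hex())  # 012a303901
```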
+
+### CRC-16 Calculation
+
+The CRC-16 verifies the integrity of the message and prevents messages from incompatible protocol versions from being processed.
+
+Before the CRC-16 is calculated, a SHA-256 hash of a text form of the protocol definition is prepended to the message data.
+This hash is a fingerprint of the protocol definition and is used to detect changes to it. See [Compatibility Validation Process in Detail](/docs/compatibility.md#compatibility-validation-process-in-detail).
+
+The CRC-16 is then calculated over the hash and the message data and compared to the last 2 bytes of the message.
+If the CRC-16 does not match, the message should be discarded.
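
The check can be sketched as follows. Several details are assumptions, because this section does not specify them: the CRC-16 variant (here CRC-16/CCITT with polynomial 0x1021 and initial value 0xFFFF, via `binascii.crc_hqx`), the exact text form that is hashed, and the byte order of the stored checksum (big-endian, like the other numeric types). All function names are hypothetical.

```python
import binascii
import hashlib


def protocol_fingerprint(definition_text: str) -> bytes:
    """SHA-256 over a text form of the protocol definition (text form assumed)."""
    return hashlib.sha256(definition_text.encode("utf-8")).digest()


def crc16(fingerprint: bytes, message_data: bytes) -> int:
    # Assumption: CRC-16/CCITT (poly 0x1021, init 0xFFFF); bproto may use another variant.
    return binascii.crc_hqx(fingerprint + message_data, 0xFFFF)


def append_crc(fingerprint: bytes, message_data: bytes) -> bytes:
    """Return message ID + field data followed by the 2-byte checksum."""
    # Assumption: the checksum is stored big-endian.
    return message_data + crc16(fingerprint, message_data).to_bytes(2, "big")


def verify(fingerprint: bytes, frame: bytes) -> bool:
    """Recompute the CRC-16 and compare it to the last 2 bytes of the frame."""
    if len(frame) < 3:
        return False
    message_data, received = frame[:-2], frame[-2:]
    return crc16(fingerprint, message_data).to_bytes(2, "big") == received


fingerprint = protocol_fingerprint("message [1] ExampleMessage { ... }")
frame = append_crc(fingerprint, bytes.fromhex("012a303901"))
is_valid = verify(fingerprint, frame)
```

A receiver built from a different protocol definition computes a different fingerprint, so its checksum will very likely not match. With a 16-bit checksum, that mismatch is probable rather than guaranteed.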
+