Introduction

nim-cbor-serialization is a library in the nim-serialization family for turning Nim objects into CBOR and back. Features include:

  • Efficient coding of CBOR directly to and from Nim data types
    • Full type-based customization of both encoding and decoding
    • Flavors for defining multiple CBOR serialization styles per Nim type
    • Efficient skipping of data items for partial CBOR parsing
  • Flexibility in mixing type-based and dynamic CBOR access
    • Structured CborValueRef node type for DOM-style access to parsed data
    • Flat CborBytes type for passing nested CBOR data between abstraction layers
  • RFC 8949 spec compliance
    • Passes CBORTestVectors
    • Customizable parser strictness including support for non-standard extensions
  • Well-defined handling of malformed / malicious inputs with configurable parsing limits

Installation

As a nimble dependency:

requires "cbor_serialization"

Via nimble install:

nimble install cbor_serialization

API documentation

This guide covers basic usage of cbor_serialization - for details, see the API reference.

Getting started

cbor_serialization is used to parse CBOR directly into Nim types and to encode them back as CBOR efficiently.

Let's start with a simple CBOR-RPC example based on JSON-RPC, with the message layout described in CDDL:

rpc-message = {
  cborrpc: tstr .eq "2.0",
  method: tstr .eq "subtract",
  params: [ int, int ],
  id: int
}

Imports and exports

Before we can use cbor_serialization, we have to import the library.

If you put your custom serialization code in a separate module, make sure to re-export cbor_serialization:

{.push gcsafe, raises: [].} # Encourage exception handling hygiene in procedures!

import cbor_serialization
export cbor_serialization

A common way to organize serialization code is to use a separate module named either after the library (mylibrary_cbor_serialization) or the flavor (myflavor_cbor_serialization).

For types that mainly exist to interface with CBOR, custom serializers can also be placed together with the type definitions.

Re-exports

When importing a module that contains custom serializers, make sure to re-export it or you might end up with cryptic compiler errors or worse, the default serializers being used!

Simple reader

Looking at the example, we'll define a Nim object to hold the request data, with matching field names and types:

type Request = object
  cborrpc: string
  `method`: string # Quote Nim keywords
  params: seq[int] # Map CBOR array to `seq`
  id: int

Cbor.encode can now turn our Request into a CBOR blob:

# Encode a Request type into a CBOR blob
let encoded =
  Cbor.encode(Request(cborrpc: "2.0", `method`: "subtract", params: @[42, 3], id: 1))

Cbor.decode can now turn our CBOR input back into a Request:

# Decode the CBOR blob into our Request type
let decoded = Cbor.decode(encoded, Request)

doAssert decoded.id == 1

Replace decode/encode with loadFile/saveFile to read and write a file instead!
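
As a minimal sketch of the file variant, the snippet below saves a Request and loads it back. It assumes that saveFile and loadFile follow the nim-serialization convention of taking the file name first, followed by the value or the target type. The file name request.cbor is made up for illustration.

```nim
import std/os
import cbor_serialization

type Request = object
  cborrpc: string
  `method`: string
  params: seq[int]
  id: int

# Hypothetical file location, chosen only for this example
let path = getTempDir() / "request.cbor"

# Assumed argument order: file name, then the value to write
Cbor.saveFile(
  path, Request(cborrpc: "2.0", `method`: "subtract", params: @[42, 3], id: 1)
)

# Assumed argument order: file name, then the type to read
let loaded = Cbor.loadFile(path, Request)
doAssert loaded.id == 1
```

Check the API reference for the exact signatures before relying on this form.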

Handling errors

Of course, someone might give us some invalid data - cbor_serialization will raise an exception when that happens:

try:
  # Oops, a string was used for the `id` field!
  discard Cbor.decode(Cbor.encode((id: "test")), Request)
  doAssert false
except CborError as exc:
  # "<string>" helps identify the source of the data - this can be a
  # filename, URL or something else that helps the user find the error
  echo "Failed to parse data: ", exc.formatMsg("<string>")

Custom parsing

Happy that we averted a crisis by adding the forgotten exception handler, we go back to the JSON-RPC specification and notice that strings are actually allowed in the id field. Further, the only thing we have to do with id is pass it back in the response, so we don't really care about its contents.

We'll define a helper type to deal with this situation and attach some custom parsing code to it that checks the type. Using CborBytes as underlying storage is an easy way to pass around snippets of CBOR whose contents we don't need.

The custom code is added to readValue/writeValue procedures that take the stream and our custom type as arguments:

type CborRpcId = distinct CborBytes

proc readValue*(
    r: var Cbor.Reader, val: var CborRpcId
) {.raises: [IOError, CborReaderError].} =
  let ckind = r.parser.cborKind()
  case ckind
  of CborValueKind.Unsigned, CborValueKind.Negative, CborValueKind.String,
      CborValueKind.Null:
    # Keep the original value without further processing
    var raw: CborBytes
    r.parseValue(raw)
    val = CborRpcId(raw)
  else:
    r.parser.raiseUnexpectedValue("Invalid RequestId, got " & $ckind)

proc writeValue*(w: var Cbor.Writer, val: CborRpcId) {.raises: [IOError].} =
  w.writeValue(CborBytes(val)) # Preserve the original content

Usage example:

type Request = object
  cborrpc: string
  `method`: string
  params: seq[int]
  id: CborRpcId # CBOR blob

let encoded = Cbor.encode(Request(id: Cbor.encode("test").CborRpcId))
let decoded = Cbor.decode(encoded, Request)
doAssert Cbor.decode(decoded.id.CborBytes.toBytes(), string) == "test"
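
The rejection path of the custom reader can be exercised in the same way. The sketch below repeats the reader from above and feeds it an id encoded as an array, which is not one of the accepted kinds. It assumes, as in the error handling example, that a named tuple encodes to a CBOR map, and it catches CborReaderError because that is the type named in the reader's raises list.

```nim
import cbor_serialization

type
  CborRpcId = distinct CborBytes

  Request = object
    cborrpc: string
    `method`: string
    params: seq[int]
    id: CborRpcId

proc readValue*(
    r: var Cbor.Reader, val: var CborRpcId
) {.raises: [IOError, CborReaderError].} =
  let ckind = r.parser.cborKind()
  case ckind
  of CborValueKind.Unsigned, CborValueKind.Negative, CborValueKind.String,
      CborValueKind.Null:
    # Keep the original value without further processing
    var raw: CborBytes
    r.parseValue(raw)
    val = CborRpcId(raw)
  else:
    r.parser.raiseUnexpectedValue("Invalid RequestId, got " & $ckind)

var rejected = false
try:
  # An array is not an accepted kind for `id`
  discard Cbor.decode(Cbor.encode((id: @[1, 2])), Request)
except CborReaderError as exc:
  rejected = true
  echo "Failed to parse data: ", exc.msg

doAssert rejected
```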

Flavors and strictness

While the defaults that cbor_serialization offers are sufficient to get started, implementing CBOR-based standards often requires more fine-grained control, such as what to do when a field is missing, unknown or has high-level requirements for parsing and output.

We use createCborFlavor to declare the new flavor, passing it the customization options that we're interested in:

createCborFlavor CrpcSys,
  automaticObjectSerialization = false,
  requireAllFields = true,
  omitOptionalFields = true, # Don't output `none` values when writing
  allowUnknownFields = false

CrpcSys.defaultSerialization(Result)

Required and optional fields

In the CBOR-RPC example, both the cborrpc version tag and method are required while parameters and id can be omitted. Our flavor requires all fields to be present except those that are explicitly optional. We use Opt from results to mark the optional ones:

type Request = object
  cborrpc: string
  `method`: string
  params: Opt[seq[int]]
  id: Opt[CborRpcId]

Automatic object conversion

The default Cbor flavor allows any object to be converted to CBOR. If you define a custom serializer and someone forgets to import it, the compiler might end up using the default instead, resulting in a nasty runtime surprise.

automaticObjectSerialization = false forces a compiler error for any type that has not opted in to be serialized:

# Allow serializing the `Request` type - serializing other types will result in
# a compile-time error because `automaticObjectSerialization` is false!
CrpcSys.defaultSerialization Request

With all that work done, we can finally use our custom flavor to decode the Request (the input blob below is encoded with the default Cbor flavor):

let cbor = Cbor.encode(
  Request(
    cborrpc: "2.0",
    `method`: "subtract",
    params: Opt.some(@[42, 3]),
    id: Opt.some(Cbor.encode(1).CborRpcId),
  )
)

let decoded = CrpcSys.decode(cbor, Request)
echo decoded
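
To see requireAllFields at work, the following sketch decodes an input that lacks a required field. StrictCbor and Header are names invented for this example, and the assumption that a missing field surfaces as a SerializationError (the base type used in the custom reader signatures of the reference) should be checked against the API reference.

```nim
import cbor_serialization

# Hypothetical flavor for this example, using options shown above
createCborFlavor StrictCbor,
  automaticObjectSerialization = false,
  requireAllFields = true,
  allowUnknownFields = false

type Header = object
  cborrpc: string
  `method`: string

# Opt in to serialization for this flavor
StrictCbor.defaultSerialization Header

# The `method` field is missing from the input
let incomplete = Cbor.encode((cborrpc: "2.0"))

var rejected = false
try:
  discard StrictCbor.decode(incomplete, Header)
except SerializationError as exc:
  rejected = true
  echo "Rejected: ", exc.msg

doAssert rejected
```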

More examples

Further examples of how to use cbor_serialization can be found in the tests folder.

Streaming

CborWriter can be used to incrementally write CBOR data items.

Incremental processing is ideal for large data items or when you want to avoid building the entire CBOR structure in memory.

Writing

You can use CborWriter to write CBOR objects, arrays, and values step by step, directly to a file or any output stream.

The process is similar to when you override writeValue to provide custom serialization.

Example: Writing a CBOR Array of Objects

Suppose you want to write a large array of objects, one at a time. The example writes to an in-memory output, but the same calls apply to a file stream:

import cbor_serialization, stew/[byteutils]

var output = memoryOutput()
var writer = Cbor.Writer.init(output)

writer.beginArray()

for i in 0 ..< 2:
  writer.beginObject()

  writer.writeField("id", i)
  writer.writeField("name", "item" & $i)

  writer.endObject()

writer.endArray()

echo output.getOutput(seq[byte]).to0xHex()

This produces the following output when the resulting CBOR blob is decoded:

@[(id: 0, name: "item0"), (id: 1, name: "item1")]
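
As a sketch of the decoding side, the streamed blob can be read back with Cbor.decode. Item is a hypothetical object type introduced here to match the fields written by the loop.

```nim
import cbor_serialization

# Hypothetical type mirroring the fields written below
type Item = object
  id: int
  name: string

var output = memoryOutput()
var writer = Cbor.Writer.init(output)

# Write the array incrementally, one object at a time
writer.beginArray()
for i in 0 ..< 2:
  writer.beginObject()
  writer.writeField("id", i)
  writer.writeField("name", "item" & $i)
  writer.endObject()
writer.endArray()

# Decode the finished blob into a seq of objects
let items = Cbor.decode(output.getOutput(seq[byte]), seq[Item])
doAssert items.len == 2
doAssert items[1].name == "item1"
```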

Example: Writing Nested Structures

Objects and arrays can be nested arbitrarily.

Here is the same array of CBOR objects, nested in an envelope containing an additional status field.

Instead of manually placing begin/end pairs, we're using the convenience helpers writeObject and writeArray:

writer.writeObject:
  writer.writeField("status", "ok")
  writer.writeName("data")
  writer.writeArray:
    for i in 0 ..< 2:
      writer.writeObject:
        writer.writeField("id", i)
        writer.writeField("name", "item" & $i)

This produces the following output when the resulting CBOR blob is decoded:

(status: "ok", data: @[(id: 0, name: "item0"), (id: 1, name: "item1")])

Reference

This page provides an overview of the cbor_serialization API - for details, see the API reference.

Parsing

Common API

CBOR parsing uses the common serialization API, supporting both object-based and dynamic CBOR data items:

type
  NimServer = object
    name: string
    port: int

  MixedServer = object
    name: CborValueRef
    port: int

  RawServer = object
    name: CborBytes
    port: CborBytes

let rawCbor = Cbor.encode(NimServer(name: "localhost", port: 42))

# decode into native Nim
let native = Cbor.decode(rawCbor, NimServer)

# decode into mixed Nim + CborValueRef
let mixed = Cbor.decode(rawCbor, MixedServer)

# decode each field value as raw, nested CBOR
let raw = Cbor.decode(rawCbor, RawServer)

# decode any valid CBOR, using the `cbor_serialization` node type
let value = Cbor.decode(rawCbor, CborValueRef)
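
A short sketch of how the decoded results might be used: the native object exposes plain Nim fields, while each CborBytes field still holds encoded CBOR that can be decoded later. It relies on toBytes, as used in the getting started example, to get at the raw bytes.

```nim
import cbor_serialization

type
  NimServer = object
    name: string
    port: int

  RawServer = object
    name: CborBytes
    port: CborBytes

let rawCbor = Cbor.encode(NimServer(name: "localhost", port: 42))

# Native decoding gives direct access to the fields
let native = Cbor.decode(rawCbor, NimServer)
doAssert native.name == "localhost"

# Raw decoding defers interpretation of each field
let raw = Cbor.decode(rawCbor, RawServer)
doAssert Cbor.decode(raw.port.toBytes(), int) == 42
doAssert Cbor.decode(raw.name.toBytes(), string) == "localhost"
```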

Standalone Reader

A reader can be created from any faststreams-compatible stream:

var reader = Cbor.Reader.init(memoryInput(rawCbor))
let native2 = reader.readValue(NimServer)

# Overwrite an existing instance
var reader2 = Cbor.Reader.init(memoryInput(rawCbor))
var native3: NimServer
reader2.readValue(native3)

Parser options

Parser options allow you to control the limits of the parser. Set them by passing to Cbor.decode or when initializing the reader:

let limited = Cbor.decode(
  rawCbor, NimServer, conf = defaultCborReaderConf(nestedDepthLimit: 0))

Flavors can be used to override the defaults for some of these options.

Limits

Parser limits are passed to decode, similar to flags. Defaults are shown in brackets, and you can adjust them to suit your needs:

  • nestedDepthLimit [=512]: Maximum nesting depth for objects and arrays (0 = unlimited).
  • arrayElementsLimit [=0]: Maximum number of array elements (0 = unlimited).
  • objectFieldsLimit [=0]: Maximum number of key-value pairs in an object (0 = unlimited).
  • stringLengthLimit [=0]: Maximum string length in bytes (0 = unlimited).
  • byteStringLengthLimit [=0]: Maximum byte string length in bytes (0 = unlimited).
  • bigNumBytesLimit [=64]: Maximum number of BigNum bytes (0 = unlimited).

Special types

  • CborBytes: Holds a CBOR value as a distinct seq[byte].
  • CborVoid: Skips a valid CBOR value.
  • CborNumber: Holds a CBOR number.
    • Use toInt(n: CborNumber, T: type SomeInteger): Opt[T] to convert it to an integer.
    • The integer field for negative numbers is set to abs(value)-1 as per the CBOR spec. This makes it possible to hold negative values down to -(uint64.high)-1, which do not fit in int64.
  • CborValueRef: Holds any valid CBOR value, it uses CborNumber instead of int.
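
The abs(value)-1 convention can be illustrated with plain integer arithmetic. toStored and fromStored are hypothetical helpers written for this sketch. They are not part of cbor_serialization and do not use CborNumber.

```nim
# Hypothetical helpers illustrating the CBOR negative integer mapping
func toStored(value: int64): uint64 =
  ## Magnitude stored for a negative `value`: abs(value) - 1.
  doAssert value < 0
  # -(value + 1) avoids overflow when value is int64.low
  uint64(-(value + 1))

func fromStored(stored: uint64): int64 =
  ## Inverse mapping, valid only while the result fits in int64.
  doAssert stored <= uint64(int64.high)
  -1'i64 - int64(stored)

doAssert toStored(-1) == 0'u64
doAssert toStored(-500) == 499'u64
doAssert fromStored(499) == -500'i64
doAssert toStored(int64.low) == uint64(int64.high)
```

Stored values above int64.high correspond to negative numbers that no Nim integer type can hold directly.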

Writing

Common API

Similar to parsing, the common serialization API is used to produce CBOR data items.

# Encode an object as raw CBOR bytes
let blob = Cbor.encode(native)

Standalone Writer

var output = memoryOutput()
var writer = Cbor.Writer.init(output)
writer.writeValue(native)
let decoded = Cbor.decode(output.getOutput(seq[byte]), NimServer)
echo decoded

Flavors

Flags and limits are runtime configurations, while a flavor is a compile-time mechanism to prevent conflicts between custom serializers for the same type. For example, a CBOR-RPC-based API might require that numbers are formatted as hex strings while the same type exposed through REST should use a number.

Flavors ensure the compiler selects the correct serializer for each subsystem. Use defaultSerialization to assign serializers of a flavor to a specific type.

# Parameters for `createCborFlavor`:

  FlavorName: untyped
  mimeTypeValue = "application/cbor"
  automaticObjectSerialization = false
  automaticPrimitivesSerialization = true
  requireAllFields = true
  omitOptionalFields = true
  allowUnknownFields = true
  skipNullFields = false

type
  OptionalFields = object
    one: Opt[string]
    two: Option[int]

createCborFlavor OptCbor
OptCbor.defaultSerialization OptionalFields

  • automaticObjectSerialization: enable automatic serialization for all object types.
  • automaticPrimitivesSerialization: enable automatic serialization for all primitive types.
  • allowUnknownFields: Skip unknown fields instead of raising an error.
  • requireAllFields: Raise an error if any required field is missing.
  • omitOptionalFields: Writer ignores fields with null values.
  • skipNullFields: Reader ignores fields with null values.

Custom parsers and writers

Parsing and writing can be customized by providing overloads for the readValue and writeValue functions. Overrides are commonly used with a flavor that prevents automatic serialization, so that a forgotten import results in a compile-time error instead of a silent fallback to the default serialization.

# Custom serializers for MyType should match the following signatures
proc readValue*(r: var Cbor.Reader, v: var MyType) {.raises: [IOError, SerializationError].}
proc writeValue*(w: var Cbor.Writer, v: MyType) {.raises: [IOError].}

# When flavors are used, use the flavor reader/writer instead
proc readValue*(r: var MyFlavor.Reader, v: var MyType) {.raises: [IOError, SerializationError].}
proc writeValue*(w: var MyFlavor.Writer, v: MyType) {.raises: [IOError].}

Objects

Decode objects using the parseObject template. To parse values, use helper functions or readValue. The readObject and readObjectFields iterators are also useful for custom object parsers.

proc readValue*(r: var Cbor.Reader, table: var Table[string, int]) =
  parseObject(r, key):
    table[key] = r.parseInt(int)
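
A possible usage of this reader is sketched below. It assumes that a named tuple encodes to a CBOR map with text keys, as in the error handling example, and the field names apples and pears are made up.

```nim
import std/tables
import cbor_serialization

proc readValue*(r: var Cbor.Reader, table: var Table[string, int]) =
  # Visit every key of the CBOR map and read its value as an int
  parseObject(r, key):
    table[key] = r.parseInt(int)

# Hypothetical input: a map with two integer entries
let blob = Cbor.encode((apples: 3, pears: 5))

let counts = Cbor.decode(blob, Table[string, int])
doAssert counts["apples"] == 3
doAssert counts["pears"] == 5
```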

Sets and List-like Types

Sets and list/array-like structures can be parsed using the parseArray template, which supports both indexed and non-indexed forms.

Built-in readValue implementations exist for regular seq and array. For set or set-like types, you must provide your own implementation.

type
  HoldArray = object
    data: array[3, int]

  HoldSeq = object
    data: seq[int]

  WelderFlag = enum
    TIG
    MIG
    MMA

  Welder = object
    flags: set[WelderFlag]

proc readValue*(r: var Cbor.Reader, value: var HoldArray) =
  # parseArray with index, `i` can be any valid identifier
  r.parseArray(i):
    value.data[i] = r.parseInt(int)

proc readValue*(r: var Cbor.Reader, value: var HoldSeq) =
  # parseArray without index
  r.parseArray:
    let lastPos = value.data.len
    value.data.setLen(lastPos + 1)
    readValue(r, value.data[lastPos])

proc readValue*(r: var Cbor.Reader, value: var Welder) =
  # Populating a set works the same way
  r.parseArray:
    value.flags.incl r.parseInt(int).WelderFlag
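
Note that these readers call parseArray on the object itself, so HoldSeq and Welder are decoded from a bare CBOR array rather than from a map with a data or flags key. A small sketch of that behavior, reusing two of the readers above with made-up inputs:

```nim
import cbor_serialization

type
  HoldSeq = object
    data: seq[int]

  WelderFlag = enum
    TIG
    MIG
    MMA

  Welder = object
    flags: set[WelderFlag]

proc readValue*(r: var Cbor.Reader, value: var HoldSeq) =
  # Append each array element to the seq
  r.parseArray:
    let lastPos = value.data.len
    value.data.setLen(lastPos + 1)
    readValue(r, value.data[lastPos])

proc readValue*(r: var Cbor.Reader, value: var Welder) =
  # Each array element is the ordinal of a flag
  r.parseArray:
    value.flags.incl r.parseInt(int).WelderFlag

let held = Cbor.decode(Cbor.encode(@[1, 2, 3]), HoldSeq)
doAssert held.data == @[1, 2, 3]

# Ordinals 0 and 2 correspond to TIG and MMA
let welder = Cbor.decode(Cbor.encode(@[0, 2]), Welder)
doAssert welder.flags == {TIG, MMA}
```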

Custom Iterators

Custom iterators provide access to sub-token elements:

customIntValueIt(r: var CborReader; body: untyped)
customNumberValueIt(r: var CborReader; body: untyped)
customStringValueIt(r: var CborReader; limit: untyped; body: untyped)
customStringValueIt(r: var CborReader; body: untyped)

Convenience Iterators

readArray(r: var CborReader, ElemType: typedesc): ElemType
readObjectFields(r: var CborReader, KeyType: type): KeyType
readObjectFields(r: var CborReader): string
readObject(r: var CborReader, KeyType: type, ValueType: type): (KeyType, ValueType)

CborReader Helper Procedures

See the API reference

CborWriter Helper Procedures

See the API reference

Enums

type
  Fruit = enum
    Apple = "Apple"
    Banana = "Banana"

  Drawer = enum
    One
    Two

  Number = enum
    Three = 3
    Four = 4

  Mixed = enum
    Six = 6
    Seven = "Seven"

cbor_serialization automatically detects the expected representation for each enum based on its declaration.

  • Fruit expects string literals.
  • Drawer and Number expect numeric literals.
  • Mixed (with both string and numeric values) is disallowed by default.

If the CBOR value does not match the expected style, an exception is raised. You can configure individual enum types:

configureCborDeserialization(
    T: type[enum], allowNumericRepr: static[bool] = false,
    stringNormalizer: static[proc(s: string): string] = strictNormalize)

# Example:
Mixed.configureCborDeserialization(allowNumericRepr = true) # Only at top level

You can also configure enum encoding at the flavor or type level:

type
  EnumRepresentation* = enum
    EnumAsString
    EnumAsNumber
    EnumAsStringifiedNumber

# Examples:

# Flavor level
Cbor.flavorEnumRep(EnumAsString)   # Default flavor, can be called from non-top level
Flavor.flavorEnumRep(EnumAsNumber) # Custom flavor, can be called from non-top level

# Individual enum type, regardless of flavor
Fruit.configureCborSerialization(EnumAsNumber) # Only at top level

# Individual enum type for a specific flavor
MyCbor.flavorEnumRep(Drawer, EnumAsString) # Only at top level

Updating this book

This book is built using mdBook, which in turn requires a recent version of Rust and Cargo to be installed.

# Install correct versions of tooling
nimble mdbook

# Run a local mdbook server
mdbook serve docs

A CI job automatically publishes the book to GitHub Pages.