TopLevelEncoder and TopLevelDecoder in Combine

Updates:

  1. Feb 1, 2020
    Added that TopLevelDecoder and TopLevelEncoder might move down into the standard library in a future Swift release.

Speaking of the Combine framework, I find it interesting that Combine had to introduce a formal concept for top-level decoders in order to implement its decode and encode operators. These operators use Swift’s Codable system to decode/​encode values received from an upstream publisher.

Example

I’m going to focus on decode for the rest of this article, but the same applies to encode. decode is often used in a chain that starts with a URLSession data task publisher. Here’s an example you can paste into an Xcode playground:

import Combine
import Foundation

struct User: Codable {
  var id: Int
  var name: String
  var email: String
}

let url = URL(string: "https://jsonplaceholder.typicode.com/users/1")!

let request = URLSession.shared.dataTaskPublisher(for: url)
  .map { data, _ in data }
  .decode(type: User.self, decoder: JSONDecoder())
  // Proper error handling omitted
  .mapError { error in fatalError("\(error)") }
  .sink { user in
    print(user)
  }

import PlaygroundSupport
PlaygroundPage.current.needsIndefiniteExecution = true

This downloads some JSON data and prints a decoded value to the console:

User(id: 1, name: "Leanne Graham", email: "Sincere@april.biz")

Input and output types for decode

The decoding process produces a value of the specified item type (User in the example), so decode must return a publisher whose output type is that item type (which must conform to Decodable). So far, so good.

But what should decode’s input type be? In other words, what is the type of the values decode receives from the upstream publisher? If you think it should be Data, you’d be right most of the time because the commonly used decoders all take Data as their input. But that’s not a given.

There is no formal interface for top-level decoders

The problem is that this isn’t formalized anywhere. In fact, neither the standard library nor Foundation have any formal notion of so-called top-level decoders at all. By “top-level decoder” I mean a type developers use when they want to decode something. The two built-in top-level decoders, JSONDecoder and PropertyListDecoder, happen to provide the same decoding API:

struct JSONDecoder {
  func decode<T: Decodable>(_ type: T.Type, from data: Data) throws -> T
}

struct PropertyListDecoder {
  func decode<T: Decodable>(_ type: T.Type, from data: Data) throws -> T
}

But there’s no protocol in the standard library that requires this interface, and other decoders may well choose a different API. It’s easy to imagine a JSON decoder that reads the JSON data directly from a file, for example.

The team that developed the Codable system in 2017 found that introducing another pair of protocols for top-level encoders and decoders — alongside the existing Encoder and Decoder protocols, which provide the APIs that are meant to be called by codable types during the en-/decoding process — would be too confusing.

TopLevelDecoder and TopLevelEncoder in Combine

But now Combine has a need for those top-level protocols and defines them as follows:

protocol TopLevelEncoder {
  associatedtype Output
  func encode<T: Encodable>(_ value: T) throws -> Output
}

protocol TopLevelDecoder {
  associatedtype Input
  func decode<T: Decodable>(_ type: T.Type, from: Input) throws -> T
}

Notice that the decode method in TopLevelDecoder has the same shape as the ones in JSONDecoder and PropertyListDecoder, but the associated type allows conforming types to freely choose their input type, not limiting all coders to Data.

Combine then retroactively extends the built-in coders to conform them to the new protocols:

extension JSONEncoder: TopLevelEncoder {}
extension JSONDecoder: TopLevelDecoder {}
extension PropertyListEncoder: TopLevelEncoder {}
extension PropertyListDecoder: TopLevelDecoder {}

This finally allows Combine to define a generic decode method that is generic over a TopLevelDecoder:

extension Publisher {
  func decode<Item, Coder>(type: Item.Type, decoder: Coder)
    -> Publishers.Decode<Self, Item, Coder>
    where Item: Decodable, Coder: TopLevelDecoder, 
      Self.Output == Coder.Input
}

Observe the last constraint, Self.Output == Coder.Input. By constraining the decoder’s input type to the upstream publisher’s output type, the method ensures that the passed-in decoder can handle the encoded data received from upstream.

If you use decode with a JSONDecoder, this means the upstream publisher must emit Data values – observe the .map { data, _ in data } step in the example at the beginning to massage the values emitted by the URLSession publisher. But again, if you have a decoder that works with some type other than Data, it will also work with Combine’s decode method, as long as you add the conformance to TopLevelDecoder.

And Publisher.encode does the same thing with TopLevelEncoder:

extension Publisher where Output: Encodable {
    func encode<Coder>(encoder: Coder) -> Publishers.Encode<Self, Coder>
      where Coder: TopLevelEncoder
}

Here, the connection between the encoder’s output type and the downstream output type is made in the publisher type Publishers.Encode, which makes the encoder’s output type its own output type.

Update February 1, 2020: TopLevelDecoder and TopLevelEncoder could potentially move down into the standard library in a future Swift/SDK release. It can be done in a non-ABI-breaking way, and Tony Parker from the Foundation/Combine team supports the idea.