It’s wrappers all the way down

Suppose we have a string that ends in a specific pattern, like this one:

let str = "abc,def,…[many fields]…,1234"

We want to extract the last field, i.e. the substring after the last comma, i.e. "1234".

Splitting the entire string

One solution might be to split the string at every comma and take the last element of the resulting array:

if let lastField = str.split(separator: ",").last {
    // lastField is "1234" ✅
}

But this is quite wasteful if the string is potentially very long.

Finding the index of the last comma

It would be more efficient to find the index of last comma in the string and then take a substring from there to the end. It makes sense to start the search from the back, not only for performance reasons (assume the last field is known to be short), but also because the first portion of the string may contain commas we’re not interested in.

lastIndex(of:)

Solving this task is trivial in Swift 4.2, thanks to the new last​Index​(of:) method, introduced with Swift Evolution proposal SE-0204:

if let lastCommaIndex = str.lastIndex(of: ",") {
    let lastField = str[lastCommaIndex...].dropFirst()
    // → "1234" ✅
}

Note the call to dropFirst. The substring str[lastCommaIndex...] includes the comma.

Manual implementation

But let’s assume for a minute that lastIndex(of:) isn’t available to us. We could write a loop to accomplish the task, manually traversing through the string (from the back) until we find a match.

Let’s wrap the algorithm in an extension on String to make it reusable:

extension String {
    func lastIndex1(of char: Character) -> String.Index? {
        var cursor = endIndex
        while cursor > startIndex {
            formIndex(before: &cursor)
            if self[cursor] == char {
                return cursor
            }
        }
        return nil
    }
}

This code is essentially the same as the standard library’s implementation for last​Index​(of:), if not quite as generic (the standard library version is implemented on Bidirectional​Collection). And it does the job:

if let lastCommaIndex = str.lastIndex1(of: ",") {
    let lastField = str[lastCommaIndex...].dropFirst()
    // → "1234" ✅
}

By composing reversed and firstIndex(of:)

This is all well and good, but there’s another way to solve this task, and it’s what I want to talk about in this article — not because it’s a particularly practical example, but because I believe it can teach a lot about the design of the Swift standard library.

We could also implement our own version of last​Index​(of:) by composing other Collection APIs as follows:

extension String {
    func lastIndex2(of char: Character) -> String.Index? {
        guard let reversedIndex = reversed().firstIndex(of: char) else {
            return nil
        }
        return index(before: reversedIndex.base)
    }
}

It may not be immediately apparent why this works, so let’s go through it step by step. The guard statement reverses the string before looking up the index of the search character:

guard let reversedIndex = reversed().firstIndex(of: char)

Reversing the string makes intuitive sense because we want to search it from the back, but it raises two questions:

  1. Isn’t reversing the whole string expensive?

  2. What can we do with an index into the reversed string? Is there a way to turn it back into an index for the original string?

1. Isn’t reversing the whole string expensive?

Answer: no, because reversed is implemented lazily. The reversed method doesn’t perform any work. All it does is wrap its input in a Reversed​Collection value:

extension BidirectionalCollection {
    func reversed2() -> ReversedCollection2<Self> {
        return ReversedCollection2(base: self)
    }
}

(The code I show here and in the following is my own reimplementation of ReversedCollection. I named it Reversed​Collection2 to avoid conflicts with the standard library type, but both implementations are more or less the same.)

At its most basic, Reversed​Collection is a simple generic wrapper for any Bidirectional​Collection:

struct ReversedCollection2<Base: BidirectionalCollection> {
    var base: Base
}

Even if the input string is many megabytes long, calling reversed() on it is effectively free because String uses copy-on-write. The actual string data isn’t copied until one of the copies is mutated.

Conforming to Collection

Next, the code calls first​Index​(of:)1 on the Reversed​Collection. first​Index​(of:) is part of the Collection protocol, so Reversed​Collection must conform to Collection. Let’s walk through a reimplementation of this conformance.

Index type

Every Collection must have an index type. For Reversed​Collection, we can use the same idea for our index that we used for the collection itself: the index type should be a simple wrapper around the base collection’s index:

extension ReversedCollection2 {
    struct Index {
        var base: Base.Index
    }
}

Collection indices must be Comparable, so we have to add that conformance as well. Since we know that the base value is also Comparable, we can forward to it in our implementation. But note that we have to invert the logic since our index type must model the reversed relationship: the greatest base index represents the smallest reversed index, and vice versa:

extension ReversedCollection2.Index: Comparable {
    static func <(lhs: ReversedCollection2.Index, rhs: ReversedCollection2.Index) -> Bool {
        // Inverted logic compared to base
        return lhs.base > rhs.base
    }
}

Collection implementation

Having laid the groundwork, we can add the Collection conformance. The idea is to base our implementation on the base collection, remembering that we have to invert everything: the reversed collection’s startIndex is the base’s endIndex (and vice versa), index(after:) should call base​.index​(before:), and so on.

extension ReversedCollection2: Collection {
    typealias Element = Base.Element

    var startIndex: Index {
        return Index(base: base.endIndex)
    }

    var endIndex: Index {
        return Index(base: base.startIndex)
    }

    func index(after idx: Index) -> Index {
        return Index(base: base.index(before: idx.base))
    }

    subscript(position: Index) -> Element {
        return base[base.index(before: position.base)]
    }
}

By the way, our reliance on base​.index​(before:) is the reason why Reversed​Collection’s generic parameter must be constrained to Bidirectional​Collection. A plain Collection has no index(before:) method and thus cannot be traversed in reverse.

There’s one more thing to consider: a collection’s endIndex signifies the “one past the end” position, not the position of the last element. By using base​.endIndex as the reversed collection’s startIndex, we have effectively shifted all reversed indices one element to the right. This is why the subscript implementation accesses the element at base.index(before: position.base) rather than position.base.

It’s wrappers all the way down

Going back to the snippet we were discussing in our lastIndex2 implementation, the expression:

str.reversed().firstIndex(of: char)
// type: Reversed​Collection​<String>​.Index?

returns an optional Reversed​Collection​<String>​.Index, which as we’ve seen is a wrapper around the corresponding index in the unreversed string.

2. Is there a way to turn it back into an index for the original string?

Answer: yes, because the standard library implements Reversed​Collection pretty much in the same way I have shown here, only with a few more bells and whistles. Read the code here.

And luckily, Reversed​Collection​.Index​.base is public2, which allows us to get back from the reversed index an index in the underlying base collection. Just as above, we have to remember to shift the index back, as the documentation reminds us:

To find the position that corresponds with this index in the original, underlying collection, use that collection’s index(before:) method with the base property.

Which is why the last line of our lastIndex2 implementation returns the index before the match:

return index(before: reversedIndex.base)

Conclusion

Arguably, the above is a pointless exercise without much practical relevance. However, I believe that understanding how and why this works can teach you a lot about the design of collection types and protocol hierarchy in the standard library. The Sequence and Collection APIs are designed to be composable, and to provide useful operations for implementing your own algorithms. Dave Abrahams’s WWDC 2018 talk Embracing Algorithms gives some great examples how to do this. It’s my favorite WWDC 2018 session.

Reversed​Collection isn’t the only lazy sequence/collection wrapper in the standard library. enumerated, joined, and zip all work in exactly the same way.

And by using lazy you can apply the same general pattern of returning a wrapper value that defers the actual work to map, filter, and other normally eager operations — the difference is that these wrappers store not only the base collection, but also the transformation function you pass in.

Why are some operation always lazy and others eager by default? The standard library’s policy is to make things lazy by default if the laziness is unambiguously free. If, on the other hand, there’s a significant downside to the lazy implementation, provide the lazy version under the lazy property. Ben Cohen:

For example, lazy.map performs the mapping every time you iterate, which could be very costly on multiple iteration, and also captures and stores the mapping closure, which can be risky if it is stateful.

  1. firstIndex(of:) was called index(of:) before Swift 4.2↩︎

  2. ReversedCollection​._base is public, too, though hidden from code completion. ↩︎