Parallelize Your for Loops With GCD

Concurrent Iteration over Cocoa Collections

Foundation comes with a set of handy methods to enumerate the items in a collection, such as enumerateObjectsUsingBlock:. By passing the NSEnumerationConcurrent flag to the more elaborate version of the same method, enumerateObjectsWithOptions:usingBlock:, you can easily parallelize the execution of the blocks that are performed for each element in the collection.
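
As a minimal sketch of what this looks like in practice, the following function computes the length of every string in an array concurrently. The function name StringLengthsConcurrently and the scratch buffer are assumptions made for illustration; only enumerateObjectsWithOptions:usingBlock: and NSEnumerationConcurrent come from Foundation. Each block invocation writes to its own slot in the buffer, so no locking is needed.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not part of Foundation): returns the length of each
// string in `strings`, computing the lengths concurrently.
static NSArray<NSNumber *> *StringLengthsConcurrently(NSArray<NSString *> *strings) {
    NSUInteger count = strings.count;

    // Plain C scratch buffer. Every block invocation writes only to the slot
    // for its own index, so the concurrent writes never touch the same memory.
    NSUInteger *lengths = calloc(count, sizeof(NSUInteger));

    [strings enumerateObjectsWithOptions:NSEnumerationConcurrent
                              usingBlock:^(NSString *string, NSUInteger idx, BOOL *stop) {
        lengths[idx] = string.length;
    }];

    // The enumeration method returns only after all blocks have run,
    // so the buffer is complete at this point.
    NSMutableArray<NSNumber *> *result = [NSMutableArray arrayWithCapacity:count];
    for (NSUInteger i = 0; i < count; i++) {
        [result addObject:@(lengths[i])];
    }
    free(lengths);
    return result;
}
```

Note that the blocks may run in any order and on different threads, so anything they share beyond their own slot (a counter, a mutable collection) would need synchronization.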

Concurrent Iteration in Grand Central Dispatch

From reading the excellent second issue of objc.io, I recently learned that Grand Central Dispatch includes a very similar pattern on a lower level, for those cases where the thing you’re iterating over is not a Cocoa collection: the dispatch_apply() function.

dispatch_apply() executes a given block on a given dispatch queue n times, passing the current index of the iteration to the block. The function then waits for all iterations to complete before returning. If you pass a concurrent queue to dispatch_apply(), you can basically think of the construct as a parallel (and efficient, as the documentation notes) for loop.
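
Here is a minimal sketch of such a parallel for loop, assuming the data is a plain C array of doubles. SquareInPlace is a hypothetical helper name. The per-element work is deliberately tiny to keep the example short, so the loop body here stands in for something more expensive.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: squares every element of `values` in place.
// Serial equivalent:
//     for (size_t i = 0; i < count; i++) { values[i] *= values[i]; }
static void SquareInPlace(double *values, size_t count) {
    // A global concurrent queue lets the iterations run in parallel.
    dispatch_queue_t queue =
        dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0);

    // The block is invoked `count` times with i = 0 ... count - 1,
    // in no particular order. Each invocation touches only values[i].
    dispatch_apply(count, queue, ^(size_t i) {
        values[i] = values[i] * values[i];
    });

    // dispatch_apply() has waited for all iterations by the time we get here.
}
```

Because dispatch_apply() blocks the calling thread until every iteration has finished, code that follows the call can rely on the results, just as it could after an ordinary for loop.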

Using dispatch_apply() in the right places can potentially provide a huge performance boost. If your code processes images on a per-pixel basis or iterates over other memory areas byte by byte, you should definitely look into this. However, as Daniel Eggert stresses in his article, you should measure the performance impact of the parallelization carefully. It could even turn out to be negative:

How well this works depends a lot on exactly what you’re doing inside that loop.

The work done by the block must be non trivial, otherwise the overhead is too big. Unless the code is bound by computational bandwidth, it is critical that the memory that each work unit needs to read from and write to fits nicely into the cache size. This can have dramatic effects on performance. Code bound by critical sections may not perform well at all.
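
One common way to keep the work per block non-trivial is to let each iteration handle a contiguous chunk of the data rather than a single element. The sketch below assumes an 8-bit grayscale pixel buffer. InvertPixels and its stride parameter are hypothetical names chosen for illustration, and the right stride for a given workload is something to find by measuring.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: inverts `count` 8-bit pixels in place, handing
// `stride` consecutive pixels to each dispatch_apply() iteration.
static void InvertPixels(uint8_t *pixels, size_t count, size_t stride) {
    if (count == 0 || stride == 0) {
        return;
    }

    // Number of chunks, rounded up so the last, shorter chunk is included.
    size_t chunks = (count + stride - 1) / stride;

    dispatch_queue_t queue =
        dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0);

    dispatch_apply(chunks, queue, ^(size_t chunk) {
        size_t start = chunk * stride;
        size_t end = start + stride;
        if (end > count) {
            end = count; // clamp the final chunk to the buffer size
        }
        // Ordinary serial loop over this chunk's contiguous memory range.
        for (size_t i = start; i < end; i++) {
            pixels[i] = (uint8_t)(255 - pixels[i]);
        }
    });
}
```

With this structure, the scheduling overhead is paid once per chunk instead of once per pixel, and each block reads and writes one contiguous range of memory. Whether it actually beats the serial loop is still something to verify with a profiler.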