Build caches - why even bother?

A modern mobile development team's workflow boils down to developing a feature, merging it into the main SCM branch and releasing it to production. Before a feature implementation is considered mergeable, it must pass various checks, including builds, tests and code reviews. The more developers are on the team, the more features they develop, which means more builds. The app also gets more complicated, so each additional build takes longer. Increased workload and complexity lead to more time spent waiting for feedback, and nobody likes waiting!

Build caches are intended to decrease build time and tighten the feedback loop, allowing more work to be done. The tool we describe in this article took our build time from 8 minutes to 4 minutes - a 50% decrease with no code changes. This is hardly the limit - better app modularization means more parts can be cached, so we still have work to do.

Problem

At Sweatcoin, we use Pods for native dependency management and React Native for most of our app's screens. Being a small engineering team, we tend to avoid unnecessary dependency updates. Those are time-consuming and rarely bring any user value to the product. In the end, if it works - don't change it.

Like most iOS developers, we use Xcode to build our app. And like any good team nowadays, we use a dedicated machine to build every pull request and to make test and production builds. Every build is made as an archive to reproduce the conditions of production builds. During local debug builds, Xcode goes the extra mile to cache and reuse results of previous builds, but archiving is different. Everything gets rebuilt every time an archive is made, despite the fact that most dependencies rarely change. No build cache is in effect during archiving.

So here's the problem - we rebuild too much, wasting CPU cycles and time, waiting for builds to finish.

Solution

To avoid wasting time, we can cache binaries produced by targets that are the app's dependencies, and reuse them in later builds. Since we use Xcode and its xcodebuild to make builds, we can try to convince xcodebuild to use our cached products and not rebuild everything from scratch. We're responsible for feeding xcodebuild the right versions of our dependencies; we also need to keep track of what's in the cache and whether it's up to date.

To implement caching of build products, we need to know:

Which dependencies get built - to understand what to put into the cache.
What is the state of every dependency - to use the right product versions and update the cache when needed.

We can answer the first question using build graphs.
Tasks that xcodebuild needs to perform to build your app form a tree-like structure, defining dependencies between tasks, called a build graph. Some tasks require other tasks' results in order to run successfully. We want to cache products of "native targets" (this is the naming Xcode uses) - tasks which take source files and produce binaries (frameworks and libraries).

The answer to the second question lies in task inputs and build settings.
To understand if we can reuse binaries that are cached we need to understand if they're up to date. To build a binary, we compile source files using build settings which are specified in the project configuration. Binaries are task outputs, whereas source files are task inputs. If some source files change, if we update Xcode or add some Swift compiler flags, we need to re-run the task affected by these changes, and all tasks that depend on that task. In Xcode terms, we need to rebuild native targets.

Implementation

At this point, I hope you get the idea. We construct a build graph, check task states and re-run tasks which have no up-to-date products in our cache. Then we set up the app target so that it can reuse cached products. As always, things get complicated when it comes to actual implementation.

Constructing build graph

Since we want to cache binary dependencies of the app, let me remind you of their kinds.
Binary dependencies are linked to the app's binary - either statically or dynamically. If some binary is built but not linked, either that binary is not a part of the app or the project setup is wrong. Statically linked binaries come in two types - static libraries and static frameworks. There's no fundamental difference between the two - a static framework is just a bundle (a directory with a special structure that the linker and compilers are aware of) that wraps a static library. Similarly, dynamically linked binaries are dynamic libraries (dylibs) and dynamic frameworks. A dynamic framework is a wrapper for a dylib, in the same way that a static framework is a wrapper for a static library.
Once again, we get all these kinds of binaries by building native targets that are listed as app dependencies. Some of the dependencies are direct - the app links them directly; some are transitive - linked by direct dependencies or other transitive dependencies.

Here is a dependency graph for some imaginary app.

Dependency Graph

A is a direct dependency of App, whereas B, C, D and E are all transitive dependencies of App. B has both C and D linked, and A links B and E. If we want to cache everything that App depends on, we need to know about all transitive dependencies that are in the build graph. We can construct a build graph starting at A, enumerating everything A links to, then everything linked to each of A's dependencies, and so on.
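
The traversal described above can be sketched in a few lines of Ruby. This is a minimal illustration and not the code XcodeArchiveCache actually uses; the LINKS map and the all_dependencies helper are hypothetical names made up for this example, and the graph is the imaginary one from the figure.

```ruby
require "set"

# Hypothetical link map for the imaginary app above:
# each target lists the targets it links directly.
LINKS = {
  "App" => ["A"],
  "A"   => ["B", "E"],
  "B"   => ["C", "D"],
  "C"   => [],
  "D"   => [],
  "E"   => []
}.freeze

# Collects every direct and transitive dependency of `root`
# using a depth-first walk; `visited` guards against revisiting
# targets that are reachable through more than one path.
def all_dependencies(root, links, visited = Set.new)
  links.fetch(root, []).each do |dependency|
    next if visited.include?(dependency)

    visited << dependency
    all_dependencies(dependency, links, visited)
  end
  visited
end

puts all_dependencies("App", LINKS).to_a.sort.join(", ")
```

Starting the walk at "App" yields A, B, C, D and E, which is exactly the set of targets whose products we would like to cache.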

Gathering state

This is the most important part of the process we want to implement. You may have heard that "There are only two hard things in Computer Science: cache invalidation and naming things". In our case, dependency state is all about cache invalidation, so rest assured that it's not easy.

State of every dependency in the build graph consists of three things:

What is involved in building a binary - content of input files.
The way a binary is built - tools and build settings.
State of its own dependencies.

Xcode has a concept of a "build phase" - if we consider native target a task in the build graph, build phases are subtasks inside of that task. You may be familiar with the "Compile Sources" phase, which obviously compiles source files that are members of the target. Source files are inputs of that phase, and every build phase has inputs of some kind. Inputs of all build phases are involved in building the target's product - the binary which we want to cache and reuse, so we list those inputs to check their contents a bit later.

Build Phase Inputs

We also need to understand how a binary is built. xcodebuild utilizes compilers - such as clang and swiftc - to turn source files into object files. Those object files are put together by the linker to form a library or an executable. What matters here is the versions of the compilers, the linker and other tools involved in the process, and the flags which xcodebuild passes to them. We can get this information by calling xcodebuild with the -showBuildSettings flag. Some of the build settings contain paths, such as the path of the derived data directory, which may differ between build machines. We should exclude such paths to let all build machines reuse cached binaries.

> xcodebuild -project Pods/Pods.xcodeproj -configuration Internal -destination "generic/os=ios" -target Alamofire -derivedDataPath "$HOME/build" -showBuildSettings archive 

Build settings for action archive and target Alamofire:
    ACTION = archive
    ...
    ALWAYS_EMBED_SWIFT_STANDARD_LIBRARIES = NO
    ALWAYS_SEARCH_USER_PATHS = NO
    ALWAYS_USE_SEPARATE_HEADERMAPS = NO
    ...
    ARCHS = arm64
    ...
    SDKROOT = /Applications/Xcode.app/Contents/Developer/Platforms/iPhoneOS.platform/Developer/SDKs/iPhoneOS12.2.sdk
    SDK_DIR = /Applications/Xcode.app/Contents/Developer/Platforms/iPhoneOS.platform/Developer/SDKs/iPhoneOS12.2.sdk
    SDK_DIR_iphoneos12_2 = /Applications/Xcode.app/Contents/Developer/Platforms/iPhoneOS.platform/Developer/SDKs/iPhoneOS12.2.sdk
    SDK_NAME = iphoneos12.2
    SDK_NAMES = iphoneos12.2
    SDK_PRODUCT_BUILD_VERSION = 16E226
    SDK_VERSION = 12.2
    SDK_VERSION_ACTUAL = 120200
    SDK_VERSION_MAJOR = 120000
    SDK_VERSION_MINOR = 200
    ...
    SWIFT_VERSION = 4.0
    ...
    TARGET_BUILD_DIR = /Users/user/build/Internal-iphoneos/Alamofire
    ...
    XCODE_APP_SUPPORT_DIR = /Applications/Xcode.app/Contents/Developer/Library/Xcode
    XCODE_PRODUCT_BUILD_VERSION = 10E1001
    XCODE_VERSION_ACTUAL = 1021
    XCODE_VERSION_MAJOR = 1000
    XCODE_VERSION_MINOR = 1020
    ...

Here, we can safely exclude SDKROOT, SDK_DIR, SDK_DIR_iphoneos12_2, TARGET_BUILD_DIR, XCODE_APP_SUPPORT_DIR because their values are only valid for the machine that runs the build. If we update Xcode, we'll see the change in SDK_VERSION, XCODE_PRODUCT_BUILD_VERSION and other similar settings - that's why we don't remove them.
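
Here is a rough sketch of that filtering step, assuming the build settings output has already been captured as a string. The helper names and the exclusion list are hypothetical; the list only covers the keys from the example above, and a real one would likely be longer.

```ruby
# Keys whose values are machine-specific paths (taken from the
# example output above). A real list would likely be longer.
MACHINE_SPECIFIC_KEYS = %w[
  SDKROOT SDK_DIR SDK_DIR_iphoneos12_2 TARGET_BUILD_DIR XCODE_APP_SUPPORT_DIR
].freeze

# Parses "KEY = value" lines from `xcodebuild -showBuildSettings`
# output into a hash, skipping the header and any other lines.
def parse_build_settings(output)
  output.each_line.with_object({}) do |line, settings|
    match = line.match(/^\s*(\w+) = (.*)$/)
    settings[match[1]] = match[2].strip if match
  end
end

# Drops machine-specific keys and sorts the rest, so the same
# settings always serialize in the same order on every machine.
def portable_settings(settings)
  settings.reject { |key, _| MACHINE_SPECIFIC_KEYS.include?(key) }.sort.to_h
end

SAMPLE_OUTPUT = <<~TEXT
  Build settings for action archive and target Alamofire:
      ACTION = archive
      SDKROOT = /Applications/Xcode.app/Contents/Developer/Platforms/iPhoneOS.platform/Developer/SDKs/iPhoneOS12.2.sdk
      SDK_VERSION = 12.2
      TARGET_BUILD_DIR = /Users/user/build/Internal-iphoneos/Alamofire
      XCODE_PRODUCT_BUILD_VERSION = 10E1001
TEXT

portable_settings(parse_build_settings(SAMPLE_OUTPUT)).each do |key, value|
  puts "#{key} = #{value}"
end
```

Only ACTION, SDK_VERSION and XCODE_PRODUCT_BUILD_VERSION survive the filter, so two machines with the same Xcode but different home directories end up with identical settings.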

Finally, we combine the contents of build phases' inputs with build settings, and compute a SHA-256 hash over that (sometimes giant) pile of bytes. Each target in the graph gets its own SHA. We use these hashes as state indicators for cached products. If no cached product has the SHA that we computed, we need to build the dependency in its current state and cache the result.

We keep an eye on the state of dependencies of each target in the graph by adding SHAs of all its direct dependencies to that mix of input files' contents and build settings. If something in the build graph changes, that change propagates all the way to the root of the build graph, causing target rebuilds. For example, imagine that we're dealing with the following graph:

Build Graph Before Evaluation

If something changes in C sources, we also need to rebuild B - to check that the change in C doesn't break B, and to relink B with new version of C. This means that binary produced by B is going to change too - so we need to rebuild A as well, for the same reason. We don't want to rebuild D and E - they are up-to-date, and our goal is to eliminate unnecessary rebuilds.

Build Graph After Evaluation

A change in C sources will cause the SHA of C to change. If we include SHAs of C and D in the data over which the SHA of B is computed, a change in the SHA of C will result in a change in the SHA of B. Similarly, A can keep track of its dependencies B and E, and a change in the SHA of C will propagate to the SHA of A. That's exactly what we want.
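
The propagation can be demonstrated with a small sketch. It assumes that each target's inputs and build settings have already been reduced to a single string; the DEPENDENCIES map, the input strings and the helper names are hypothetical and only mirror the graph from the figures.

```ruby
require "digest"

# Hypothetical graph matching the figures above:
# each target lists its direct dependencies.
DEPENDENCIES = {
  "A" => ["B", "E"],
  "B" => ["C", "D"],
  "C" => [],
  "D" => [],
  "E" => []
}.freeze

# Stand-ins for "input file contents plus build settings" per target.
INPUTS_BEFORE = {
  "A" => "a v1", "B" => "b v1", "C" => "c v1", "D" => "d v1", "E" => "e v1"
}.freeze

# Computes a state hash for `target` from its own inputs and
# the state hashes of its direct dependencies.
def state_sha(target, inputs, memo)
  memo[target] ||= begin
    dependency_shas = DEPENDENCIES.fetch(target).sort.map do |dependency|
      state_sha(dependency, inputs, memo)
    end
    Digest::SHA256.hexdigest([inputs.fetch(target), *dependency_shas].join("\n"))
  end
end

# Returns a hash of target name => state hash for the whole graph.
def all_shas(inputs)
  memo = {}
  DEPENDENCIES.each_key { |target| state_sha(target, inputs, memo) }
  memo
end

before = all_shas(INPUTS_BEFORE)
after = all_shas(INPUTS_BEFORE.merge("C" => "c v2"))

changed = DEPENDENCIES.keys.select { |target| before[target] != after[target] }
puts "going to rebuild: #{changed.join(', ')}"
```

Changing only the inputs of C changes the hashes of C, B and A, while D and E keep their hashes and can be served from the cache.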

Caching products

Now that we have a build graph and know the state of every dependency in that graph, we can understand which dependencies we can reuse and which we need to rebuild. For every dependency, we take the SHA representing its state and check if there is a product with the same SHA in our cache. If there is one - reuse it, otherwise rebuild the dependency.

When we run for the first time, no cache exists. We just run xcodebuild and add every dependency product to the cache. Xcode stores build products in the "derived data" directory. We copy those build products, including dSYMs and bcsymbolmap files, to the cache and zip them. The cache is a simple directory that has a subdirectory for every dependency that was ever cached. Subdirectories are named after the native targets (which are our dependencies, listed in the build graph). Inside every subdirectory we store the zip files mentioned above. Each zip file contains the build products of a single dependency in some state, so each zip file is named after that state, which we represent by the SHA hash.

Cache Dir Structure

Reusing products

When we need to check if there's a cached product for the current dependency state, we go to the dependency's subdirectory and search for the file whose name equals the SHA we obtained during the state calculation step. If such a file exists, we unzip it and place its contents into the build_cache subdirectory inside the derived data directory.
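
The lookup itself can be sketched as follows. The helper names, the target names and the short fake hashes are hypothetical, and the exact file naming (the SHA followed by a .zip extension) is an assumption made for this sketch; unzipping is left out.

```ruby
require "fileutils"
require "tmpdir"

# Path of the archive for `target` in state `sha`, following the
# layout described above: <storage>/<target name>/<sha>.zip.
# The ".zip" extension is an assumption made for this sketch.
def cached_archive_path(storage, target, sha)
  File.join(storage, target, "#{sha}.zip")
end

# Splits targets into those that have an archive for their current
# state and those that must be rebuilt. `shas` maps target => sha.
def partition_by_cache(storage, shas)
  shas.keys.partition do |target|
    File.exist?(cached_archive_path(storage, target, shas[target]))
  end
end

# Runs the lookup against a throwaway storage directory
# that holds an archive for Alamofire only.
def demo_lookup
  Dir.mktmpdir do |storage|
    shas = { "Alamofire" => "aaa111", "StaticDependency" => "bbb222" }
    FileUtils.mkdir_p(File.join(storage, "Alamofire"))
    FileUtils.touch(cached_archive_path(storage, "Alamofire", "aaa111"))
    partition_by_cache(storage, shas)
  end
end

reuse, rebuild = demo_lookup
puts "reuse: #{reuse.join(', ')}"
puts "going to rebuild: #{rebuild.join(', ')}"
```

Because the state is encoded in the file name, a cache hit is a plain file existence check, and several states of the same dependency can sit side by side in one subdirectory.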

Dealing with the compiler

Binaries themselves are only used during linking. Compiler doesn't understand binaries, but we can promise it that for everything that we use but don't implement, there's a proper implementation in some binary, so compiler can rest assured and let linker find those implementations. Such promises come in the form of header files for C family languages and swiftmodule files for Swift.

Frameworks contain header files and swiftmodules, so when we unzip cached framework, we're done - everything compiler needs is already there. For libraries, we need to go through some extra hassle. Starting with Xcode 9, you can create Swift static libraries, but they're rarely used in real projects. Most of the projects use static libraries written in C family languages.

Any target in Xcode may have one or more "Copy files" build phases, each of which has a list of inputs (files to copy) and a destination (the directory to copy the inputs to). For every target that has a library product, we need to check all "Copy files" phases, find header files among their inputs and copy these header files. We copy them to a subdirectory of the build_cache directory whose relative path equals the "Copy files" phase destination. Then, we add these subdirectories to "Header search paths" and compiler flags of targets that depend on the target whose header files we just copied.

Header Files
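
The header copying step might look roughly like this. It is a simplified sketch: the copy_headers helper, the file names and the destination path are hypothetical, and only files with the .h extension are treated as headers.

```ruby
require "fileutils"
require "tmpdir"

# Copies header files listed in a "Copy files" phase into
# <build_cache>/<phase destination> and returns that directory,
# which can then be added to "Header search paths".
# Returns nil when the phase has no header files among its inputs.
def copy_headers(phase_inputs, phase_destination, build_cache)
  headers = phase_inputs.select { |path| File.extname(path) == ".h" }
  return nil if headers.empty?

  search_path = File.join(build_cache, phase_destination)
  FileUtils.mkdir_p(search_path)
  FileUtils.cp(headers, search_path)
  search_path
end

# Creates a fake target with one header and one implementation file,
# copies its headers and lists what ended up in the search path.
def demo_copy_headers
  Dir.mktmpdir do |root|
    sources = File.join(root, "Sources")
    FileUtils.mkdir_p(sources)
    inputs = %w[Thing.h Thing.m].map { |name| File.join(sources, name) }
    FileUtils.touch(inputs)
    search_path = copy_headers(inputs, "include/StaticDependency", File.join(root, "build_cache"))
    Dir.children(search_path)
  end
end

puts demo_copy_headers.inspect
```

Only Thing.h is copied; the implementation file stays behind because the compiler needs just the declarations, while the definitions come from the cached binary at link time.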

Fooling xcodebuild

After unzipping products and copying necessary headers, are we ready to run the build? No. Why? By default, every target in our project is rebuilt during an archive build, and we need to stop xcodebuild from rebuilding targets that have cached products.

Dependencies in Xcode

Targets may be other targets' dependencies, and Xcode has two kinds of dependencies - explicit and implicit. Explicit dependencies are listed in "Target dependencies" build phase of each target. Implicit dependencies are the ones listed in "Link binary with libraries" build phase and linker flags. There's a "Find implicit dependencies" setting in target's scheme, which controls whether implicit dependencies should be taken into account during build or not. Implicit dependencies are useful when your project structure is complicated - if you have multiple projects as part of a workspace, Xcode won't allow you to add a target from one project as explicit dependency to a target in another project. But you can still add other target's product to "Link binary with libraries" phase and turn on "Find implicit dependencies" setting. Since version 10.2, Xcode is able to find implicit dependencies not only among products you’ve added to "Link binary with libraries", but also among linker flags.

Avoiding rebuilds

When we deal with explicit dependencies, simply removing targets that have a cached product from "Target dependencies" build phases is required, but insufficient. Products of these dependencies are binaries, and binaries are here to be linked. Linking is done either through "Link binary with libraries" phase or through linker flags. Removing targets from "Target dependencies" turns them into implicit dependencies. If the build scheme has implicit dependency search turned on, xcodebuild will still count targets with cached product as implicit dependencies, which leads to rebuilds.

What else can we do?

Delete every mention of targets with cached products everywhere - not only in "Target dependencies", but in "Link binary with libraries" and linker flags too, to avoid these targets being found during implicit dependencies search. There's a downside - cached binaries will not be linked to anything, and that's going to upset the linker and break our build.
Delete targets with cached products from "Target dependencies" only and turn off implicit dependencies search, but that's almost certainly going to break the build too - implicit dependencies search is a global setting for the whole app, and if it's on, it should've been turned on for a reason. We'd like to make our cache work for as many projects as possible, and in general, we can't be sure that turning off implicit dependencies search will work.

Things get even more complicated when we deal with static libraries. Static linking is all about combining binaries in a single file, so if a static library links another static library, xcodebuild uses libtool to merge their contents. libtool is just a program like clang and swiftc - it accepts paths of static libraries which it needs to merge. xcodebuild passes static libraries listed in "Link binary with libraries" phase to libtool. If we delete some static library from "Link binary with libraries" inputs, it will be excluded from libtool params - welcome to the world of "Symbol not found" errors.

Looks like we cannot delete linker flags and we can't touch "Link binary with libraries" phase. Also we cannot turn off implicit dependencies search. Still, we want to reuse products that we cached - we don't want to let Xcode find targets which built those products, and rebuild them from scratch. We can delete targets themselves, and no targets means no rebuilds - simple as that. Don't forget about deleting those targets from "Target dependencies" phases to keep project structure consistent.
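
To make the idea concrete, here is a sketch that works on a plain in-memory model instead of a real Xcode project file. The PROJECT structure and the strip_cached_targets helper are hypothetical; an actual implementation has to edit the project files themselves.

```ruby
# Hypothetical in-memory project model: each target lists the names
# in its "Target dependencies" phase and the products it links.
PROJECT = {
  "App" => { target_dependencies: ["A"], linked_products: ["A.framework"] },
  "A"   => { target_dependencies: ["B"], linked_products: ["libB.a"] },
  "B"   => { target_dependencies: [],    linked_products: [] }
}.freeze

# Returns a copy of `project` without the cached targets. Their names
# are also removed from "Target dependencies" of the remaining targets,
# while linked products are left untouched so linking keeps working.
def strip_cached_targets(project, cached)
  project.reject { |name, _| cached.include?(name) }.transform_values do |target|
    {
      target_dependencies: target[:target_dependencies] - cached,
      linked_products: target[:linked_products].dup
    }
  end
end

stripped = strip_cached_targets(PROJECT, ["B"])
stripped.each do |name, target|
  puts "#{name}: depends on #{target[:target_dependencies].inspect}, links #{target[:linked_products].inspect}"
end
```

After stripping, target B is gone and A no longer depends on it, but A still links libB.a, which now has to come from the unzipped cache.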

Updating the cache

Now that we have every up-to-date product unzipped and set up, it's time to run xcodebuild to build products for targets that have no up-to-date product in the cache. Binaries which we unzipped are reused, and freshly built products are then stored in the cache following the process we described earlier, so we can reuse them in later builds.

Wrapping things up

Up until now, we only dealt with dependency targets - not the app itself. We need to pass cached products to the app's target, adding necessary search paths and removing remaining dependency targets. Dynamic framework dependencies require special treatment here - unlike static dependencies, which get their binary code embedded during linking, dynamic frameworks need to be explicitly copied to the app bundle.

Pods use shell script to embed dynamic frameworks and we need to fix file paths in that script. For every dynamic framework we built ourselves, we need to add an entry to "Embed frameworks" build phase of the app's target.

After all that work, we can build the app the way we prefer - using Xcode, xcodebuild or Fastlane (which uses xcodebuild under the hood).

Example

We implemented the ideas described above in a tool named XcodeArchiveCache. Let's take a look at what we ended up with. Our tool is wrapped into a Ruby gem - just like Pods and Fastlane. Many developers use either of those, so chances are you already know where to start, but if you're unfamiliar with the gem ecosystem, you can install XcodeArchiveCache with the gem install xcode-archive-cache command.

Warning

XcodeArchiveCache is still in alpha stage, so bugs may arise. Some diagnostics may seem cryptic, and some errors may result in stack traces. It's not feasible to cover every possible project configuration, so if XcodeArchiveCache doesn't work for you, create an issue in our repo. Also, there are other things to consider if you're going to try XcodeArchiveCache in your own project:

To accomplish what it's intended to do, XcodeArchiveCache changes contents of Xcode projects. Commit every change you have made before running XcodeArchiveCache.
Main intent of XcodeArchiveCache is to speed up CI archive builds - it's not going to fit in build-run-debug workflow.
It only runs on macOS because it requires Xcode.

Trying it out

Clone sample project

We made a project you can play with to check XcodeArchiveCache in action. Clone it from here. Archive builds require signing, so you'll need to specify the team to sign Test target. Commit that change locally because we're going to use git reset --hard numerous times to test caching.

Build it

Simply run:

pod install && time xcodebuild -workspace Test.xcworkspace -configuration Release -destination generic/platform=ios -scheme Test -derivedDataPath build SOME_FLAG=1 -UseModernBuildSystem=NO -archivePath build/test.xcarchive archive | xcpretty

You'll see how long it takes to make an app archive - on my MBP it took around 25 seconds.

Cachefile

XcodeArchiveCache has a simple DSL to describe what to put in the cache. That configuration is stored in a file named Cachefile. cat Cachefile will show you the configuration that our sample project uses:

workspace "Test" do
  configuration "release" do
    build_configuration "Release"
    xcodebuild_args "SOME_FLAG='1' -UseModernBuildSystem=NO"
  end

  derived_data_path "build"

  target "Test" do
    cache "Pods_Test.framework"
    cache "libStaticDependency.a"
  end
end

First, we need to tell the tool which workspace or project it should operate upon - that's done in either workspace "<workspace name>" or project "<project name>" part. Inside that main block we describe what we need to cache and the way to build cached products.

configuration parts are about the way we invoke xcodebuild to build cached products. You can have as many of those as you want, and specify the one to use with --configuration flag during XcodeArchiveCache invocation.

build_configuration tells XcodeArchiveCache which build configuration should be used. By default, Xcode generates Debug and Release.
xcodebuild_args are passed to xcodebuild - note that these are the same flags we passed to xcodebuild in "Build it" part.

derived_data_path is the path where xcodebuild should store its derived data during dependency builds.
target part defines which dependencies should be cached - Test is our main app's target, and it links Pods_Test.framework and libStaticDependency.a. They, along with their direct and transitive dependencies, are going to be cached.

Build using cache

Run:

git reset --hard && git clean -fdx && pod install && time xcode-archive-cache inject --configuration=release --storage="$HOME/build_cache"

Since it's the first time we run XcodeArchiveCache, our cache directory is empty, so XcodeArchiveCache is going to build every dependency and put products into cache. That took almost 20 seconds on my machine. Run git diff - some targets vanished from project files, and those are the targets that were parts of build graphs for Pods_Test.framework and libStaticDependency.a. We replaced these targets with cached build products.

Let's check how cache affects app build time:

time xcodebuild -workspace Test.xcworkspace -configuration Release -destination generic/platform=ios -scheme Test -derivedDataPath build SOME_FLAG=1 -UseModernBuildSystem=NO -archivePath build/test.xcarchive archive | xcpretty

This time, build duration on my MBP was 13 seconds. Add 20 seconds that XcodeArchiveCache took to run - it's 33 seconds in total, and it's definitely worse than 25 seconds we've dealt with before. Hold on, our cache is here to be reused.

Rebuild using cache

Run the same two commands once again. This time, cache directory contains some zipped build products, and XcodeArchiveCache is going to rely on them. I got 7 seconds for XcodeArchiveCache run and 11 seconds for xcodebuild. This means that we were able to go from 25 to 18 seconds of build time using our cache.

Does it really work?

We've built our sample app using the cache, but does it really work? Since the app was archived, it's not going to run in a simulator - archive builds only produce ARM binaries. Still, we can install the app on a real device and check if it actually runs as intended.

We need to create an ipa:

cd build/test.xcarchive/Products/Applications && mkdir Payload && mv Test.app Payload/Test.app && zip -r Test.ipa Payload && cp Test.ipa ~/Desktop && cd -

We need to install that ipa to a device: go to Xcode - Window - Devices and Simulators, select the device which you want to install the app to in the left pane, press "plus" button at the bottom, below "Installed Apps" table, and select Test.ipa that's on your Desktop.

Install On Device

Finally, we can launch the app. Don't blame me for the ugly interface - it's just a test app. Contents of the UILabel on top of the screen come from StaticDependency, which in turn takes these strings from its own dependencies - you can check it by diving into the call stack which starts in viewDidLoad method of ViewController. Tap the "Tap me" button - what does it say?

Rebuilding with changes

What happens if we change state of one of our app's dependencies? Let's reset our sample project to its initial state:

git reset --hard && git clean -fdx && pod install

Open Test.xcworkspace in Xcode, then open file named FrameworkThing.m and change the string @"I'm a framework dependency" to @"I'm a CHANGED framework dependency". Now, it's time to build the app:

time xcode-archive-cache inject --configuration=release --storage="$HOME/build_cache"

It took my MBP 11 seconds to finish this time. Not 7 seconds - the change we introduced caused rebuild of some dependencies. In command's output you can find the following lines right before xcodebuild invocation:

going to rebuild:
StaticDependency, LibraryWithFrameworkDependency, FrameworkDependency

FrameworkThing.m is a source file from FrameworkDependency target. Our change affected SHA of FrameworkDependency. SHA change propagated to LibraryWithFrameworkDependency, and, finally, to StaticDependency. We pass $HOME/build_cache as storage path to XcodeArchiveCache, so you can check the contents of FrameworkDependency, LibraryWithFrameworkDependency and StaticDependency subdirectories of that directory. All of them contain two versions of the build products.

Let's check if this change makes its way to the app bundle:

time xcodebuild -workspace Test.xcworkspace -configuration Release -destination generic/platform=ios -scheme Test -derivedDataPath build SOME_FLAG=1 -UseModernBuildSystem=NO -archivePath build/test.xcarchive archive | xcpretty

Same 11 seconds, 22 seconds in total. Slightly better than initial 25 seconds.
Package the ipa and install it to a device following the steps from "Does it really work?" part. Can you see the difference? Don't forget to tap "Tap me" button, it's there for a reason.

Playground

Try changing pod versions, build settings and file contents in our sample project. Remove that SOME_FLAG=1 part from Cachefile and xcodebuild flags. Turn off -UseModernBuildSystem=NO. Don't forget to reset the project to its initial state every time! (By the way, it's a good idea to forget to do it once and watch what happens.)

git reset --hard && git clean -fdx && pod install

If you find something that looks like a bug or a possible improvement, or XcodeArchiveCache crashes - submit an issue in our repo. Every piece of software has bugs, and to fix them we need to find them first.

Final notes

Configuring and using XcodeArchiveCache

XcodeArchiveCache is open source software available under the MIT license. Project sources are located in this repository.
XcodeArchiveCache only works for binaries produced by native targets. If you think that other types of products, e.g. arbitrary files generated by aggregate targets, should be supported - let us know.
Dependency search is based on "Target dependencies" and "Link binary with libraries" inputs. If you want to cache some binary which is produced by a native target but isn't listed in either of these phases - move it from linker flags to "Link binary with libraries". That should work in most cases. If you have a project setup that doesn't allow such a change - let us know.
Cache efficiency is directly related to the number of cache hits - the more times a cached product can be reused, the better. If some of your app's dependencies change constantly, it's better to let them be rebuilt every time. Excluding particular targets from cache is not supported, but if you need that feature - send us a request.
Projects that are split up into multiple modules are likely to benefit from XcodeArchiveCache usage. Caching modules looks like a good approach.
Remote cache is planned but not implemented - let us know if you need it. We use a single Mac mini to run Sweatcoin iOS app builds, so remote cache is not a "must have" feature for us.
If you expect some dependency to be rebuilt, but XcodeArchiveCache logs don't mention it in the "going to rebuild" list - something is wrong with state checks. Open an issue describing the situation.
Since XcodeArchiveCache is alpha software, we suggest not using it for App Store builds until 1.0 comes out. We use XcodeArchiveCache for pull request builds and in-house test builds, which are the majority of builds we make.
We delete and recreate the build cache with every nightly build.

Submitting issues

One of our project's main goals is to support as many project configurations as possible. I'm sure some pretty wild project setups exist in the world, and it's hard to predict their specifics. Try to use XcodeArchiveCache in your project - if it doesn't work, submit an issue.

Example projects which can be used to reproduce issues are essential - it's always easier to fix something you can reproduce locally.
Searching for existing similar issues is never a waste of time, especially when you're submitting a feature request.
Reading docs before you start is a good idea.

Contributing to XcodeArchiveCache

If you'd like to implement a feature, fix a bug or improve existing code, it's better to open an issue first. This way we can minimize duplicate work and wasted time.

Getting in touch

If you'd like to discuss the tool and ideas behind it, ask about obscure details or get help with XcodeArchiveCache - drop me a line, I'm @ilushkanama on Twitter.