CipherKit reproducible builds



We have recently been working to make our build process support “reproducible builds”, also known as “deterministic builds”. What does reproducible mean here? It means that you can run a script and end up with exactly the same binary file as our official release, be it an APK, JAR, AAR, or anything else. That lets anyone verify that our releases are produced only from the source in git, with nothing else included, whether deliberately or accidentally (malware, for example).

Our core CipherKit libraries are the most sensitive pieces, so that is where we have started. We generally work on Debian and Ubuntu and recommend that platform, but we recognize that OSX is also a popular platform for Android developers. So this process will work on OSX too, using your favorite package manager (e.g. Fink, MacPorts, or Homebrew).

Then you will end up with IOCipher-v0.3.zip, which includes the .jar and .so files. If your setup is close enough to our release build setup, the contents of that ZIP file will be the same as the official release. Right now, it is difficult to get the exact same binary file (i.e. the same sha256 sum) because of the timestamps in the .zip and variations caused by using different versions of Java, the Android SDK, and the NDK. To check the contents of your build against the official release:

sudo apt-get install faketime unzip wget meld
cd /tmp
wget https://guardianproject.info/releases/IOCipher-v0.3.zip
wget https://guardianproject.info/releases/IOCipher-v0.3.zip.sig
gpg --verify IOCipher-v0.3.zip.sig
git clone https://github.com/guardianproject/IOCipher
cd IOCipher
git checkout v0.3
./make-release-build
./compare-to-official-release IOCipher-v0.3.zip /tmp/IOCipher-v0.3.zip

What is happening here?

meld (FileMerge on OSX) will show a listing of all the files in both builds and mark which ones are different. When the build reproduces, the contents of the .class files and .so files all match, but there will be inevitable differences in some of the metadata. The Java .class files are usually reproducible even when using different versions of Java and the Android SDK. Native builds are much more sensitive to changes in the toolchain: they are almost never reproducible if the NDK version is at all different, and sometimes even the host platform where the NDK is running (e.g. Ubuntu vs OSX, or 64-bit vs 32-bit) will cause differences in the final binaries.

The NDK version and build platform are listed in the manifest.

The Java .class files are exactly the same, but the native .so files are not.
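The compare-to-official-release script in the repository is the real tool for this comparison. As a rough sketch of the underlying idea only, the functions below compare two already-unpacked builds by hashing every file, so ZIP timestamps and entry order play no part. The names hash_tree and same_contents are made up for this illustration, and the sketch assumes GNU sha256sum is available (on OSX, shasum -a 256 would be the closest equivalent).

```shell
# Print "<sha256>  <relative path>" for every regular file under directory $1,
# sorted by path. Only file contents and names are considered, not timestamps.
hash_tree() {
    ( cd "$1" && find . -type f -exec sha256sum {} + | sort -k 2 )
}

# Succeed only if both directory trees hold the same files with the same contents.
same_contents() {
    [ "$(hash_tree "$1")" = "$(hash_tree "$2")" ]
}

# Hypothetical usage, after unpacking both ZIPs into separate directories:
#   unzip -q IOCipher-v0.3.zip -d mine
#   unzip -q /tmp/IOCipher-v0.3.zip -d official
#   same_contents mine official && echo "contents match"
```

Unlike meld, this gives only a yes/no answer; running diff on the output of hash_tree for each directory would show which files differ.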

faketime

Timestamps are a very common issue when trying to reproduce a build. The release build process uses faketime to provide consistent timestamps, which are picked from the git commit. faketime freezes the clock entirely for native builds, so any timestamps from that process will always be exactly the same. Unfortunately, some parts of the ant Java build rely on the clock moving forward, so freezing the clock makes the build hang forever. Instead, for the Java build, faketime sets the clock using the time from the git commit, then moves time forward at 5% of the normal speed. That makes it much more likely that the timestamps will be the same, but usually what seems to happen is that the timestamps are 2 seconds off, which is the time resolution of the ZIP format.

A better solution is needed here for JARs, since they are easiest to verify using a sha256 sum. JAR signatures mostly seem not worth the pain they introduce. APK signatures do not sign the whole APK, only the contents, so the varying timestamps do not matter when verifying using an APK signature.

Another example of a difference: if comparing a debug build to a release build, then BuildConfig.class will be different because it records the debug flag. The sort order of the metadata in the JAR's MANIFEST.MF might also be different.
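To make the slowed-clock idea concrete, here is a minimal sketch. It is not taken from the make-release-build script. It assumes the faketime wrapper's -f option, which accepts libfaketime's advanced timestamp format: a "@" start-at time plus an "x" speed modifier. The function names and the example ant target are hypothetical.

```shell
# Build a libfaketime spec that starts the clock at the given
# "YYYY-MM-DD hh:mm:ss" time and lets it advance at a fraction of real
# speed (default 0.05, i.e. 5%).
slow_clock_spec() {
    printf '@%s x%s\n' "$1" "${2:-0.05}"
}

# Run a command with the clock started at the committer time of HEAD.
# Note: this drops the timezone offset from the commit date, so faketime
# will interpret the time in the builder's local timezone.
run_with_commit_clock() {
    stamp=$(git log -1 --format=%ci | cut -d' ' -f1,2)
    faketime -f "$(slow_clock_spec "$stamp")" "$@"
}

# Hypothetical usage inside a git checkout with faketime installed:
#   run_with_commit_clock ant release
```

Because the clock still advances, two builds of different duration can land on either side of a 2-second ZIP timestamp boundary, which matches the off-by-2-seconds behavior described above.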

The end goal

Reproducing builds is an arcane process, for sure, but it is a means to an end. The goal is to get to the point where well-known binaries, published in places like MavenCentral or jCenter, can easily be verified by anyone who cares to try. People could even set up servers that automatically try to reproduce any JAR used in a project.

Then people can verify those JARs in a fully decentralized manner, and publish certifications in their preferred format (GPG signatures, SHA256 sums for gradle-witness, etc). Then we can feel safe getting the release from anywhere on the internet, no matter the level of security or malware infestation.
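As a small illustration of what checking a published SHA256 sum involves, the sketch below compares a downloaded file against an expected digest. The function name verify_sha256 is hypothetical, the JAR name and digest in the usage comment are placeholders, and GNU sha256sum is assumed; tools like gradle-witness automate the same kind of check inside the build.

```shell
# Succeed only if the file $1 has the SHA-256 digest $2 (lowercase hex).
verify_sha256() {
    actual=$(sha256sum "$1" | cut -d' ' -f1)
    [ "$actual" = "$2" ]
}

# Hypothetical usage with a digest published by someone who reproduced the build:
#   verify_sha256 some-library.jar "<published sha256>" || echo "MISMATCH"
```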

Towards that goal, we have been getting our libraries all nicely packaged up and submitted to jCenter (the default gradle repository for Android). Here are the relevant bits to include in your build.gradle:

compile 'info.guardianproject.cacheword:cachewordlib:0.1'
compile 'info.guardianproject.iocipher:IOCipher:0.3'
compile 'info.guardianproject.netcipher:netcipher:1.2'
compile 'info.guardianproject.trustedintents:trustedintents:0.0'

compile 'net.freehaven.tor.control:jtorctl:0.2'

SQLCipher-for-Android is coming soon:
https://github.com/sqlcipher/android-database-sqlcipher/pull/197
I hope to get them up on MavenCentral as well, since it is also quite common on Android and is a community-run resource, whereas jCenter is run by Bintray, a purely for-profit company.