@minif
Last active April 3, 2025 04:31

iOS App Hashing

An important aspect of video game preservation is collecting hashes of ROM files or disc dumps. A hash provides a "unique fingerprint" of the game that can be used to identify it. Databases such as no-intro.org and redump.org exist to collect and verify these hashes. I am interested in iOS app preservation, so I have decided to write down what I know and what I think would be needed to achieve something similar for iOS.

Hashing what?

ROM files and disc dumps can easily be hashed, because cartridges and discs of the same release generally contain byte-for-byte identical copies of the data. However, iOS apps are distributed digitally and can be downloaded from different places (App Store, iTunes, 3rd party tools, Apple Configurator, etc). An iOS app is a folder with the .app extension that contains multiple files, which makes it tricky to hash. Apps are packaged and distributed as .ipa files, each of which is a single file that can be hashed. Given the prevalence of the .ipa file, it is the most likely candidate for what to take a hash of. It is important to note that .ipa files are zip files, so the bytes of the file, and therefore its hash, can change depending on how it was compressed.
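To make the last point concrete, here is a minimal Python sketch using only the standard library. The file names and contents are made up for illustration. It packs the same files three times, varying only the compression method or the stored timestamp, and shows that the archive hash changes while the extracted contents do not.

```python
import hashlib
import io
import zipfile


def build_archive(files, compression, date_time):
    """Pack `files` (name -> bytes) into an in-memory zip archive."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name in sorted(files):
            info = zipfile.ZipInfo(name, date_time=date_time)
            info.compress_type = compression
            zf.writestr(info, files[name])
    return buf.getvalue()


def read_contents(archive):
    """Return name -> decompressed bytes for every entry in the archive."""
    with zipfile.ZipFile(io.BytesIO(archive)) as zf:
        return {name: zf.read(name) for name in zf.namelist()}


def sha256_hex(data):
    return hashlib.sha256(data).hexdigest()


# Hypothetical app contents, for illustration only.
files = {
    "Payload/Example.app/Info.plist": b"<plist></plist>",
    "Payload/Example.app/Example": b"\xcf\xfa\xed\xfe" + bytes(256),
}

deflated = build_archive(files, zipfile.ZIP_DEFLATED, (2020, 1, 1, 0, 0, 0))
stored = build_archive(files, zipfile.ZIP_STORED, (2020, 1, 1, 0, 0, 0))
later = build_archive(files, zipfile.ZIP_DEFLATED, (2021, 6, 1, 12, 0, 0))

# Same contents in all three archives, but three different archive hashes.
for label, archive in (("deflated", deflated), ("stored", stored), ("later", later)):
    print(label, sha256_hex(archive), read_contents(archive) == files)
```

The takeaway is that a hash of the .ipa identifies one particular packaging of the app, not the app contents themselves.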

Decrypted iOS apps

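Encryption is a common tactic used in modern video games, and iOS implements it as well, although only for the binary file. Whether a binary is encrypted can be determined by inspecting the Mach-O load commands for LC_ENCRYPTION_INFO (LC_ENCRYPTION_INFO_64 in 64-bit binaries): a non-zero cryptid field means the binary is still encrypted.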
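As a rough illustration of that check, the sketch below walks the load commands of a Mach-O file and reports the cryptid of each arch. It is a simplified reader, not a full parser: it assumes little-endian slices (as iOS binaries are) and a 32-bit fat header, and the function names are my own.

```python
import struct

FAT_MAGIC = 0xCAFEBABE        # fat header, stored big-endian
MH_MAGIC = 0xFEEDFACE         # 32-bit Mach-O
MH_MAGIC_64 = 0xFEEDFACF      # 64-bit Mach-O
LC_ENCRYPTION_INFO = 0x21
LC_ENCRYPTION_INFO_64 = 0x2C


def _slice_status(data, base):
    """Return (cputype, cryptid) for the little-endian Mach-O slice at `base`.

    cryptid is None when the slice has no encryption load command.
    """
    magic, cputype = struct.unpack_from("<II", data, base)
    if magic not in (MH_MAGIC, MH_MAGIC_64):
        raise ValueError("not a little-endian Mach-O slice")
    ncmds = struct.unpack_from("<I", data, base + 16)[0]
    # mach_header is 28 bytes; mach_header_64 has one extra reserved field.
    offset = base + (32 if magic == MH_MAGIC_64 else 28)
    for _ in range(ncmds):
        cmd, cmdsize = struct.unpack_from("<II", data, offset)
        if cmdsize < 8:
            raise ValueError("malformed load command")
        if cmd in (LC_ENCRYPTION_INFO, LC_ENCRYPTION_INFO_64):
            # Fields after cmd/cmdsize: cryptoff, cryptsize, cryptid.
            cryptid = struct.unpack_from("<I", data, offset + 16)[0]
            return cputype, cryptid
        offset += cmdsize
    return cputype, None


def encryption_status(data):
    """Return a list of (cputype, cryptid) pairs, one per slice."""
    if struct.unpack_from(">I", data, 0)[0] == FAT_MAGIC:
        nfat_arch = struct.unpack_from(">I", data, 4)[0]
        result = []
        for i in range(nfat_arch):
            # fat_arch: cputype, cpusubtype, offset, size, align (big-endian)
            slice_offset = struct.unpack_from(">I", data, 8 + 20 * i + 8)[0]
            result.append(_slice_status(data, slice_offset))
        return result
    return [_slice_status(data, 0)]
```

Calling `encryption_status(open(path, "rb").read())` on an app binary would return one pair per arch. A cryptid of 0 (or None) suggests the slice is not encrypted.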

Decryption, combined with the fact that .ipa files are zip files, makes hashing decrypted .ipa files very difficult:

  • The Mach-O binary may differ between decryption programs (some may omit unneeded parts).
  • Decryption often only includes the arches compatible with the device doing the decrypting (a limitation of the decryption process), so arches can be missing. I'll call this "decryption thinning".
  • Decryption programs add, modify, and remove files such as Info.plist, signature files, etc.
  • File metadata, particularly timestamps, will affect the hash of the .ipa file.
  • Possibly more differences I can't think of

Unless all of these differences can be documented and accounted for, a single hash of a decrypted .ipa file is not feasible as a consistent identifier. On iOS there is no easy way to circumvent encryption, so apps are commonly distributed decrypted, and encrypted copies are not kept. Therefore, hashing decrypted apps is still important to consider.

Encrypted iOS apps

iOS apps are always distributed by Apple encrypted. However, there can still be differences between encrypted .ipa files:

  • App thinning, introduced with iOS 9, strips arches not used by the device the app will run on. Apps downloaded using iTunes and 3rd party programs contain FAT binaries.
  • Apple ID specific information is stored in a .sinf file in the SC_Info folder to serve as DRM. All official and common third-party download tools add it after the app has been downloaded.
  • iTunesMetadata.plist is added, which differs for a variety of reasons.

Unlike with decrypted iOS apps, these differences can be accounted for more easily.
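One way to start accounting for them is to separate the entries that vary per account from everything else. The sketch below does this by name only, assuming the two kinds of files listed above are the only account-specific ones. That is an assumption for illustration; other files may need to be added to the rule.

```python
import zipfile
from pathlib import PurePosixPath


def is_account_specific(name):
    """True for entries the text above describes as added per download."""
    path = PurePosixPath(name)
    if path.name == "iTunesMetadata.plist":
        return True
    return path.suffix == ".sinf" and "SC_Info" in path.parts


def split_entries(ipa):
    """Split the entry names of an .ipa (path or file object) into two sorted lists."""
    with zipfile.ZipFile(ipa) as zf:
        names = [n for n in zf.namelist() if not n.endswith("/")]
    account = sorted(n for n in names if is_account_specific(n))
    shared = sorted(n for n in names if not is_account_specific(n))
    return account, shared
```

`split_entries("Example.ipa")` would return the account-specific names first and the remaining names second, so the two groups can be compared or hashed separately.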

The iTunes API

When learning how the iTunes API works, I found that the API responds with a URL to the app as distributed by the CDN, along with the data for the .sinf and iTunesMetadata.plist files. Given that apps downloaded using iTunes are always FAT, every way I can think of that an encrypted .ipa file could differ is accounted for when downloading directly from the CDN. Therefore, I believe this is the most likely way an iOS app can be hashed consistently.

The URL provided in the response is something like https://iosapps.itunes.apple.com/itunes-assets/Purple115/v4/60/48/d4/6048d45e-89e7-fdda-cc26-f6aac656aca9/pre-thinned7321308915272980610.lc.4959584538784029.55QRSRLFALWWS.signed.dpkg.ipa?accessKey=[big long key]. It appears that the accessKey expires after a period of time. The mention of pre-thinned is interesting because apps are supposed to be FAT, and no information about the architecture of the system was provided. Testing will need to be done to make sure the API downloads .ipa files with consistent hashes.
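For that kind of testing, two small helpers might be useful: one separates the expiring accessKey from the rest of the URL so the same asset can be recognised across downloads, and one hashes a download stream in chunks. This is only a sketch. Whether the path part of the URL stays stable over time is an assumption that would itself need testing, and the function names are hypothetical.

```python
import hashlib
from urllib.parse import parse_qs, urlsplit, urlunsplit


def split_cdn_url(url):
    """Return (url_without_query, access_key) for a CDN download URL."""
    parts = urlsplit(url)
    stable = urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))
    access_key = parse_qs(parts.query).get("accessKey", [None])[0]
    return stable, access_key


def sha256_stream(fileobj, chunk_size=1 << 20):
    """Hash a file-like object in chunks so a large .ipa need not fit in memory."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: fileobj.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()
```

`sha256_stream` accepts any object with a `read` method, such as an open file or the response returned by `urllib.request.urlopen`. Downloading the same asset twice and comparing the two digests would be a first consistency check.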

It may also be possible to reproduce this format of .ipa file from an iTunes .ipa file by accounting for all the modifications iTunes performs and by experimenting to find a way to reproduce the compression Apple applies when packaging .ipa files for the CDN.

Different ways to determine hashes

As mentioned before, iOS apps are commonly distributed decrypted, so other methods should be considered for hashing the same app consistently. One possible idea is to hash all files in the app individually. Decryption only modifies the Mach-O binary, so resources should remain the same. Any modification to resources, including Info.plist, would mark the entire dump as bad. The decrypted Mach-O can be sliced into its individual arches to produce consistent per-arch hashes. This accounts for both app thinning and decryption thinning.
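A minimal sketch of this idea follows. It hashes the decompressed bytes of every file under Payload/, so zip compression and timestamps no longer matter, and it hashes the main binary one arch at a time. The function names are hypothetical, and the sketch makes several assumptions: the caller supplies the entry name of the main binary, only that one binary is sliced, slices are little-endian, and the fat header is the 32-bit variant.

```python
import hashlib
import struct
import zipfile

FAT_MAGIC = 0xCAFEBABE        # fat header, stored big-endian
MH_MAGIC = 0xFEEDFACE         # 32-bit Mach-O
MH_MAGIC_64 = 0xFEEDFACF      # 64-bit Mach-O


def _slice_key(slice_bytes):
    """Identify a slice by the (cputype, cpusubtype) in its own Mach-O header."""
    magic, cputype, cpusubtype = struct.unpack_from("<III", slice_bytes, 0)
    if magic not in (MH_MAGIC, MH_MAGIC_64):
        raise ValueError("not a little-endian Mach-O slice")
    return cputype, cpusubtype


def slice_hashes(binary):
    """Return {(cputype, cpusubtype): sha256} for each arch in a thin or fat binary."""
    if struct.unpack_from(">I", binary, 0)[0] == FAT_MAGIC:
        nfat_arch = struct.unpack_from(">I", binary, 4)[0]
        slices = []
        for i in range(nfat_arch):
            # fat_arch: cputype, cpusubtype, offset, size, align (big-endian)
            _, _, offset, size, _ = struct.unpack_from(">5I", binary, 8 + 20 * i)
            slices.append(binary[offset:offset + size])
    else:
        slices = [binary]
    return {_slice_key(s): hashlib.sha256(s).hexdigest() for s in slices}


def payload_manifest(ipa, binary_name):
    """Hash each file under Payload/ in an .ipa (path or file object).

    `binary_name` is the zip entry name of the main Mach-O binary; it is
    hashed per arch, every other file is hashed whole.
    """
    manifest = {}
    with zipfile.ZipFile(ipa) as zf:
        for info in zf.infolist():
            if info.is_dir() or not info.filename.startswith("Payload/"):
                continue
            data = zf.read(info)
            if info.filename == binary_name:
                manifest[info.filename] = slice_hashes(data)
            else:
                manifest[info.filename] = hashlib.sha256(data).hexdigest()
    return manifest
```

With this layout, a thinned dump and a FAT dump of the same app should agree on every resource hash and on the hash of each arch they have in common, provided the decryption tool left the slice bytes otherwise identical.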

The main issue with this approach is that tracking a hash for each file requires many hashes to be stored, especially for apps with lots of files. There is also a high likelihood that some aspects of the Mach-O are not preserved by decryption. Despite these issues, I believe it is still important to consider.

If the compression used for the iTunes CDN can be reproduced, it may also be possible to accept decrypted .ipa files under strict standards. The decrypted Mach-O will need to have its metadata copied from the encrypted binary so that the hash does not depend on when the app was decrypted. .ipa files not following this approach will be considered bad. This approach has the downside of being widely inaccessible, as very specific iOS devices on specific iOS versions need to be used to avoid decryption thinning.
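The metadata-copying step could look roughly like the sketch below, which rewrites a decrypted .ipa so that every entry takes its order, timestamp, attributes, and compression method from the matching entry in a reference (encrypted) .ipa. It is an illustration under loose assumptions: it requires both archives to contain exactly the same entry names, and it uses Python's own deflate implementation, so it does not reproduce whatever compression Apple uses for the CDN.

```python
import io
import zipfile


def repack_like_reference(decrypted, reference):
    """Return the bytes of a new zip holding the data from `decrypted`
    with per-entry metadata copied from `reference`.

    Both arguments may be paths or file objects.
    """
    out = io.BytesIO()
    with zipfile.ZipFile(decrypted) as src, \
            zipfile.ZipFile(reference) as ref, \
            zipfile.ZipFile(out, "w") as dst:
        missing = set(ref.namelist()) ^ set(src.namelist())
        if missing:
            # Strict by design: any added or removed file rejects the dump.
            raise ValueError("entry names differ: %s" % sorted(missing))
        for ref_info in ref.infolist():  # keep the reference entry order
            info = zipfile.ZipInfo(ref_info.filename, date_time=ref_info.date_time)
            info.external_attr = ref_info.external_attr
            info.create_system = ref_info.create_system
            info.compress_type = ref_info.compress_type
            dst.writestr(info, src.read(ref_info.filename))
    return out.getvalue()
```

Two decrypted dumps with identical file contents should come out byte-identical after this step when repacked on the same Python and zlib versions, regardless of when or how each was originally zipped.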

Conclusion

As of now, these are my thoughts. I am hoping someone interested in creating a no-intro style database for iOS apps finds this information of use.
