Extracting audio from CAF files without re-encoding

Yesterday, I found myself in possession of some Core Audio Format (CAF) files. CAF is just a container format; it can hold audio data in different encodings, such as linear PCM or AAC. In this case, the files were holding music encoded with AAC.

QuickTime Player can play CAF files without any trouble, but I wanted to add the music to my iTunes library, and iTunes 11 can’t play them natively. I needed a way to convert them into a format iTunes can handle, like vanilla AAC in an MPEG-4 Audio container (.m4a). Since AAC is a lossy codec, re-encoding would cost some quality, so I wanted to move the audio out of the CAF files untouched.

It turns out that this is really easy with the OS X command-line tool afconvert. One would usually use this to create CAF files, but it can unpack them as well with an argument of -d 0, which its help text describes like this:

{ -d | --data } data_format[@sample_rate][/format_flags][#frames_per_packet]
    …
    A format of "0" specifies the same format as the source file,
        with packets copied exactly.

So all you have to do is pick an output file format that can hold the kind of audio data in the CAF file, pass it with -f, and drop a -d 0 in there so the packets are copied as they are. For my AAC example, that looks like this:

$ afconvert -v -f m4af -d 0 blah.caf
…
Output file: blah.m4a, 7938113 frames
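
If there is a whole folder of these files, the same command can be wrapped in a loop. This is a minimal sketch rather than part of what I actually ran: the function name caf_to_m4a_all is made up for illustration, and it assumes every .caf file in the directory holds AAC data (afconvert should simply fail on the ones that don't).

```shell
# Repackage every .caf file in a directory as .m4a without re-encoding.
# Hypothetical helper; assumes each file holds AAC data.
caf_to_m4a_all() {
    dir="${1:-.}"
    status=0
    for src in "$dir"/*.caf; do
        # With no matches the glob stays literal, so skip paths that don't exist.
        [ -e "$src" ] || continue
        out="${src%.caf}.m4a"
        if [ -e "$out" ]; then
            echo "skipping $src: $out already exists" >&2
            continue
        fi
        # -f m4af: MPEG-4 Audio file; -d 0: same data format, packets copied exactly.
        afconvert -f m4af -d 0 "$src" "$out" || status=1
    done
    return "$status"
}
```

Running caf_to_m4a_all with a directory name writes each .m4a next to its source file, leaves existing .m4a files alone, and returns non-zero if any conversion failed.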

And now you’ve got an M4A file with identical audio data to the original CAF file. You can verify this by converting both files to WAVE or another lossless format and comparing them:

$ afconvert -f WAVE -d LEI16 -o blah-caf.wav blah.caf
$ afconvert -f WAVE -d LEI16 -o blah-m4a.wav blah.m4a
$ diff -s blah-caf.wav blah-m4a.wav
Files blah-caf.wav and blah-m4a.wav are identical
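
The same check can be bundled into one function that cleans up after itself. This is only a sketch: same_decoded_audio is a name I made up, and it assumes both inputs decode to 16-bit little-endian PCM the way the commands above do.

```shell
# Decode two audio files to 16-bit little-endian WAVE and compare the results.
# Hypothetical helper; returns 0 only when the decoded files are byte-identical.
same_decoded_audio() {
    tmpdir=$(mktemp -d) || return 2
    afconvert -f WAVE -d LEI16 "$1" "$tmpdir/a.wav" &&
        afconvert -f WAVE -d LEI16 "$2" "$tmpdir/b.wav" &&
        cmp -s "$tmpdir/a.wav" "$tmpdir/b.wav"
    result=$?
    # Remove the temporary WAVE files whether or not the comparison succeeded.
    rm -rf "$tmpdir"
    return "$result"
}

# Usage:
#   if same_decoded_audio blah.caf blah.m4a; then echo "identical"; fi
```

cmp -s stays silent and only sets the exit status, which makes it easier to script than reading diff's message.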

Piece of cake.

The only potentially tricky part is figuring out what kind of audio data is stored in the CAF file, but you can find this out by opening up the file in QuickTime Player and checking the inspector (⌘I):

screenshot of the inspector for blah.caf, showing Format: AAC, 2 channels, 44100 Hz

… but interpreting the “Format” message could be an issue:

screenshot of the inspector for blah-wav.caf, showing Format: Linear PCM, 16 bit little-endian signed integer, 2 channels, 44100 Hz

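(Linear PCM as 16-bit little-endian signed integers is what a WAVE file normally holds, so the matching file format here is -f WAVE.)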
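
To take some of the guesswork out of it, the mapping from data format to afconvert file format can be written down as a small lookup. This is a sketch covering only a few common cases, not a complete table: container_for is a hypothetical name, it takes the Core Audio four-character code for the data format (aac, alac, lpcm), and the lpcm entry assumes little-endian samples.

```shell
# Map a Core Audio data-format code to "<afconvert file format> <extension>".
# Hypothetical helper covering a few common codes only.
container_for() {
    case "$1" in
        aac|'aac ') echo "m4af m4a" ;;  # AAC -> MPEG-4 Audio
        alac)       echo "m4af m4a" ;;  # Apple Lossless -> MPEG-4 Audio
        lpcm)       echo "WAVE wav" ;;  # Linear PCM -> WAVE (assumes little-endian)
        *)          return 1 ;;         # unknown: work it out by hand
    esac
}

# Usage:
#   set -- $(container_for aac)
#   afconvert -f "$1" -d 0 blah.caf "blah.$2"
```

If you would rather stay in the terminal than open QuickTime Player, afinfo (which ships with OS X alongside afconvert) prints a "Data format" line for a file. Its exact layout seems to differ between OS X versions, so I would read the format off by eye rather than try to parse it.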
