Reading Cart Chunk with PowerShell

Look at the audio files in any professional radio playout system. They’ll likely be linear WAVE files with the associated metadata stored in the cart chunk format. That metadata is usually enough to successfully share audio between stations and systems (there are some issues around time markers but they can be worked round).
With that in mind, a recent project required me to read the cart chunk data from an existing playout system using PowerShell. It might not seem an obvious option for this but as the ubiquitous scripting language on the Windows platform, it should be possible and require a lot less work than writing and maintaining a full .NET application.
Before we crack into the code, it’s worth taking a look at how a WAVE file is broken down. At the top level it’s a single “chunk” called the RIFF chunk. The header of this chunk is made up of two 4 byte values – the tag (“RIFF” in this case) and the length of the chunk content. As this is the top level chunk, the length is the length of the rest of the file. The first four bytes of that content are the form type (“WAVE”), and the smaller chunks follow it.
Within this top level chunk, you’ll see a number of smaller chunks. Some are required (e.g. data and fmt), others not so much (e.g. cart and bext). These chunks all use the same header format as the top level chunk. That means it should be simple enough to skip through the file looking for the chunk you want rather than reading the whole file into memory.
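To make the header layout concrete, here is a minimal sketch that decodes one 8 byte chunk header from a byte array. The function name Read-ChunkHeader is made up for this illustration and isn't part of the script further down. It also assumes a little-endian machine, which is what you'll have on typical Windows hardware.

```powershell
# Hypothetical helper (illustration only): decode one RIFF chunk header.
function Read-ChunkHeader {
    param (
        [Parameter(Mandatory=$true)]
        [byte[]] $Bytes,
        [int] $Offset = 0
    )
    # First four bytes: the chunk tag, stored as ASCII text
    $tag = [System.Text.Encoding]::ASCII.GetString($Bytes, $Offset, 4)
    # Next four bytes: the length of the chunk content as an unsigned 32-bit integer.
    # BitConverter uses the machine's byte order; RIFF itself is little-endian.
    $length = [System.BitConverter]::ToUInt32($Bytes, $Offset + 4)
    [PSCustomObject]@{ Tag = $tag; Length = $length }
}

# A hand-built "fmt " header announcing 16 bytes of content
$example = [byte[]](0x66, 0x6D, 0x74, 0x20, 0x10, 0x00, 0x00, 0x00)
$header = Read-ChunkHeader -Bytes $example
```

Note that the tag is always four characters, so the format chunk's tag is "fmt" followed by a space.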
If you want a bit more information about the technical details of how the chunks are formatted, check out this site. The fmt and data chunks are of most interest if you’re planning to read or write the audio data from the files.
Anyhow, let’s take a look at the code:


<#
.SYNOPSIS
Reads the cart-chunk data from a WAVE file.
.DESCRIPTION
Checks the file is a valid WAVE file, then looks for the cart chunk. Upon discovery, some of the contents are returned to the user. On failure, $null is returned or an error thrown instead.
.PARAMETER FileName
The path to the WAVE file you wish to extract the contents from.
#>
function Get-CartChunk {
    [CmdletBinding()]
    param (
        [Parameter(Mandatory=$true)]
        [string] $FileName
    )
    PROCESS {
        # Settings
        $HEADER_LENGTH = 8
        $HEADER_FIELD_LENGTH = 4
        $encoder = [System.Text.Encoding]::UTF7
        # Check the file name
        if (!(Test-Path $FileName)) {
            throw "You must supply a valid filename. $($FileName) is not valid."
        }
        # Read in as a binary
        Write-Host "Reading in $($FileName)..."
        $headerBuffer = New-Object byte[] $HEADER_LENGTH
        $stream = [System.IO.File]::OpenRead($FileName)
        if ($stream.Read($headerBuffer, 0, $HEADER_LENGTH) -ne $HEADER_LENGTH) {
            $stream.close()
            throw("File is not long enough to be a WAVE file.")
        }
        # Ensure it's wave
        $chunkType = $encoder.getString($headerBuffer[0..($HEADER_FIELD_LENGTH - 1)])
        if ($chunkType -eq "RIFF") {
            Write-Host("$($FileName) is a well formatted WAVE file.")
        } else {
            $stream.close()
            throw("The file is not a WAVE file.")
        }
        # Look for the cart chunk
        $currentOffset = $HEADER_LENGTH + $HEADER_FIELD_LENGTH
        $endOfFile = $false
        Do {
            # Read in the next chunk
            $seek = $stream.Seek($currentOffset, [System.IO.SeekOrigin]::Begin)
            if ($stream.Read($headerBuffer, 0, $HEADER_LENGTH) -ne $HEADER_LENGTH) {
                $stream.close()
                throw("Ran into a problem reading the next chunk.")
            }
            # Calculate the jump early - we need it in two places
            $jumpBytes = $headerBuffer[$HEADER_FIELD_LENGTH..($HEADER_LENGTH - 1)]
            $jump = [bitconverter]::ToInt32($jumpBytes, 0)
            # Check the chunk type
            $chunkType = $encoder.getString($headerBuffer[0..($HEADER_FIELD_LENGTH - 1)])
            Write-Host "Found the $($chunkType) chunk."
            if ($chunkType -eq "cart") {
                # Read in the raw cart chunk
                $cartChunkBuffer = New-Object byte[] $jump
                $currentOffset = $currentOffset + $HEADER_LENGTH
                $seek = $stream.Seek($currentOffset, [System.IO.SeekOrigin]::Begin)
                if ($stream.Read($cartChunkBuffer, 0, $jump) -ne $jump) {
                    $stream.close()
                    throw "Failed to successfully read the cart chunk."
                }
                # Now read in some properties
                $cartChunk = New-Object System.Object
                # The fields are NUL padded and Trim() alone only strips whitespace
                $cartChunk | Add-Member -Type NoteProperty -Name Title -Value ($encoder.getString($cartChunkBuffer[4..67]).TrimEnd([char]0).Trim())
                $cartChunk | Add-Member -Type NoteProperty -Name Artist -Value ($encoder.getString($cartChunkBuffer[68..131]).TrimEnd([char]0).Trim())
                # Cleanup
                $stream.close()
                return $cartChunk
            }
            # Move onto the next section (odd-length chunks are followed by a pad byte)
            $currentOffset = $currentOffset + $HEADER_LENGTH + $jump + ($jump % 2)
            $endOfFile = ($currentOffset -ge $stream.Length)
        } Until ($endOfFile)
        # Cleanup
        Write-Host "No cart chunk found."
        $stream.close()
        return $null
    }
}

That’s the entire thing ready to go. Admittedly it only reads the title and artist fields but it wouldn’t take much to extend it to read any of the other fields you need.
Either way, let’s take a closer look. One of the first lines to jump out would be:
$encoder = [System.Text.Encoding]::UTF7
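The WAVE format (and cart chunk) is old enough that the fields are specified as ASCII. As that has no support for accented characters, you’ll often see UTF7 encoding used instead. This is one of those real world vs. specification things. Two caveats: UTF-7 treats the plus sign as a shift character, so a title containing one may not decode as written, and Encoding.UTF7 is marked obsolete from .NET 5 onwards.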
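If UTF7 gives you trouble, one alternative worth trying is ISO-8859-1 (Latin-1), which maps every byte straight to one character. The sketch below is an assumption about what your files contain rather than anything the cart chunk spec asks for, and the helper name ConvertFrom-FixedField is invented for the example. It also cuts the string at the first NUL byte, since the fixed-width fields are padded with zeros.

```powershell
# Hypothetical helper: decode a fixed-width, NUL-padded text field.
function ConvertFrom-FixedField {
    param (
        [Parameter(Mandatory=$true)]
        [byte[]] $Bytes,
        # Assumption: Latin-1. Swap in whichever encoding your playout system writes.
        [System.Text.Encoding] $Encoding = [System.Text.Encoding]::GetEncoding('iso-8859-1')
    )
    $text = $Encoding.GetString($Bytes)
    # Keep only the text before the first NUL, then tidy up stray spaces
    $nul = $text.IndexOf([char]0)
    if ($nul -ge 0) {
        $text = $text.Substring(0, $nul)
    }
    return $text.Trim()
}

# "Café" in Latin-1, padded out to 8 bytes with zeros
$field = [byte[]](0x43, 0x61, 0x66, 0xE9, 0, 0, 0, 0)
$decoded = ConvertFrom-FixedField -Bytes $field
```

Passing a different encoding via -Encoding lets you compare the results against real files before settling on one.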
A little further down you’ll see we read the file in as a binary and look for the RIFF tag we talked about earlier.

# Read in as a binary
Write-Host "Reading in $($FileName)..."
$headerBuffer = New-Object byte[] $HEADER_LENGTH
$stream = [System.IO.File]::OpenRead($FileName)
if ($stream.Read($headerBuffer, 0, $HEADER_LENGTH) -ne $HEADER_LENGTH) {
    $stream.close()
    throw("File is not long enough to be a WAVE file.")
}
# Ensure it's wave
$chunkType = $encoder.getString($headerBuffer[0..($HEADER_FIELD_LENGTH - 1)])
if ($chunkType -eq "RIFF") {
    Write-Host("$($FileName) is a well formatted WAVE file.")
} else {
    $stream.close()
    throw("The file is not a WAVE file.")
}

Assuming we’re all good, we shift past the 8 byte RIFF header and the 4 byte “WAVE” form type that follows it, then enter the main loop. This loop is constructed so that we check every chunk in the file until we see the one we want. The location of the cart chunk in a WAVE file is not explicitly defined. You’ll find that playout systems vary between placing it ahead of and after the audio data.
This is one of the reasons skipping through the file rather than reading it in wholesale is a nicer approach. While we’re on the topic of skipping through the file, we calculate the length of our next skip using the following code:

# Calculate the jump early - we need it in two places
$jumpBytes = $headerBuffer[$HEADER_FIELD_LENGTH..($HEADER_LENGTH - 1)]
$jump = [bitconverter]::ToInt32($jumpBytes, 0)

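This takes the last four bytes of the header and converts them to a 32-bit integer. RIFF stores the length in little-endian byte order, which is also what BitConverter uses on typical x86 and x64 Windows machines.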
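One wrinkle when turning that length into the next offset: RIFF pads any chunk with an odd content length by a single byte, so the next header always starts on an even offset. A rough sketch of the full calculation follows. Get-NextChunkOffset is a made-up name, and the example values are taken from the iXML line of the BWF MetaEdit trace quoted in the comments below.

```powershell
# Hypothetical helper: work out where the next chunk header starts.
function Get-NextChunkOffset {
    param (
        [Parameter(Mandatory=$true)] [long] $CurrentOffset,
        [Parameter(Mandatory=$true)] [byte[]] $LengthBytes
    )
    $headerLength = 8
    # Chunk lengths are unsigned 32-bit values; widen to long to keep the maths simple
    [long] $length = [System.BitConverter]::ToUInt32($LengthBytes, 0)
    # Odd-length chunks are followed by a single pad byte
    $padding = $length % 2
    return $CurrentOffset + $headerLength + $length + $padding
}

# iXML chunk from the trace: header at 0x0013CF58, content length 0x00000739
$next = Get-NextChunkOffset -CurrentOffset 0x0013CF58 -LengthBytes ([byte[]](0x39, 0x07, 0x00, 0x00))
'{0:X8}' -f $next   # 0013D69A, the offset of the _PMX chunk in the trace
```

Without the pad byte the result would be 0013D699, one byte short of the next chunk in that trace.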
Once we find the cart chunk, we can look at extracting the contents. For this we need a bigger buffer.

if ($chunkType -eq "cart") {
    # Read in the raw cart chunk
    $cartChunkBuffer = New-Object byte[] $jump
    $currentOffset = $currentOffset + $HEADER_LENGTH
    $seek = $stream.Seek($currentOffset, [System.IO.SeekOrigin]::Begin)
    if ($stream.Read($cartChunkBuffer, 0, $jump) -ne $jump) {
        $stream.close()
        throw "Failed to successfully read the cart chunk."
    }
    # Now read in some properties
    # The fields are NUL padded and Trim() alone only strips whitespace
    $cartChunk = New-Object System.Object
    $cartChunk | Add-Member -Type NoteProperty -Name Title -Value ($encoder.getString($cartChunkBuffer[4..67]).TrimEnd([char]0).Trim())
    $cartChunk | Add-Member -Type NoteProperty -Name Artist -Value ($encoder.getString($cartChunkBuffer[68..131]).TrimEnd([char]0).Trim())
    # Cleanup
    $stream.close()
    return $cartChunk
}

From this bigger buffer we can now read in the cart chunk contents. In this example, we’re only extracting the artist and title which we then present back to the user as an object. It’s here that you’ll want to add any code for processing further fields.
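As a starting point for that, here's a sketch that pulls the first run of fixed-width text fields out of the buffer in one go. The offsets assume the usual version 1 cart chunk layout (a 4 byte version followed by fixed-width text fields), so check them against the spec before relying on them. ConvertFrom-CartBuffer and the property names are invented for the example, and ASCII is used only to keep it self-contained.

```powershell
# Hypothetical helper: read several cart chunk text fields from the raw chunk content.
function ConvertFrom-CartBuffer {
    param (
        [Parameter(Mandatory=$true)]
        [byte[]] $Buffer
    )
    # Assumed layout: field name, start offset and length in bytes
    $fields = @(
        @{ Name = 'Version';        Offset = 0;   Length = 4  },
        @{ Name = 'Title';          Offset = 4;   Length = 64 },
        @{ Name = 'Artist';         Offset = 68;  Length = 64 },
        @{ Name = 'CutId';          Offset = 132; Length = 64 },
        @{ Name = 'ClientId';       Offset = 196; Length = 64 },
        @{ Name = 'Category';       Offset = 260; Length = 64 },
        @{ Name = 'Classification'; Offset = 324; Length = 64 },
        @{ Name = 'OutCue';         Offset = 388; Length = 64 },
        @{ Name = 'StartDate';      Offset = 452; Length = 10 },
        @{ Name = 'StartTime';      Offset = 462; Length = 8  },
        @{ Name = 'EndDate';        Offset = 470; Length = 10 },
        @{ Name = 'EndTime';        Offset = 480; Length = 8  }
    )
    $encoder = [System.Text.Encoding]::ASCII
    $result = [ordered]@{}
    foreach ($field in $fields) {
        # Skip fields that fall outside a short buffer
        if ($field.Offset + $field.Length -gt $Buffer.Length) { continue }
        $text = $encoder.GetString($Buffer, $field.Offset, $field.Length)
        # Fields are NUL padded, so strip the padding as well as whitespace
        $result[$field.Name] = $text.TrimEnd([char]0).Trim()
    }
    return [PSCustomObject]$result
}

# Build a fake 2048 byte cart chunk with just the version, title and artist filled in
$buffer = New-Object byte[] 2048
$ascii = [System.Text.Encoding]::ASCII
$ascii.GetBytes('0101').CopyTo($buffer, 0)
$ascii.GetBytes('Morning Jingle').CopyTo($buffer, 4)
$ascii.GetBytes('Station Imaging').CopyTo($buffer, 68)
$cart = ConvertFrom-CartBuffer -Buffer $buffer
```

Keeping the layout in a table means adding another field is a one line change rather than another Add-Member call.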
And that’s all you need to read cart chunk in PowerShell. Turns out it’s much simpler than I thought it would be.

7 Responses

  1. Nathan Wood says:

    I think you’re the only person on the entire planet (besides myself) that needs ability to read and write cart chunk info via command line. Wish this was written in Bash. One of my main problems right now is reading and writing cart chunk information using BASH shell scripting using FreeBSD or MacOS / Linux. Would love some help with this. Very challenging.

    • marc says:

      I’ve only just spotted this comment. Apologies!
      The day job has got me writing a lot of PowerShell, Python and the likes in the SDN world but I do, weirdly, have some code written for reading/writing cart chunk in TypeScript/JavaScript. It would be a real challenge in bash but not impossible. That one’s stirred up vague memories of an audio processor or encoder that used the stdin/out pipeline.
      So long as the files are good, it’s all reading at fixed offsets. This website is a good starting point on the WAVE file format. Cart chunk is an AES standard and needs payment to get the docs but can be found with a bit of sleuthing.

  2. Marcos Sueiro says:

    Loved this post. I have been trying to modify it to read the MD5 chunk but I get weird encoding such as

    Found the fmt chunk.
    Found the data chunk.
    Found the MD5 chunk.
    H?<?'??

    Any ideas?

    MD5 chunk: https://mediaarea.net/BWFMetaEdit/md5

    • Marc Steele says:

      Encoding in WAVE files… it’s usually UTF-8 or ASCII from the files I’ve seen. The MD5 chunk is a new one on me… best guess is you’re getting a raw binary value and need to display it as hexadecimal.

      As it’s been a while since I touched PowerShell, I’ll defer to Stack Overflow for encoding advice.

      Beyond that, there’s an Open Source Python library I released more recently. The pack/unpack stuff might give you some inspiration about handling the encoding.

      • Marcos Sueiro says:

        Thank you Marc! I will take a look at the info provided. The MD5 value is alphanumeric (e.g. 9DD63E60B8513C27803C98486D992C9D); unconverted, I get what look like 4 bytes (e.g. {72, 152, 60, 128…}). I will give it another shot and report! Below is a typical trace as exported by BWF MetaEdit.

        00000000 0013FBAE WAVE
        0000000C 00000010 fmt
        00000024 0013CC96 data
        0013CCC2 00000010 MD5
        0013CCDA 00000276 bext
        0013CF58 00000739 iXML
        0013D69A 00002513 _PMX

        Thanks again!

        • Marc Steele says:

          If I’m understanding the trace correctly, the MD5 tag is offset 0x0013CCC2 and is 16 bytes long. That matches the 128 bit binary representation of an MD5 checksum. Looks like it just needs reading in as a raw number/blob and converting to a hexadecimal string if you want to print it out.

          • Marcos Sueiro says:

            Changing the code to that below generates a human-readable value. Now I have to figure out why it is giving me a different value than BWF MetaEdit, but that’s another story…. Thanks again Marc!

            # Now read in some properties
            $md5Chunk = New-Object System.Object
            $md5Binary = $md5ChunkBuffer[4..127]
            $md5Hex = ($md5Binary | ForEach-Object { '{0:X2}' -f $_ }) -join ''

            $md5Chunk | Add-Member -Type NoteProperty -Name MD5 -Value $md5Hex