Sonos Music API - Getting Started Guide

Encrypting content for Digital Rights Management (DRM)

Sonos supports streaming media encryption that you can add to secure specific content for your digital rights management (DRM) needs. You can choose from three levels of encryption, where each level builds on the previous one for increased security.

Overview

This section provides an overview of some encryption concepts and of the following levels of content encryption your service can provide:

  1. HTTPS only.
  2. Basic encryption: Add your encrypted content.
  3. Strong encryption: Add Sonos player verification and encrypt your content keys.
     

If your service uses HTTPS to serve content, all data transferred between Sonos and your service is automatically encrypted in transit with SSL/TLS, such as when Sonos players request content with getMediaURI. This may be fully adequate for your encryption needs.

Basic encryption

For basic encryption, you can encrypt your content and then have your service provide Sonos players access to the encrypted content and the content key to decrypt it. Basic encryption uses secret key encryption.

Secret key encryption uses the same key to both encrypt and decrypt data.

To enable this level of encryption, you need to enable device certificates, as described later in the Encryption items section. A Sonos player requests an encrypted track with getMediaURI. For encrypted HTTP live streaming (HLS) content, a player uses getContentKey to request a content key. Your service returns the content key, which the Sonos player uses to decrypt the content it gets from your content server. For more about HLS, see HTTP Live Streaming (HLS).

Important: ensure your content is encrypted using a type of secret key encryption routine that Sonos supports, such as AES-ECB or AES-CBC. See the contentKey description of getMediaURI or getContentKey for details.
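
As a minimal sketch of secret key encryption, the following Java class encrypts and decrypts a buffer with AES-CBC, using one key and one IV in both directions. The class name, the helper names, and the PKCS5 padding are assumptions made for illustration; check the contentKey description for the routines and padding Sonos actually supports.

```java
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.IvParameterSpec;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;

// Hypothetical helper, not part of the Sonos API.
class SecretKeySketch {

    // Generates a random 128-bit AES content key.
    static SecretKey newContentKey() {
        try {
            KeyGenerator generator = KeyGenerator.getInstance("AES");
            generator.init(128);
            return generator.generateKey();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Generates a random 16-byte initialization vector for CBC mode.
    static byte[] newIv() {
        byte[] iv = new byte[16];
        new SecureRandom().nextBytes(iv);
        return iv;
    }

    // Runs AES-CBC in the given mode (Cipher.ENCRYPT_MODE or Cipher.DECRYPT_MODE).
    // The padding scheme is an assumption; the same key and IV serve both directions.
    static byte[] aesCbc(int mode, SecretKey key, byte[] iv, byte[] data) {
        try {
            Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
            cipher.init(mode, key, new IvParameterSpec(iv));
            return cipher.doFinal(data);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Calling aesCbc with Cipher.ENCRYPT_MODE and then with Cipher.DECRYPT_MODE, passing the same key and IV each time, returns the original bytes.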

Strong encryption

For strong encryption, you encrypt the content keys sent to players and verify that the Sonos players are legitimate, in addition to basic encryption of your content. We use public key encryption to provide verification of the Sonos players.

Public key encryption uses a public key to encrypt and a private key known only to the recipient to decrypt.

Content encryption uses the unique device certificate on each player for public key encryption. Enable the device certificate capability to turn this on. The device certificate contains the player's public device key, and the player stores its private device key.

Public key encryption provides a trust relationship between your service and the player. However, public key encryption is computationally expensive, so a two-phase encryption improves performance. First, you create a device session key and use it to encrypt as many content keys as you wish using the less expensive secret key encryption. Then, you encrypt the device session key with the player's public device key using the more expensive public key encryption. The device session key is encrypted only once for a session. When the player requests content, your service returns both of these encrypted keys.

The player uses its private device key to decrypt the device session key. Then it uses the decrypted device session key to decrypt the content key. Finally, the player uses the decrypted content key to decrypt the content and play it.

Your service may continue to use the same device session key to encrypt as many content keys as you wish. You can also change the device session key at any time. If you do so, use the new device session key to encrypt content keys. Send the new encrypted device session and content keys in your response.
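
The two-phase scheme described above can be sketched in Java as follows. This is an illustration under stated assumptions, not Sonos code: the class and method names are hypothetical, a locally generated RSA key pair stands in for a player's device keys, and the content key is assumed to be 16 bytes so that it fits AES-ECB without padding.

```java
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.SecretKeySpec;
import java.security.GeneralSecurityException;
import java.security.Key;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;

// Hypothetical helper, not part of the Sonos API.
class TwoPhaseSketch {
    static final String RSA_OAEP = "RSA/ECB/OAEPWithSHA-1AndMGF1Padding";
    static final String AES_ECB = "AES/ECB/NoPadding";

    // Stand-in for a player's device keys; a real private key never leaves the player.
    static KeyPair newPlayerKeyPair() {
        try {
            KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
            generator.initialize(2048);
            return generator.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Creates a random 128-bit AES device session key.
    static SecretKey newSessionKey() {
        try {
            KeyGenerator generator = KeyGenerator.getInstance("AES");
            generator.init(128);
            return generator.generateKey();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Service, phase 1: inexpensive secret key encryption of a content key.
    static byte[] encryptContentKey(SecretKey sessionKey, byte[] contentKey) {
        return crypt(AES_ECB, Cipher.ENCRYPT_MODE, sessionKey, contentKey);
    }

    // Service, phase 2: expensive public key encryption, needed once per session key.
    static byte[] encryptSessionKey(PublicKey publicDeviceKey, SecretKey sessionKey) {
        return crypt(RSA_OAEP, Cipher.ENCRYPT_MODE, publicDeviceKey, sessionKey.getEncoded());
    }

    // Player side: recover the session key first, then use it to recover the content key.
    static byte[] decryptContentKey(PrivateKey privateDeviceKey,
                                    byte[] encryptedSessionKey,
                                    byte[] encryptedContentKey) {
        byte[] rawSessionKey =
                crypt(RSA_OAEP, Cipher.DECRYPT_MODE, privateDeviceKey, encryptedSessionKey);
        SecretKey sessionKey = new SecretKeySpec(rawSessionKey, "AES");
        return crypt(AES_ECB, Cipher.DECRYPT_MODE, sessionKey, encryptedContentKey);
    }

    private static byte[] crypt(String transformation, int mode, Key key, byte[] input) {
        try {
            Cipher cipher = Cipher.getInstance(transformation);
            cipher.init(mode, key);
            return cipher.doFinal(input);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

With a 2048-bit RSA key, the encrypted device session key is 256 bytes, while each encrypted content key stays at 16 bytes, which is why reusing one session key across many content keys keeps the per-request cost low.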

When you implement strong encryption, the following decryption occurs on the Sonos player when getMediaURI or getContentKey returns results:

  • Returned by your service: Encrypted device session key
    Processing by the Sonos player: Decrypt the device session key using the player's private device key.

  • Returned by your service: Encrypted content key
    Processing by the Sonos player: Use the decrypted device session key to decrypt the content key.

  • Returned by your service: URI to the encrypted content returned in getMediaURI
    Processing by the Sonos player: Get the content from the content server, decrypt the content using the decrypted content key, and play the decrypted content.

Encryption items

To get started, you first need to enable Sonos device certificates by selecting "Requires Device Certificate" when submitting your service to Sonos. Depending on how you want to implement sessions between your service and Sonos players, you may also want to select "Include Zone Player IDs in credentials header". For descriptions of these and other capabilities that you can enable, and instructions on how to test them during development, see Add your service with customSD. The submission process is described in Integrating a Music Service with Sonos.

The Device Certificate

A certificate is an electronic document used to prove ownership of a public key and contains information about the key and owner of the key. 

Your service uses the device certificate to verify and validate the Sonos players. The device certificate holds the player's public device key, which is associated with a private device key only known to the player. 

When you enable the “Requires Device Certificate” capability, each Sonos player sends its device certificate in the deviceCert sub-element of the credentials SOAP header for every getMediaURI and getContentKey request. The device certificate is a standard X.509 certificate, encoded with the Abstract Syntax Notation One distinguished encoding rules (ASN.1 DER) and then base64-encoded.
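
As a rough sketch of reading the certificate, the following Java class decodes base64 DER text into an X509Certificate with the standard java.security.cert API and extracts the public device key. The class and method names are hypothetical, and the sketch assumes the deviceCert text is plain base64 of the DER bytes. It checks only the validity dates; verifying the certificate's chain of trust is left out.

```java
import java.io.ByteArrayInputStream;
import java.security.PublicKey;
import java.security.cert.CertificateException;
import java.security.cert.CertificateFactory;
import java.security.cert.X509Certificate;
import java.util.Base64;

// Hypothetical helper, not part of the Sonos API.
class DeviceCertSketch {

    // Decodes the base64 text of a deviceCert element into an X.509 certificate.
    // The MIME decoder is used so that line breaks inside the text are tolerated.
    static X509Certificate parse(String base64Der) throws CertificateException {
        byte[] der = Base64.getMimeDecoder().decode(base64Der);
        CertificateFactory factory = CertificateFactory.getInstance("X.509");
        return (X509Certificate) factory.generateCertificate(new ByteArrayInputStream(der));
    }

    // Returns the player's public device key after checking the validity period.
    // Chain-of-trust verification is intentionally omitted from this sketch.
    static PublicKey publicDeviceKey(String base64Der) throws CertificateException {
        X509Certificate cert = parse(base64Der);
        cert.checkValidity(); // throws if the certificate is expired or not yet valid
        return cert.getPublicKey();
    }
}
```

Input that is not a well-formed certificate causes parse to throw a CertificateException, which a service would typically treat as a failed player verification.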

See the device session token and the device session key for more about how to use the device certificate. See Requests and Responses for more details about elements in the SOAP header.

The device session token

A device session token is similar to a website cookie, representing a session between your service and a specific Sonos player device.

The Sonos player sends a device session token in the deviceSessionToken parameter for getMediaURI and getContentKey requests. However, the first time a player makes a request, the deviceSessionToken parameter is empty. Your service is in complete control of managing the device session token. Your service validates the device certificate of the requesting player, creates the device session token to represent your session with the player, and returns it in the deviceSessionToken element.

The device session token is opaque to the player, meaning it is not interpreted by the player but only used as context between the specific player and your service. The player does a simple string comparison and if the token has changed, the player replaces its cached value to use in subsequent getMediaURI or getContentKey requests. If your service omits the deviceSessionToken in a response (such as when the content is not encrypted), the player will still continue to send any previously cached token, so omitting the deviceSessionToken in your response does not clear the player's cache.

Use the device session token so that you can easily validate the player without re-validating the device certificate, which may be computationally expensive to do for every getMediaURI or getContentKey request from that player. Since this value is opaque to the player, it is up to you when to generate and replace a deviceSessionToken.
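
One possible way to manage tokens is sketched below in Java. It is only an illustration: the class name, the in-memory map, the 16-byte random token format, and the pluggable certificate validator are all assumptions, since the token format and lifetime are entirely up to your service.

```java
import java.security.SecureRandom;
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Predicate;

// Hypothetical token store, not part of the Sonos API.
class DeviceSessionTokens {
    private final SecureRandom random = new SecureRandom();
    private final Map<String, String> deviceCertByToken = new ConcurrentHashMap<>();
    private final Predicate<String> certValidator;

    // certValidator performs the (possibly expensive) device certificate validation.
    DeviceSessionTokens(Predicate<String> certValidator) {
        this.certValidator = certValidator;
    }

    // Returns the token to place in the deviceSessionToken element of the response.
    String tokenFor(String requestToken, String deviceCert) {
        Objects.requireNonNull(deviceCert, "deviceCert");
        // A token previously issued for this same certificate skips validation.
        if (requestToken != null && deviceCert.equals(deviceCertByToken.get(requestToken))) {
            return requestToken;
        }
        // Empty or unknown token: validate the certificate and start a new session.
        if (!certValidator.test(deviceCert)) {
            throw new SecurityException("device certificate rejected");
        }
        String token = newToken();
        deviceCertByToken.put(token, deviceCert);
        return token;
    }

    // Creates an unguessable token: 16 random bytes as 32 hexadecimal characters.
    private String newToken() {
        byte[] bytes = new byte[16];
        random.nextBytes(bytes);
        StringBuilder hex = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            hex.append(String.format("%02X", b));
        }
        return hex.toString();
    }
}
```

In this sketch the first request from a player (with an empty token) triggers validation, and later requests carrying the issued token return it unchanged without validating again.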

The device session key

A device session key is used to encrypt and decrypt content keys.

When your service implements strong encryption for getMediaURI and getContentKey requests, you use the device session key in a two-phase encryption:

  1. Create a randomized device session key and use it to encrypt content keys using secret key encryption. 
  2. Encrypt the device session key with the player's public device key (from the device certificate) using public key encryption.

 

Your service returns the encrypted content key and the encrypted device session key to the Sonos player. The player uses its private device key to decrypt the device session key and then uses the result to decrypt the content key.

It is up to you when and how often to create a new device session key for encrypting content keys. Note that the device session key is independent of the device session token and may be changed separately.

The device session key provides encryption optimization for your server. Rather than use computationally expensive public key encryption on the content keys themselves for every request, you only need to use public key encryption once on the device session key for a session of multiple requests from a player. 

Your service returns the encrypted device session key in the deviceSessionKey element as a hexadecimal-formatted string encrypted using the public device key and RSA encryption with Optimal Asymmetric Encryption Padding (OAEP). Set the type attribute of the deviceSessionKey element to the same symmetric encryption routine you use to encrypt the content keys. Ensure the encryption routine is one Sonos supports. See the deviceSessionKey description of getMediaURI or getContentKey for details.
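
To illustrate the optimization, the following Java sketch caches one device session key per device session token, together with its RSA-OAEP-encrypted, hexadecimal form, so the public key operation runs once per session rather than once per request. The class, the in-memory cache, the uppercase hex output, and the locally generated stand-in key pair are assumptions for illustration only.

```java
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PublicKey;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical cache, not part of the Sonos API.
class DeviceSessionKeys {

    // A device session key plus its encrypted form for the deviceSessionKey element.
    static final class Entry {
        final SecretKey key;
        final String encryptedHex;

        Entry(SecretKey key, String encryptedHex) {
            this.key = key;
            this.encryptedHex = encryptedHex;
        }
    }

    private final Map<String, Entry> byToken = new ConcurrentHashMap<>();

    // Returns the cached entry for this session, creating it on first use only.
    Entry forSession(String deviceSessionToken, PublicKey publicDeviceKey) {
        return byToken.computeIfAbsent(deviceSessionToken, t -> create(publicDeviceKey));
    }

    // Drops the cached key so that the next request gets a new device session key.
    void rotate(String deviceSessionToken) {
        byToken.remove(deviceSessionToken);
    }

    private static Entry create(PublicKey publicDeviceKey) {
        try {
            KeyGenerator generator = KeyGenerator.getInstance("AES");
            generator.init(128);
            SecretKey key = generator.generateKey();
            Cipher rsa = Cipher.getInstance("RSA/ECB/OAEPWithSHA-1AndMGF1Padding");
            rsa.init(Cipher.ENCRYPT_MODE, publicDeviceKey);
            return new Entry(key, toHex(rsa.doFinal(key.getEncoded())));
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    static String toHex(byte[] bytes) {
        StringBuilder hex = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            hex.append(String.format("%02X", b));
        }
        return hex.toString();
    }

    // Stand-in for a player's device keys, for local experiments only.
    static KeyPair demoPlayerKeyPair() {
        try {
            KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
            generator.initialize(2048);
            return generator.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

With a 2048-bit public device key, the encryptedHex string is 512 hexadecimal characters long. Calling rotate models the case where your service decides to change the device session key independently of the token.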

Encrypting Tracks

For tracks, the Sonos player sends your service a getMediaURI request with additional security information. Your service responds by providing Sonos players with the following, depending on the level of encryption your service implements: 

Basic Encryption Response

  • URI to encrypted content
  • device session token
  • content key

Strong Encryption Response

  • URI to encrypted content
  • device session token
  • content key encrypted with the device session key using secret key encryption
  • device session key encrypted with public key encryption

The following workflow shows an overview of the encryption and decryption process for tracks. 

The workflow is described as follows:

  1. Encrypt your content using a type of secret key encryption that Sonos supports. See the contentKey description of getMediaURI or getContentKey for details.

  2. Each Sonos player creates a unique device certificate that holds its public device key. Each player also generates a private device key. When you enable the “Requires Device Certificate” capability, the players send the deviceCert in the header of every getMediaURI and getContentKey request. 

  3. Sonos players make getMediaURI requests that include the parameters id, for the track identifier, and deviceSessionToken, representing the player's session with your service. The deviceSessionToken value is empty the first time a player makes a getMediaURI request.

Service Encryption Implementation

See the Java Code Example for a sample implementation.

  1. Access the device certificate.
    Your implementation uses the device certificate to verify the player and to implement the device session token and the device session key. 

  2. Create the device session token.
    Create a device session token that represents a session between a Sonos player and your service. The player and your service exchange the same token until you decide your service should send a new one. 

  3. For strong encryption, create a device session key to encrypt the content key.
    Your service may continue to use the same device session key to encrypt content keys for as many of a player's requests as you wish. (For basic encryption, you do not use a device session key.) 

  4. Process the track's content key.
    For strong encryption, encrypt the track's content key using the device session key. Ensure the encryption routine is one Sonos supports. See the deviceSessionKey description of getMediaURI for details. (For basic encryption, the content key is unencrypted.) 

  5. Set the getMediaURIResponse elements. 

    • Set the getMediaURIResult element to the URI of the requested encrypted track.

    • Set the deviceSessionToken element to a string representation of your device session token.

    • Set the deviceSessionKey element for strong encryption. Encrypt your device session key using the public device key, and then set the deviceSessionKey element to a string representation of the encrypted device session key. Set the type attribute to the symmetric encryption routine you used to encrypt the content key.

    • Set the contentKey element value.
      For basic encryption, set the contentKey value to a string representation of the content key and the optional initialization vector (IV) your service used to encrypt the content. For strong encryption, set the contentKey value to a string representation of the encrypted content key and encrypted IV. 

    • Set the content key encryption type.
      Set the type attribute of the contentKey element to the type of secret key encryption that was used to encrypt the content. See the contentKey description of getMediaURI for details. 

The following XML code examples show service responses to getMediaURI requests for an encrypted track. If the particular requested track is not encrypted, you can omit the deviceSessionToken, contentKey, and deviceSessionKey elements.

A getMediaURIResponse for basic encryption:

<s:Envelope xmlns:s="http://schemas.xmlsoap.org/soap/envelope/" xmlns:ns="http://www.sonos.com/Services/1.1">
 <s:Body>
  <ns:getMediaURIResponse>
   <ns:getMediaURIResult>https://ec-media.sndcdn.com/g4xpZRZg1HHV.128.mp3?f10880d39085a94a0418a7ef69b03d522cd6dfee9399eeb9a522029469fbbe34fd477cbecebff0c93829a5d575930a25a69dde117a84c6304052bfc2eb6e640178f8416f76</ns:getMediaURIResult>
   <ns:deviceSessionToken>FEDCBA0123456789ABCDEF</ns:deviceSessionToken>
   <ns:contentKey type="AES-CBC">0123456789ABCDEF0123456789ABCDEF:FEDBCA9876543210FEDBCA9876543210</ns:contentKey>
  </ns:getMediaURIResponse>
 </s:Body>
</s:Envelope>

A getMediaURIResponse for strong encryption, where the content key is encrypted using the device session key:

<s:Envelope xmlns:s="http://schemas.xmlsoap.org/soap/envelope/" xmlns:ns="http://www.sonos.com/Services/1.1">
 <s:Body>
  <ns:getMediaURIResponse>
   <ns:getMediaURIResult>https://ec-media.sndcdn.com/g4xpZRZg1HHV.128.mp3?f10880d39085a94a0418a7ef69b03d522cd6dfee9399eeb9a522029469fbbe34fd477cbecebff0c93829a5d575930a25a69dde117a84c6304052bfc2eb6e640178f8416f76</ns:getMediaURIResult>
   <ns:deviceSessionToken>FEDCBA0123456789ABCDEF</ns:deviceSessionToken>
   <ns:deviceSessionKey type="AES-ECB">FEDBCA9876543210FEDBCA9876543210</ns:deviceSessionKey>
   <ns:contentKey type="AES-CBC">0123456789ABCDEF0123456789ABCDEF:FEDBCA9876543210FEDBCA9876543210</ns:contentKey>
  </ns:getMediaURIResponse>
 </s:Body>
</s:Envelope>

  6. Return the getMediaURIResponse.
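
The contentKey values in the XML examples above join a hexadecimal key and a hexadecimal IV with a colon. The following Java sketch builds such a value for both levels of encryption. The class and method names are hypothetical, the uppercase hex is chosen only to match the XML examples, and the use of AES-ECB without padding for the strong case assumes that the key and IV are each a multiple of 16 bytes long.

```java
import javax.crypto.Cipher;
import javax.crypto.spec.SecretKeySpec;
import java.security.GeneralSecurityException;

// Hypothetical formatter, not part of the Sonos API.
class ContentKeyValue {

    // Basic encryption: the hex content key, followed by ":" and the hex IV if one is used.
    static String basic(byte[] contentKey, byte[] iv) {
        return iv == null ? toHex(contentKey) : toHex(contentKey) + ":" + toHex(iv);
    }

    // Strong encryption: key and IV are each encrypted with the device session key first.
    static String strong(byte[] sessionKey, byte[] contentKey, byte[] iv) {
        byte[] encryptedKey = aesEcb(Cipher.ENCRYPT_MODE, sessionKey, contentKey);
        byte[] encryptedIv = iv == null ? null : aesEcb(Cipher.ENCRYPT_MODE, sessionKey, iv);
        return basic(encryptedKey, encryptedIv);
    }

    // AES-ECB without padding; the input length must be a multiple of 16 bytes.
    static byte[] aesEcb(int mode, byte[] sessionKey, byte[] data) {
        try {
            Cipher cipher = Cipher.getInstance("AES/ECB/NoPadding");
            cipher.init(mode, new SecretKeySpec(sessionKey, "AES"));
            return cipher.doFinal(data);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    static String toHex(byte[] bytes) {
        StringBuilder hex = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            hex.append(String.format("%02X", b));
        }
        return hex.toString();
    }

    static byte[] fromHex(String hex) {
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }
}
```

For the key and IV shown in the basic example, basic returns the same string that appears in its contentKey element. The strong variant has the same shape but carries ciphertext on both sides of the colon.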

Sonos Player Decryption

  1. If the deviceSessionKey was returned, the Sonos player decrypts the deviceSessionKey using its private device key. Then the player uses the decrypted deviceSessionKey to decrypt the contentKey.

  2. The Sonos player makes an HTTP GET request for the content from your service's content server, which returns the encrypted content to the Sonos player.

  3. The Sonos player uses the decrypted contentKey value and type to decrypt the content.

  4. Finally, the player plays the decrypted content.

 

Encrypting HTTP live streaming (HLS) media

Encrypting and decrypting HLS content is similar to the process described for tracks, except that Sonos uses getContentKey instead of getMediaURI to get content keys to decrypt the HLS segments. 

When HLS content is encrypted, an HLS media playlist (a file of type .m3u8) is used to store both the URIs to the encrypted segments as well as URIs to the content keys (tagged as EXT-X-KEY in the media playlist). A content key can be used to encrypt more than one segment. Ensure the secret key encryption you use is one Sonos supports. See the contentKey description in getContentKey for details. 

HTTP Live Streaming (HLS) describes the workflow when a user plays HLS content. When the Sonos player retrieves an HLS media playlist from your content server, it processes the media playlist URIs as follows:

  • For each content key URI from the playlist, the player makes a getContentKey request. Details of how a service responds are shown below. 
  • For each HLS segment URI from the playlist, the player makes an HTTP GET request for the segment to your service's content server, which returns the encrypted segment to the Sonos player. The Sonos player uses the cached decrypted contentKey to decrypt the segment and then plays it.
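
To make the two kinds of URI concrete, the following Java sketch separates the content key URIs (from EXT-X-KEY tags) and the segment URIs in a media playlist. It is a simplified illustration rather than the player's actual logic: the class name is hypothetical, and the parsing handles only the quoted URI attribute of EXT-X-KEY and plain URI lines.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical parser, not part of the Sonos API.
class MediaPlaylistSketch {
    // Matches the quoted URI attribute inside an EXT-X-KEY tag.
    private static final Pattern KEY_URI = Pattern.compile("URI=\"([^\"]*)\"");

    final List<String> keyUris = new ArrayList<>();
    final List<String> segmentUris = new ArrayList<>();

    static MediaPlaylistSketch parse(String m3u8) {
        MediaPlaylistSketch playlist = new MediaPlaylistSketch();
        for (String raw : m3u8.split("\\r?\\n")) {
            String line = raw.trim();
            if (line.startsWith("#EXT-X-KEY:")) {
                // Content key URI: the player would issue a getContentKey request for it.
                Matcher matcher = KEY_URI.matcher(line);
                if (matcher.find()) {
                    playlist.keyUris.add(matcher.group(1));
                }
            } else if (!line.isEmpty() && !line.startsWith("#")) {
                // Any non-tag, non-empty line is a segment URI fetched with HTTP GET.
                playlist.segmentUris.add(line);
            }
        }
        return playlist;
    }
}
```

In a playlist where one EXT-X-KEY tag precedes several segments, keyUris holds a single entry while segmentUris holds each segment, which reflects how one content key can cover more than one segment.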

When a Sonos player sends your service a getContentKey request with additional security information, your service responds by providing Sonos players with the following, depending on the level of encryption your service implements: 

Basic Encryption Response

  • device session token
  • content key

Strong Encryption Response

  • device session token
  • content key encrypted with the device session key using secret key encryption
  • device session key encrypted with public key encryption

A service implementation of encryption for getContentKey is described as follows. See also the Java Code Example for a sample implementation. 

  1. Access the device certificate.
    Your implementation uses the device certificate to verify the player and to implement the device session token and the device session key, described below. 

  2. Create the device session token.
    Create a device session token that represents a session between a Sonos player and your service. The player and your service exchange the same token until you decide your service should send a new one. 

  3. For strong encryption, create a device session key to encrypt the content key.
    Your service may continue to use the same device session key to encrypt content keys for as many of a player's requests as you wish. (For basic encryption, you do not use a device session key.)

  4. Process the content key.
    For strong encryption, encrypt the content key using the device session key. Ensure the encryption routine is one Sonos supports. See the deviceSessionKey description of getContentKey for details. (For basic encryption, the content key is unencrypted.) 

  5. Set the getContentKeyResponse elements to be returned to the player. 

    • Set the uri element to the URI of the associated audio stream.

    • Set the deviceSessionToken element to a string representation of your device session token.

    • Set the deviceSessionKey element for strong encryption.
      Encrypt your device session key using the public device key and then set the deviceSessionKey element to a string representation of the encrypted device session key. Set the type attribute to the symmetric encryption routine you used to encrypt the content key.

    • Set the contentKey element value.
      For basic encryption, set the contentKey value to a string representation of the content key and the optional initialization vector (IV) your service used to encrypt the content. For strong encryption, set the contentKey value to a string representation of the encrypted content key and encrypted IV. 

    • Set the content key encryption type.
      Set the type attribute of the contentKey element to the type of secret key encryption that was used to encrypt the content. See the contentKey description of getContentKey for details. 

A basic encryption response to a getContentKey request for an HLS content key:

<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/" xmlns:ns="http://www.sonos.com/Services/1.1">
    <soap:Body>
        <ns:getContentKeyResponse>
            <ns:uri>...</ns:uri>
            <ns:deviceSessionToken>234567890-90</ns:deviceSessionToken>
            <ns:contentKey type="AES-CBC">0123456789ABCDEF0123456789ABCDEF:FEDBCA9876543210FEDBCA9876543210</ns:contentKey>
        </ns:getContentKeyResponse>
    </soap:Body>
</soap:Envelope>

A strong encryption response to a getContentKey request for an HLS content key, where the content key is encrypted using the device session key:

<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/" xmlns:ns="http://www.sonos.com/Services/1.1">
    <soap:Body>
        <ns:getContentKeyResponse>
            <ns:uri>...</ns:uri>
            <ns:deviceSessionToken>234567890-90</ns:deviceSessionToken>
            <ns:deviceSessionKey type="AES-ECB">FEDBCA9876543210FEDBCA9876543210</ns:deviceSessionKey>
            <ns:contentKey type="AES-CBC">0123456789ABCDEF0123456789ABCDEF:FEDBCA9876543210FEDBCA9876543210</ns:contentKey>
        </ns:getContentKeyResponse>
    </soap:Body>
</soap:Envelope>


When the Sonos player gets a response from getContentKey, it decrypts the device session key using its private device key. The player then uses the decrypted device session key to decrypt the content key. Since a retrieved content key for HLS will often be used to decrypt more than one HLS segment, the player caches the decrypted content key for use when the player processes the HLS segment URIs.

Java Code Example

The following Java code shows an example of a service's encryption implementation for strong encryption:

// ==========  Strong Encryption Implementation ================================
//
// ---getMediaURI parameters---     ---getContentKey parameters---
// trackId (to requested track)     uri (to requested HLS content key) 
// deviceSessionToken               deviceSessionToken
//                                  streamId (for HLS stream currently playing)                                
. . . 
 
// Get the device certificate from the request header.
X509Certificate cert = requestHeaders.getX509DeviceCert();
 
// Set the device session token (not shown).
. . .
 
// Get the content key information. 
ContentKey encryptionKeys = new ContentKey();
DRMInfo drmInfo = catalogDatabase.getDRMInfo(trackId);   // for getMediaURI 
// DRMInfo drmInfo = catalogDatabase.getDRMInfo(uri);    // for getContentKey
 
// Create a random 128-bit AES device session key. 
KeyGenerator keyGen = KeyGenerator.getInstance("AES");
keyGen.init(128);
SecretKey sessionKey = keyGen.generateKey();
 
// Encrypt the content key with the device session key.
EncryptionContext contentKey = new EncryptionContext();
try {
    // AES/ECB/NoPadding requires the key and IV lengths to be multiples of 16 bytes.
    byte[] contentKeyValue = Hex.decodeHex(drmInfo.getEncryptionKey().toCharArray());
    Cipher encryptCipher = Cipher.getInstance("AES/ECB/NoPadding"); 
    encryptCipher.init(Cipher.ENCRYPT_MODE, sessionKey);   
    String encryptedKey = Hex.encodeHexString(encryptCipher.doFinal(contentKeyValue));
    // doFinal resets the cipher to its initialized state, so it can be reused for the IV.
    byte[] contentIvValue = Hex.decodeHex(drmInfo.getIv().toCharArray());
    String encryptedIv = Hex.encodeHexString(encryptCipher.doFinal(contentIvValue));
 
    contentKey.setValue(encryptedKey + ":" + encryptedIv);  
 
    contentKey.setType(EncryptionType.valueOf(drmInfo.getType())); 
    encryptionKeys.setContentKey(contentKey);
}
. . .
 
//  Encrypt the device session key with the public device key.
EncryptionContext deviceSessionKey = new EncryptionContext();
try {
    PublicKey certPublicKey = cert.getPublicKey();  
    Cipher encryptCipher = Cipher.getInstance("RSA/ECB/OAEPWithSHA1AndMGF1Padding");
    encryptCipher.init(Cipher.ENCRYPT_MODE, certPublicKey);    
    byte[] encryptedMessage = encryptCipher.doFinal(sessionKey.getEncoded()); 
 
    String hexMessage = Hex.encodeHexString(encryptedMessage);  
    deviceSessionKey.setValue(hexMessage);
 
    deviceSessionKey.setType(EncryptionType.AES_ECB);    
    encryptionKeys.setDeviceSessionKey(deviceSessionKey);
}
. . .
 
// Return the response.
GetMediaURIResponse response = new GetMediaURIResponse();        // for getMediaURI
// GetContentKeyResponse response = new GetContentKeyResponse(); // for getContentKey
response.setContentKey(encryptionKeys);
. . .