Secure Audio Streaming

TL;DR
The article outlines the challenges and solutions involved in securing audio streaming for the Knížkomat project, where it was crucial to prevent unauthorized downloads of audiobooks. It highlights the limitations of the standard HTML <audio> tag, which can expose audio file URLs and allow direct downloads. The proposed solution involves using a hidden audio tag with a custom control panel and middleware to manage user authorization and secure requests. This setup prevents users from easily downloading content while allowing controlled access, achieving a balance between usability and security.

One of the requirements for the Knížkomat project was to "ensure that audio books are played so that people cannot download them". In practice, it is not possible to completely prevent downloading: if the audio is played on the user's computer, nothing stops them from recording it with external programs. The same applies to playing videos or simply displaying text. All information can be downloaded somehow. So the real challenge was to block the main and easiest ways of downloading audio from a website. Later on, the problem extended to preventing a user from playing a recording at all if they are not logged in or should not have access to it, since the basic implementation of audio on the web does not provide this either.

Problem Formulation

If we start from the basic implementation of audio playback using the HTML <audio> tag, which contains a link to an audio file located somewhere on the server, this creates several problems.

  1. This element often provides a direct download button for the audio it plays.
  2. A knowledgeable user can view the source code of the page to find the address of the audio file. When they open this address, the file is downloaded in its entirety.
  3. The audio tag generates its own requests to retrieve the individual parts of the audio file for loading. These requests contain only the cookies that the user has set and the URL of the file itself. The cookies may store the user's identification, but in the case of Knížkomat the cookies are only used for authorization between the frontend and the middleware (the middleware authorizes itself to the backend via JWT tokens), so a backend that receives a request for a part of the audio would not be able to identify the user. It would also be possible to add some identification to the URL that the backend could understand, but then the authenticated user could distribute that address to other unauthenticated users.

Possible Solutions

The first problem can be solved relatively easily: just hide the audio object on the page and create a custom control panel that doesn't include a download button or an easy-to-copy URL. This step alone will discourage most less savvy users, and many sites settle for this solution as it is "good enough".

The second problem is more complex. Platforms like Spotify solve it by implementing audio playback entirely themselves, so that the address of the original file is not visible in the page code, but this is a very complex and expensive approach. There are also off-the-shelf solutions for audio playback, but these often have problems with continuous loading, scrubbing through the audio track, and so on. In this regard, the basic audio tag is the best possible foundation.

The third problem is again only solvable by a custom implementation of audio playback, where the programmer uses custom formats for communication between the frontend and backend. But even here, the user can track the requests that leave their computer and obtain the file address from them.

Our Solution

Although the third problem is caused by our use of middleware, in the end it is this layer that plays a key role in the security of our solution.

First, let's introduce the frontend implementation. Here we use the standard HTML audio tag, but keep it hidden and build a custom control panel that interacts with it using JavaScript. This gives us both a user-friendly, good-looking control and a robust system of sequential audio loading, i.e. streaming, which is handled by the audio tag itself.
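The relationship between the hidden audio tag and the custom control panel can be sketched roughly as follows. This is a minimal illustration, not the actual Knížkomat code: the `AudioLike` interface and the `PlayerControls` class are hypothetical names, and the interface lists only the handful of standard `HTMLAudioElement` members the panel needs, so the controller can also be exercised without a browser.

```typescript
// Minimal subset of HTMLAudioElement that the control panel relies on.
// A real (hidden) <audio> element satisfies this interface structurally.
interface AudioLike {
  currentTime: number;
  readonly duration: number;
  readonly paused: boolean;
  play(): Promise<void>;
  pause(): void;
}

// Hypothetical controller behind the custom control panel.
// It exposes no download action and never surfaces the audio URL.
class PlayerControls {
  constructor(private readonly audio: AudioLike) {}

  // Play/pause button handler.
  toggle(): void {
    if (this.audio.paused) {
      // play() may reject (for example because of autoplay policies);
      // in that case the element simply stays paused.
      this.audio.play().catch(() => {});
    } else {
      this.audio.pause();
    }
  }

  // Seek to a fraction (0..1) of the track, e.g. from a custom progress bar.
  seekToFraction(fraction: number): void {
    const duration = this.audio.duration;
    if (!isFinite(duration) || duration <= 0) return; // metadata not loaded yet
    const clamped = Math.min(1, Math.max(0, fraction));
    this.audio.currentTime = clamped * duration;
  }

  // Playback progress as a fraction (0..1) for rendering the progress bar.
  progress(): number {
    const duration = this.audio.duration;
    if (!isFinite(duration) || duration <= 0) return 0;
    return this.audio.currentTime / duration;
  }
}
```

In a browser, the controller would presumably be constructed with the hidden element itself, for example the result of `document.querySelector("audio")`, while buffering and seeking over the network remain the audio tag's job.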

Requests from the audio tag arrive at the middleware, where the user is authorized by the cookies that came with the request. This ensures that the user can actually access the audio. It would be tempting to add the user's JWT token to the request and send it to the backend. However, if the user somehow found out the address of the backend API (by default they are shielded from it by the middleware, but it is a public URL) and their JWT token (which they also don't have access to by default), they could download the entire file at once from the backend API. Therefore, the middleware adds a secret code shared between the middleware and the backend to the request, confirming that the request was indeed generated in the middleware. Since the middleware and the backend run on the same server, this communication takes place over localhost, so there is no need to worry about this code being exposed.
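The middleware step described above might look something like the sketch below. Everything concrete in it is an assumption made for illustration: the `session` cookie name, the `x-middleware-secret` header, the `/audio/<id>` backend path, the `Session` shape, and the `lookupSession` callback are all hypothetical, and the function only builds the request to forward rather than sending it.

```typescript
// Hypothetical shapes; the real project may model these differently.
interface IncomingRequest {
  path: string; // e.g. "/stream/book-1"
  headers: Record<string, string | undefined>;
}

interface Session {
  userId: string;
  allowedAudio: readonly string[]; // audio IDs this user may play
}

interface ForwardedRequest {
  url: string;
  headers: Record<string, string>;
}

type MiddlewareResult =
  | { ok: true; forward: ForwardedRequest }
  | { ok: false; status: 401 | 403 };

// Assumed header name for the secret shared by middleware and backend.
const SECRET_HEADER = "x-middleware-secret";

// Extract one cookie value from a Cookie header such as "a=1; session=abc".
function readCookie(cookieHeader: string | undefined, name: string): string | undefined {
  if (!cookieHeader) return undefined;
  for (const part of cookieHeader.split(";")) {
    const eq = part.indexOf("=");
    if (eq === -1) continue;
    if (part.slice(0, eq).trim() === name) return part.slice(eq + 1).trim();
  }
  return undefined;
}

// Authorize the user from the cookie, then build the backend request
// carrying the shared secret instead of any user-held credential.
function prepareBackendRequest(
  req: IncomingRequest,
  lookupSession: (sessionId: string) => Session | undefined,
  backendBaseUrl: string,
  sharedSecret: string,
): MiddlewareResult {
  const sessionId = readCookie(req.headers["cookie"], "session");
  const session = sessionId ? lookupSession(sessionId) : undefined;
  if (!session) return { ok: false, status: 401 }; // not logged in

  const audioId = req.path.split("/").pop() ?? "";
  if (session.allowedAudio.indexOf(audioId) === -1) {
    return { ok: false, status: 403 }; // logged in, but no access to this recording
  }

  const headers: Record<string, string> = { [SECRET_HEADER]: sharedSecret };
  // Pass the audio tag's Range header through so partial loading keeps working.
  const range = req.headers["range"];
  if (range) headers["range"] = range;

  return {
    ok: true,
    forward: { url: `${backendBaseUrl}/audio/${encodeURIComponent(audioId)}`, headers },
  };
}
```

The point of the design is that the only credential the backend accepts never reaches the browser: the user holds a cookie that means something only to the middleware, and the secret is attached on the server side.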

So let's review how our solution eliminates the outlined problems:

  1. We use a custom control panel that does not allow easy download via a button.
  2. The URL and all requests that the user sees are directed to the middleware, which does not have access to the entire audio file at once and is therefore unable to provide it to the user for download.
  3. Even if the user gets the actual address of the file on the backend, they cannot download it because they don't know the secret code.
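The backend side of the third point can be illustrated with a short sketch. It assumes, purely for illustration, that the secret code travels in a request header named `x-middleware-secret`; the header name and both function names are hypothetical. The hand-written comparison avoids returning early on the first mismatching character, although a production backend would more likely rely on a vetted primitive such as Node's `crypto.timingSafeEqual`.

```typescript
// Assumed header name for the secret shared by middleware and backend.
const SECRET_HEADER = "x-middleware-secret";

// Compare two strings without short-circuiting on the first difference,
// so the time taken does not depend on how many leading characters match.
function constantTimeEqual(a: string, b: string): boolean {
  let diff = a.length ^ b.length;
  const len = Math.max(a.length, b.length);
  for (let i = 0; i < len; i++) {
    // charCodeAt returns NaN past the end; bitwise operators treat NaN as 0.
    diff |= a.charCodeAt(i) ^ b.charCodeAt(i);
  }
  return diff === 0;
}

// Decide whether a backend request really came from the middleware.
function isFromMiddleware(
  headers: Record<string, string | undefined>,
  expectedSecret: string,
): boolean {
  // Refuse everything if the secret is not configured, rather than
  // accepting requests that carry an empty header.
  if (expectedSecret.length === 0) return false;
  const provided = headers[SECRET_HEADER];
  if (provided === undefined) return false;
  return constantTimeEqual(provided, expectedSecret);
}
```

A request that arrives at the backend's public URL with a valid address but without the header, or with a guessed value, would be rejected before any audio data is read.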

Routing requests through the middleware in this way does not introduce any significant slowdown in playback and meets all the requirements to a reasonable degree. Of course, we realize that our solution is not perfect, and we can think of several possible workarounds ourselves. Nevertheless, audio streaming on Knížkomat is now secured beyond the capabilities of both average and knowledgeable users.


NoxLabs is a team of engineers and designers specializing in web and mobile development. We're passionate about building beautiful software and welcome new project ideas.
