Audio

What is a practical OPUS bitrate for podcasts?

My recommendation for podcast audio is:

  • If the OPUS file is mainly intended for listening, choose 32 kbit/s with VBR. This will cause a little distortion in the voices.
  • If the OPUS file is mainly intended for archival and later listening, choose 48 kbit/s with VBR. For speech, an OPUS file at 48 kbit/s is hardly distinguishable from the original WAV file even with headphones, unless you specifically compare the two (as in A-B tests).
  • For pure archival, where you want to avoid generation loss, you can of course use lossless FLAC encoding. However, practically speaking, a VBR-enabled OPUS file at 64 kbit/s (or 96 kbit/s if you’re just too paranoid) is so transparent that it’s hardly worth spending the huge amount of hard drive space that FLAC requires.

Note that VBR (variable bitrate) should always be enabled for speech data, since most podcast/speech-like data contains lots of silence. Therefore, using pure constant bitrate is strongly discouraged.
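
To get a feeling for what these bitrates mean in terms of storage, the following minimal sketch estimates the file size for a given average bitrate and duration. The function name estimate_size_mib is made up for this example. Since VBR only targets an average bitrate and container overhead is ignored here, treat the result as a rough estimate.

```python
def estimate_size_mib(bitrate_kbit_s: float, duration_s: float) -> float:
    """Rough file size in MiB for an average bitrate (kbit/s) and a duration (s)."""
    total_bits = bitrate_kbit_s * 1000 * duration_s
    return total_bits / 8 / (1024 * 1024)

# One hour of speech at the bitrates discussed above
for bitrate in (32, 48, 64, 96):
    size = estimate_size_mib(bitrate, 3600)
    print(f"{bitrate} kbit/s: approx. {size:.1f} MiB per hour")
```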

Posted by Uli Köhler in Audio

How to re-encode all WAV files in the current directory to OPUS

for i in *.wav ; do ffmpeg -i "${i}" -c:a libopus -vbr on -compression_level 10 "${i}.opus" ; done
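
If you prefer Python over a shell loop, the same batch conversion could be sketched as follows. The helper name build_opus_commands is hypothetical; the ffmpeg options are the ones from the command above. Note that this sketch names the output file.opus rather than file.wav.opus, and that ffmpeg is only invoked when the script is executed directly.

```python
import subprocess
from pathlib import Path

def build_opus_commands(directory="."):
    """Return one ffmpeg command (as an argument list) per WAV file in directory."""
    commands = []
    for wav in sorted(Path(directory).glob("*.wav")):
        commands.append([
            "ffmpeg", "-i", str(wav),
            "-c:a", "libopus", "-vbr", "on", "-compression_level", "10",
            str(wav.with_suffix(".opus")),  # replace .wav by .opus
        ])
    return commands

if __name__ == "__main__":
    for command in build_opus_commands():
        # check=True raises an exception if ffmpeg exits with an error
        subprocess.run(command, check=True)
```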

 

Posted by Uli Köhler in Audio

Python script to re-encode all audio files to OPUS recursively

This Python script recursively converts all FLAC, WAV, MP3 and M4A files in the given input directory to OPUS. The default bitrate is 96k, which is roughly equivalent to a 192k to 256k MP3. You can select a different bitrate using --bitrate.

After encoding, the lengths of the source file and the converted file are compared automatically. If they differ by 0.2 seconds or more, the encoding is considered failed and the converted file is removed.

There is an option --delete to delete the source file (only if the OPUS file has equivalent length to the source file of course). Use with caution and make backups before using, no warranty is expressed or implied for this script – it might lose some of your music!

The script automatically performs parallel encoding, hence it’s pretty fast even if transcoding many files.

No files are overwritten unless -o/--overwrite is given on the command line.

How to use

First, install ffmpeg and ffprobe. Now install the python binding for ffmpeg using

pip install ffmpeg-python

Here are the command-line options the script provides:

usage: ConvertToOPUS.py [-h] [--bitrate BITRATE] [--delete] [-o] [-j THREADS] directory

Transcode MP3, FLAC & WAV files to OPUS.

positional arguments:
  directory             Directory containing the files to transcode

options:
  -h, --help            show this help message and exit
  --bitrate BITRATE     Bitrate for the OPUS files
  --delete              Delete original files after successful transcoding
  -o, --overwrite       Overwrite existing OPUS files
  -j THREADS, --threads THREADS
                        Number of parallel transcodes

Source code

#!/usr/bin/env python3
import os
import sys
import argparse
import ffmpeg
from concurrent.futures import ProcessPoolExecutor
import logging

logging.basicConfig(level=logging.INFO, format='%(levelname)s: %(message)s')

def get_file_duration(file_path):
    try:
        probe = ffmpeg.probe(file_path)
        audio_stream = next((stream for stream in probe['streams'] if stream['codec_type'] == 'audio'), None)
        return float(audio_stream['duration'])
    except Exception as e:
        logging.error(f"Failed to get duration for {file_path}: {e}")
        return None

def transcode_file(file_path, bitrate, delete_after_transcode, overwrite):
    try:
        output_file = file_path.rsplit('.', 1)[0] + '.opus'
        if not overwrite and os.path.exists(output_file):
            logging.info(f"{output_file} exists. Skipping due to no overwrite flag.")
            return

        stream = (
            ffmpeg
            .input(file_path)
            .output(output_file, ab=bitrate, loglevel="error")
        )

        if overwrite:
            stream = stream.overwrite_output()

        stream.run()

        original_duration = get_file_duration(file_path)
        transcoded_duration = get_file_duration(output_file)

        if original_duration is None or transcoded_duration is None:
            logging.error(f"Failed to transcode {file_path}. Could not retrieve file duration.")
            os.remove(output_file)
            return

        duration_diff = abs(original_duration - transcoded_duration)
        if duration_diff >= 0.2:
            logging.error(f"Transcoding failed for {file_path}. Duration mismatch.")
            os.remove(output_file)
        else:
            logging.info(f"Transcoded {file_path} to {output_file} successfully.")
            if delete_after_transcode:
                os.remove(file_path)
    except Exception as e:
        logging.error(f"Failed to transcode {file_path}: {e}")

def main():
    parser = argparse.ArgumentParser(description="Transcode MP3, FLAC & WAV files to OPUS.")
    parser.add_argument("directory", type=str, help="Directory containing the files to transcode")
    parser.add_argument("--bitrate", type=str, default="96k", help="Bitrate for the OPUS files")
    parser.add_argument("--delete", action="store_true", help="Delete original files after successful transcoding")
    parser.add_argument("-o", "--overwrite", action="store_true", help="Overwrite existing OPUS files")
    parser.add_argument("-j", "--threads", type=int, default=os.cpu_count(), help="Number of parallel transcodes")

    args = parser.parse_args()

    supported_extensions = ['mp3', 'flac', 'wav', 'm4a']

    # List all files with the supported extensions in the directory recursively
    files_to_transcode = []
    for root, dirs, files in os.walk(args.directory):
        for file in files:
            if file.split('.')[-1].lower() in supported_extensions:
                files_to_transcode.append(os.path.join(root, file))

    logging.info(f"Found {len(files_to_transcode)} files to transcode.")

    with ProcessPoolExecutor(max_workers=args.threads) as executor:
        for file_path in files_to_transcode:
            executor.submit(transcode_file, file_path, args.bitrate, args.delete, args.overwrite)

if __name__ == '__main__':
    main()
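
get_file_duration() above relies on the audio stream reporting a duration field, which some containers may not provide. A possible refinement, sketched here as an assumption and not part of the original script, is to fall back to the container-level duration from the format section of the ffprobe output. The helper name duration_from_probe is hypothetical; it takes the dictionary returned by ffmpeg.probe() so that it can be tried without any audio file.

```python
def duration_from_probe(probe):
    """Return the duration in seconds from an ffprobe result dict, or None."""
    # Prefer the first audio stream that reports a duration
    for stream in probe.get("streams", []):
        if stream.get("codec_type") == "audio" and "duration" in stream:
            return float(stream["duration"])
    # Fall back to the container-level duration, if present
    format_duration = probe.get("format", {}).get("duration")
    return float(format_duration) if format_duration is not None else None

# Hand-written dictionaries mimicking the structure of ffprobe's JSON output
with_stream_duration = {"streams": [{"codec_type": "audio", "duration": "12.5"}]}
only_format_duration = {"streams": [{"codec_type": "audio"}], "format": {"duration": "3.0"}}
print(duration_from_probe(with_stream_duration))
print(duration_from_probe(only_format_duration))
```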

 

Posted by Uli Köhler in Audio, Audio/Video, Python

How to get duration of WAV file in Python (minimal example)

Use

duration_seconds = mywav.getnframes() / mywav.getframerate()

to get the duration of a WAV file in seconds.

Full example:

import wave

with wave.open("myaudio.wav") as mywav:
    duration_seconds = mywav.getnframes() / mywav.getframerate()
    print(f"Length of the WAV file: {duration_seconds:.1f} s")
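
To try this without having a WAV file at hand, the following self-contained sketch first writes one second of silence to a temporary file and then measures it. The file name silence.wav and the helper wav_duration_seconds are made up for this example.

```python
import os
import tempfile
import wave

def wav_duration_seconds(path):
    """Return the duration of the WAV file at path in seconds."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "silence.wav")
    # Write 1 second of 16-bit mono silence at 8 kHz
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(8000)
        wav.writeframes(b"\x00\x00" * 8000)
    duration = wav_duration_seconds(path)
    print(f"Length of the WAV file: {duration:.1f} s")
```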

 

Posted by Uli Köhler in Audio, Python

How to auto-set Windows audio balance to a specific L-R difference using Python

When you can’t place your speakers equally far from your ears, you need to adjust the audio balance in order to compensate for the perceived difference in volume.

Windows allows you to compensate the audio volume natively using the system settings – however it has one critical issue: If you ever set your audio volume to zero, your balance settings get lost and you need to click through plenty of dialogs in order to re-configure it.

In our previous post How to set Windows audio balance using Python we showed how to use the pycaw library to set the audio balance (see that post for installation instructions etc).

The following Python script sets the audio balance to a specific left-right difference. It has been designed to keep the mean audio level in dB, i.e. (L+R)/2, when adjusting the volume (i.e. it will not change the overall volume and hence avoids blowing out your eardrums) and will not make any adjustment if the balance is already within 0.1 dB of the desired value.

Set desiredDelta to your desired left-right difference in dB (positive values mean that the left speaker will be louder than the right speaker)!

from ctypes import cast, POINTER
from comtypes import CLSCTX_ALL
from pycaw.pycaw import AudioUtilities, IAudioEndpointVolume
import math

# Get default audio device using PyCAW
devices = AudioUtilities.GetSpeakers()
interface = devices.Activate(
    IAudioEndpointVolume._iid_, CLSCTX_ALL, None)
volume = cast(interface, POINTER(IAudioEndpointVolume))

# Get current volume of the left and right channels (in dB)
volumeL = volume.GetChannelVolumeLevel(0)
volumeR = volume.GetChannelVolumeLevel(1)
print(f"Before adjustment: L={volumeL:.2f} dB, R={volumeR:.2f} dB")

desiredDelta = 6.0 # Desired delta between L and R. Positive means L is louder!

# Signed difference: positive means L is currently louder than R
delta = volumeL - volumeR
mean = (volumeL + volumeR) / 2.

# Re-configure balance if delta is not within 0.1 dB of the desired delta
if abs(delta - desiredDelta) > 0.1:
    # Adjust volume
    volume.SetChannelVolumeLevel(0, mean + desiredDelta/2., None) # Left
    volume.SetChannelVolumeLevel(1, mean - desiredDelta/2., None) # Right
    # Get & print new volume
    volumeL = volume.GetChannelVolumeLevel(0)
    volumeR = volume.GetChannelVolumeLevel(1)
    print(f"After adjustment: L={volumeL:.2f} dB, R={volumeR:.2f} dB")
else:
    print("No adjustment necessary")
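
The core arithmetic of the script, separated from the pycaw calls, can be sketched as a pure function that runs on any platform. The name balanced_levels is hypothetical.

```python
def balanced_levels(volume_l, volume_r, desired_delta):
    """Return new (left, right) levels in dB that keep the mean level
    but differ by desired_delta dB (positive: left is louder)."""
    mean = (volume_l + volume_r) / 2.0
    return mean + desired_delta / 2.0, mean - desired_delta / 2.0

# Both channels at -20 dB, left should end up 6 dB louder than right
left, right = balanced_levels(-20.0, -20.0, 6.0)
print(f"L={left:.2f} dB, R={right:.2f} dB")
```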

 

Posted by Uli Köhler in Audio, Python, Windows

How to set Windows audio balance using Python

In our previous post we showed how to set the Windows audio volume using pycaw.

First, we install the library using

pip install pycaw

Note: pycaw does not work with WSL (Windows Subsystem for Linux)! You actually need to install it using a Python environment running on Windows. I recommend Anaconda.

In order to set the audio balance, we can use volume.SetChannelVolumeLevel(...):

from ctypes import cast, POINTER
from comtypes import CLSCTX_ALL
from pycaw.pycaw import AudioUtilities, IAudioEndpointVolume
import math

# Get default audio device using PyCAW
devices = AudioUtilities.GetSpeakers()
interface = devices.Activate(
    IAudioEndpointVolume._iid_, CLSCTX_ALL, None)
volume = cast(interface, POINTER(IAudioEndpointVolume))

# Get current volume of the left channel
currentVolumeLeft = volume.GetChannelVolumeLevel(0)
# Set the volume of the right channel to 6 dB below the volume of the left channel
volume.SetChannelVolumeLevel(1, currentVolumeLeft - 6.0, None)
# NOTE: -6.0 dB is approximately half the amplitude!

Note that by convention, the left channel is channel 0 and the right channel is channel 1. Depending on the type of sound card, there might be as few as 1 channel (e.g. a mono headset) or many channels, as in a multichannel USB audio interface. Use volume.GetChannelCount() to get the number of channels.

Posted by Uli Köhler in Audio, Python, Windows

How to set Windows audio volume using Python

We can use the pycaw library to set the Windows Audio volume using Python.

First, we install the library using

pip install pycaw

Note: pycaw does not work with WSL (Windows Subsystem for Linux)! You actually need to install it using a Python environment running on Windows. I recommend Anaconda.

Now we can set the volume to half the current volume using this script:

from ctypes import cast, POINTER
from comtypes import CLSCTX_ALL
from pycaw.pycaw import AudioUtilities, IAudioEndpointVolume
import math

# Get default audio device using PyCAW
devices = AudioUtilities.GetSpeakers()
interface = devices.Activate(
    IAudioEndpointVolume._iid_, CLSCTX_ALL, None)
volume = cast(interface, POINTER(IAudioEndpointVolume))

# Get current volume 
currentVolumeDb = volume.GetMasterVolumeLevel()
volume.SetMasterVolumeLevel(currentVolumeDb - 6.0, None)
# NOTE: -6.0 dB is approximately half the amplitude!
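
The -6.0 dB figure follows from the logarithmic definition of the decibel: for amplitudes, a ratio r corresponds to 20·log10(r) dB, so halving the amplitude gives roughly -6.02 dB. A short sketch of the conversion in both directions (the function names are made up for this example):

```python
import math

def ratio_to_db(amplitude_ratio):
    """Convert an amplitude ratio (e.g. 0.5) to decibels."""
    return 20.0 * math.log10(amplitude_ratio)

def db_to_ratio(db):
    """Convert a level change in decibels to an amplitude ratio."""
    return 10.0 ** (db / 20.0)

print(f"Half amplitude: {ratio_to_db(0.5):.2f} dB")
print(f"-6.0 dB as amplitude ratio: {db_to_ratio(-6.0):.3f}")
```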

 

Posted by Uli Köhler in Audio, Python, Windows

How to re-encode your Audiobooks as Opus

Opus is a modern high-efficiency audio codec that is especially suited to encode speech with very low bitrates.

Therefore, it’s a good fit to compress your Audiobook library so it consumes much less space.

First, choose a bitrate for Opus. I recommend using 24 kbit/s (24k) for general use, or 32 kbit/s (32k) if you want higher-quality audio, e.g. if you are listening with good-quality headphones.

You can use ffmpeg directly by using this syntax:

ffmpeg -i <input file> -c:a libopus -b:a <bitrate> <output file>

but I recommend using this shell function instead:

function audioToOpus { ffmpeg -i "$2" -c:a libopus -b:a "$1" "${2%.*}.opus" ; }

Copy & paste it into your shell, then call it like this:

audioToOpus <bitrate> <input file>

Example:

audioToOpus 24k myaudiobook.mp3

This command will create myaudiobook.opus. myaudiobook.mp3 will not be deleted automatically.

In case you want to have this function available permanently, add the function definition to your ~/.bashrc or ~/.zshrc, depending on which shell you use.

Posted by Uli Köhler in Audio, Linux