Audio/Video

How to clone a specific ffmpeg version from git

git clone -b n6.1 --depth 1 https://git.ffmpeg.org/ffmpeg.git ffmpeg

This will clone version 6.1 (-b n6.1) and only perform a shallow clone (--depth 1), i.e. only download the files necessary for this specific version instead of the full history.

Posted by Uli Köhler in Audio/Video, Video

Python script to re-encode all audio files to OPUS recursively

This Python script recursively converts all FLAC, WAV, MP3 and M4A files in the given input directory to OPUS. The default bitrate is 96k, which is roughly equivalent to 192k to 256k MP3. You can select a different bitrate using --bitrate.

After encoding, the lengths of the source file and the converted file are compared automatically. If they differ by 0.2 seconds or more, the encoding is considered failed and the converted file is removed.

There is an option --delete to delete the source file (only if the OPUS file has equivalent length to the source file of course). Use with caution and make backups before using, no warranty is expressed or implied for this script – it might lose some of your music!

The script automatically performs parallel encoding, hence it’s pretty fast even if transcoding many files.

No files are overwritten unless -o/--overwrite is given on the command line.

How to use

First, install ffmpeg and ffprobe. Then install the Python binding for ffmpeg using

pip install ffmpeg-python

Here are the command line options the script provides:

usage: ConvertToOPUS.py [-h] [--bitrate BITRATE] [--delete] [-o] [-j THREADS] directory

Transcode MP3, FLAC & WAV files to OPUS.

positional arguments:
  directory             Directory containing the files to transcode

options:
  -h, --help            show this help message and exit
  --bitrate BITRATE     Bitrate for the OPUS files
  --delete              Delete original files after successful transcoding
  -o, --overwrite       Overwrite existing OPUS files
  -j THREADS, --threads THREADS
                        Number of parallel transcodes

Source code

#!/usr/bin/env python3
import os
import sys
import argparse
import ffmpeg
from concurrent.futures import ProcessPoolExecutor
import logging

logging.basicConfig(level=logging.INFO, format='%(levelname)s: %(message)s')

def get_file_duration(file_path):
    try:
        probe = ffmpeg.probe(file_path)
        audio_stream = next((stream for stream in probe['streams'] if stream['codec_type'] == 'audio'), None)
        return float(audio_stream['duration'])
    except Exception as e:
        logging.error(f"Failed to get duration for {file_path}: {e}")
        return None

def transcode_file(file_path, bitrate, delete_after_transcode, overwrite):
    try:
        output_file = file_path.rsplit('.', 1)[0] + '.opus'
        if not overwrite and os.path.exists(output_file):
            logging.info(f"{output_file} exists. Skipping due to no overwrite flag.")
            return

        stream = (
            ffmpeg
            .input(file_path)
            .output(output_file, ab=bitrate, loglevel="error")
        )

        if overwrite:
            stream = stream.overwrite_output()

        stream.run()

        original_duration = get_file_duration(file_path)
        transcoded_duration = get_file_duration(output_file)

        if original_duration is None or transcoded_duration is None:
            logging.error(f"Failed to transcode {file_path}. Could not retrieve file duration.")
            os.remove(output_file)
            return

        duration_diff = abs(original_duration - transcoded_duration)
        if duration_diff >= 0.2:
            logging.error(f"Transcoding failed for {file_path}. Duration mismatch.")
            os.remove(output_file)
        else:
            logging.info(f"Transcoded {file_path} to {output_file} successfully.")
            if delete_after_transcode:
                os.remove(file_path)
    except Exception as e:
        logging.error(f"Failed to transcode {file_path}: {e}")

def main():
    parser = argparse.ArgumentParser(description="Transcode MP3, FLAC & WAV files to OPUS.")
    parser.add_argument("directory", type=str, help="Directory containing the files to transcode")
    parser.add_argument("--bitrate", type=str, default="96k", help="Bitrate for the OPUS files")
    parser.add_argument("--delete", action="store_true", help="Delete original files after successful transcoding")
    parser.add_argument("-o", "--overwrite", action="store_true", help="Overwrite existing OPUS files")
    parser.add_argument("-j", "--threads", type=int, default=os.cpu_count(), help="Number of parallel transcodes")

    args = parser.parse_args()

    supported_extensions = ['mp3', 'flac', 'wav', 'm4a']

    # List all files with the supported extensions in the directory recursively
    files_to_transcode = []
    for root, dirs, files in os.walk(args.directory):
        for file in files:
            if file.split('.')[-1].lower() in supported_extensions:
                files_to_transcode.append(os.path.join(root, file))

    logging.info(f"Found {len(files_to_transcode)} files to transcode.")

    with ProcessPoolExecutor(max_workers=args.threads) as executor:
        for file_path in files_to_transcode:
            executor.submit(transcode_file, file_path, args.bitrate, args.delete, args.overwrite)

if __name__ == '__main__':
    main()

 

Posted by Uli Köhler in Audio, Audio/Video, Python

How to capture image as NumPy array using PiCamera2

This will capture a Raspberry Pi camera image as a NumPy array.

The default size that will be used is 640x480px.

#!/usr/bin/env python3
import time
import picamera2
import numpy as np

with picamera2.Picamera2() as camera:
    camera.start()
    time.sleep(1)
    array = camera.capture_array("main")
    # TODO Do something with array
    print(array.shape)

Example output:

[0:27:57.224504277] [3117]  INFO Camera camera_manager.cpp:297 libcamera v0.0.5+83-bde9b04f
[0:27:57.258472502] [3118]  INFO RPI vc4.cpp:437 Registered camera /base/soc/i2c0mux/i2c@1/imx477@1a to Unicam device /dev/media3 and ISP device /dev/media0
[0:27:57.258611296] [3118]  INFO RPI pipeline_base.cpp:1101 Using configuration file '/usr/share/libcamera/pipeline/rpi/vc4/rpi_apps.yaml'
[0:27:57.264790966] [3117]  INFO Camera camera.cpp:1033 configuring streams: (0) 640x480-XBGR8888 (1) 2028x1520-SBGGR12_CSI2P
[0:27:57.265395993] [3118]  INFO RPI vc4.cpp:565 Sensor: /base/soc/i2c0mux/i2c@1/imx477@1a - Selected sensor format: 2028x1520-SBGGR12_1X12 - Selected unicam format: 2028x1520-pBCC
(480, 640, 4)
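The fourth channel in the (480, 640, 4) shape comes from the XBGR8888 stream format shown in the log. Here is a minimal sketch of how you could drop it using NumPy slicing. It assumes that the channel order in the array is R, G, B, padding (which is how the Picamera2 manual describes XBGR8888). The helper name drop_padding_channel is our own, and a synthetic array stands in for camera.capture_array("main") so the example runs without a camera.

```python
import numpy as np

def drop_padding_channel(frame):
    """Return a view of the first three channels of an (H, W, 4) frame."""
    if frame.ndim != 3 or frame.shape[2] != 4:
        raise ValueError(f"Expected (H, W, 4) array, got {frame.shape}")
    return frame[:, :, :3]

# Synthetic stand-in for camera.capture_array("main")
frame = np.zeros((480, 640, 4), dtype=np.uint8)
frame[..., 3] = 255  # assumed padding channel
rgb = drop_padding_channel(frame)
print(rgb.shape)
```

Since slicing returns a view, no image data is copied. Use .copy() if you need an independent array.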

 

Posted by Uli Köhler in Audio/Video, Raspberry Pi

How to list all V4L cameras using v4l2-ctl

You can list all connected cameras using

v4l2-ctl --list-devices

Example output:

bcm2835-codec-decode (platform:bcm2835-codec):
        /dev/video10
        /dev/video11
        /dev/video12
        /dev/video18
        /dev/video31
        /dev/media0

bcm2835-isp (platform:bcm2835-isp):
        /dev/video13
        /dev/video14
        /dev/video15
        /dev/video16
        /dev/video20
        /dev/video21
        /dev/video22
        /dev/video23
        /dev/media2
        /dev/media3

rpivid (platform:rpivid):
        /dev/video19
        /dev/media1

HD USB Camera: HD USB Camera (usb-0000:01:00.0-1.2):
        /dev/video0
        /dev/video1
        /dev/media4
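If you want to process this list in Python, the following sketch parses the output into a dict. It assumes the layout shown above (an unindented header line ending with a colon, followed by indented device paths). The function name parse_v4l2_list_devices is our own and not part of any library.

```python
def parse_v4l2_list_devices(output):
    """Parse `v4l2-ctl --list-devices` output into {device name: [paths]}."""
    devices = {}
    current = None
    for line in output.splitlines():
        if not line.strip():
            continue  # blank line between device blocks
        if not line[0].isspace():
            # Header line, e.g. "rpivid (platform:rpivid):"
            current = line.rstrip().rstrip(":")
            devices[current] = []
        elif current is not None:
            # Indented line: a device node belonging to the current header
            devices[current].append(line.strip())
    return devices

sample = """rpivid (platform:rpivid):
\t/dev/video19
\t/dev/media1

HD USB Camera: HD USB Camera (usb-0000:01:00.0-1.2):
\t/dev/video0
\t/dev/video1
\t/dev/media4
"""
devices = parse_v4l2_list_devices(sample)
for name, paths in devices.items():
    print(name, [p for p in paths if p.startswith("/dev/video")])
```

On a real system, you could pass subprocess.check_output(["v4l2-ctl", "--list-devices"]).decode("utf-8") instead of the sample string.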

 

Posted by Uli Köhler in Audio/Video, Linux

How to fix dji_irp: error while loading shared libraries: libdirp.so: cannot open shared object file: No such file or directory

Problem:

When you try to run dji_irp from the DJI thermal SDK on Linux, you see the following error message:

./utility/bin/linux/release_x64/dji_irp: error while loading shared libraries: libdirp.so: cannot open shared object file: No such file or directory

Solution:

The libdirp.so library is included with the SDK, but it is located in a subfolder (the same subfolder where dji_irp is located) which is not on the library search path, so the dynamic linker can’t find it.

In order to fix the issue, prefix the command you’re using with

LD_LIBRARY_PATH=./utility/bin/linux/release_x64/

For example:

LD_LIBRARY_PATH=./utility/bin/linux/release_x64/ ./utility/bin/linux/release_x64/dji_irp
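When calling dji_irp from a Python script, you can set the variable via the env argument of subprocess instead. This is a sketch only: the helper names env_with_library_path and run_dji_irp are our own, and the arguments you pass to dji_irp depend on your use case.

```python
import os
import subprocess

SDK_BIN_DIR = "./utility/bin/linux/release_x64"

def env_with_library_path(lib_dir, base_env=None):
    """Return a copy of the environment with lib_dir prepended to LD_LIBRARY_PATH."""
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("LD_LIBRARY_PATH")
    if existing:
        env["LD_LIBRARY_PATH"] = f"{lib_dir}{os.pathsep}{existing}"
    else:
        env["LD_LIBRARY_PATH"] = lib_dir
    return env

def run_dji_irp(args):
    """Run dji_irp with the given list of command line arguments."""
    executable = os.path.join(SDK_BIN_DIR, "dji_irp")
    return subprocess.run([executable, *args],
                          env=env_with_library_path(SDK_BIN_DIR),
                          check=True)
```

Prepending instead of overwriting keeps any library directories that were already configured in LD_LIBRARY_PATH.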

 

 

Posted by Uli Köhler in Audio/Video, Linux

Raspberry Pi libcamera VLC recording to H.264 (1920×1080)

On the Pi, run

libcamera-vid -t 0 --width 1920 --height 1080 --codec h264 -o out.h264

This will record Full-HD video (1920×1080) to out.h264 until you stop it (-t 0 disables the recording timeout).

Posted by Uli Köhler in Audio/Video, Raspberry Pi

How to list available cameras on Raspberry Pi (libcamera)

Use this command to list all available cameras:

libcamera-still --list-cameras

Example output:

$ libcamera-still --list-cameras
Available cameras
-----------------
0 : imx477 [4056x3040] (/base/soc/i2c0mux/i2c@1/imx477@1a)
    Modes: 'SRGGB10_CSI2P' : 1332x990 [120.05 fps - (696, 528)/2664x1980 crop]
           'SRGGB12_CSI2P' : 2028x1080 [50.03 fps - (0, 440)/4056x2160 crop]
                             2028x1520 [40.01 fps - (0, 0)/4056x3040 crop]
                             4056x3040 [10.00 fps - (0, 0)/4056x3040 crop]

 

Posted by Uli Köhler in Audio/Video, Raspberry Pi

How to fix Raspberry Pi OS raspivid: command not found

Problem:

When trying to run raspivid on Raspberry Pi OS Lite, you will see the following error message:

bash: raspivid: command not found

Solution:

In recent versions of Raspberry Pi OS, raspivid has been replaced by libcamera-vid. Therefore, use libcamera-vid instead of raspivid.
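If a script needs to work on both old and new Raspberry Pi OS versions, it can check which executable is installed. The following is a sketch under assumptions: the candidate list is ours (newer releases ship the tool under the name rpicam-vid), and find_video_tool is a hypothetical helper name.

```python
import shutil

# Candidate executables, newest name first. Adjust this list for your system.
CANDIDATES = ["rpicam-vid", "libcamera-vid", "raspivid"]

def find_video_tool(candidates=CANDIDATES, which=shutil.which):
    """Return the first candidate found on the PATH, or None."""
    for name in candidates:
        if which(name):
            return name
    return None

tool = find_video_tool()
if tool is None:
    print("No camera recording tool found")
else:
    print(f"Using {tool}")
```

Note that the command line options differ between raspivid and its successors, so the arguments need to be selected accordingly.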

Posted by Uli Köhler in Audio/Video, Raspberry Pi

How to set and verify v4l2-ctl parameters in Python using subprocess

The following code uses the v4l2-ctl executable to get and set v4l2 parameters such as exposure_absolute. It also provides means of writing a parameter and verifying if it has been set correctly.

import subprocess

def v4l2_set_parameters_once(params, device="/dev/video0"):
    """
    Given a dict of parameters:
    {
        "exposure_auto": 1,
        "exposure_absolute": 10,
    }
    this function sets those parameters using the v4l2-ctl command line executable
    """
    set_ctrl_str = ",".join([f"{k}={v}" for k,v in params.items()]) # e.g. exposure_auto=1,exposure_absolute=10
    subprocess.check_output(["v4l2-ctl", "-d", device, f"--set-ctrl={set_ctrl_str}"])

def v4l2_get_parameters(params, device="/dev/video0"):
    """
    Query a bunch of v4l2 parameters.
    params is a list like
    [
        "exposure_auto",
        "exposure_absolute"
    ]
    
    Returns a dict of values:
    {
        "exposure_auto": 1,
        "exposure_absolute": 10,
    }
    """
    get_ctrl_str = ",".join([f"{k}" for k in params])
    out = subprocess.check_output(["v4l2-ctl", "-d", device, f"--get-ctrl={get_ctrl_str}"])
    out = out.decode("utf-8")
    result = {}
    for line in out.split("\n"):
        # line should be like "exposure_auto: 1"
        if ":" not in line:
            continue
        k, _, v = line.partition(":")
        result[k.strip()] = v.strip()
    return result

def v4l2_set_params_until_effective(params, device="/dev/video0"):
    """
    Set V4L2 params and check if they have been set correctly.
    If V4L2 does not confirm the parameters correctly, they will be set again until they have an effect
    
    params is a dict like {
        "exposure_auto": 1,
        "exposure_absolute": 10,
    }
    """
    while True:
        v4l2_set_parameters_once(params, device=device)
        result = v4l2_get_parameters(params.keys(), device=device)
        # Check if queried parameters match set parameters
        had_any_mismatch = False
        for k, v in params.items():
            if k not in result:
                raise ValueError(f"Could not query {k}")
            # Note: Values from v4l2 are always strings. So we need to compare as strings
            if str(result.get(k)) != str(v):
                print(f"Mismatch in {k} = {result.get(k)} but should be {v}")
                had_any_mismatch = True
        # Check if there has been any mismatch
        if not had_any_mismatch:
            return

Usage example:

v4l2_set_params_until_effective({
    "exposure_auto": 1,
    "exposure_absolute": 1000,
})
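Note that v4l2_set_params_until_effective() loops forever if the device never accepts a value (for example, if the value is out of range). The following sketch shows a bounded variant of the same set-and-verify loop. The name set_until_effective and the max_attempts and delay parameters are our own additions. The set and get functions are passed in, so the loop can be tried without a camera.

```python
import time

def set_until_effective(set_fn, get_fn, params, max_attempts=5, delay=0.1):
    """
    Call set_fn(params), then verify using get_fn(names) until all values match.
    Returns the number of attempts used.
    Raises TimeoutError if the values still mismatch after max_attempts.
    """
    for attempt in range(1, max_attempts + 1):
        set_fn(params)
        result = get_fn(list(params.keys()))
        # Values read back from v4l2-ctl are strings, so compare as strings
        if all(str(result.get(k)) == str(v) for k, v in params.items()):
            return attempt
        time.sleep(delay)
    raise TimeoutError(f"Parameters not effective after {max_attempts} attempts")

# With the functions from this post, usage would look like this:
# set_until_effective(v4l2_set_parameters_once, v4l2_get_parameters,
#                     {"exposure_auto": 1, "exposure_absolute": 1000})
```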

 

Posted by Uli Köhler in Audio/Video, Linux, OpenCV, Python

How are OpenCV CAP_PROP_… mapped to V4L2 ctrls / parameters?

From both the OpenCV documentation and the V4L2 documentation, it is unclear how all the CAP_PROP_... parameters are mapped to v4l2 controls such as exposure_absolute.

However, you can easily look in the OpenCV source code on GitHub (int capPropertyToV4L2(int prop) in cap_v4l.cpp) in order to see how the parameters are mapped internally.


Posted by Uli Köhler in Audio/Video, Linux, OpenCV

How to get length/duration of video file in Python using ffprobe

In our previous post How to get video metadata as JSON using ffmpeg/ffprobe we showed how to generate JSON-formatted output using ffprobe, which comes bundled with ffmpeg.

Assuming ffprobe is installed, you can easily use this to obtain the duration of a video clip (say, in input.mp4) using Python:

import subprocess
import json

input_filename = "input.mp4"

out = subprocess.check_output(["ffprobe", "-v", "quiet", "-show_format", "-print_format", "json", input_filename])

ffprobe_data = json.loads(out)
duration_seconds = float(ffprobe_data["format"]["duration"])
# Example: duration_seconds = 11.6685

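When writing such code, be aware of the risk of shell code injection if you don’t use subprocess correctly! Pass the arguments as a list and avoid shell=True, so that the filename is never interpreted by a shell.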
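To illustrate why the list form is safe, here is a small sketch that wraps the ffprobe call in functions (ffprobe_command and get_duration_seconds are names we made up for this example). Even a hostile filename ends up as exactly one argument.

```python
import json
import subprocess

def ffprobe_command(filename):
    """Build the ffprobe argument list. The filename stays a single argument."""
    return ["ffprobe", "-v", "quiet", "-show_format", "-print_format", "json", filename]

def get_duration_seconds(filename):
    """Return the duration of the given media file in seconds."""
    out = subprocess.check_output(ffprobe_command(filename))  # no shell=True
    return float(json.loads(out)["format"]["duration"])

# A filename containing shell syntax is passed through verbatim
cmd = ffprobe_command("clip.mp4; rm -rf ~")
print(cmd[-1])
```

By contrast, building a single command string and running it with shell=True would let the shell interpret the part after the semicolon as a separate command.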

Posted by Uli Köhler in Audio/Video, Python

How to get video metadata as JSON using ffmpeg/ffprobe

You can easily use ffprobe to extract metadata from a given video file (input.mp4 in this example):

ffprobe -v quiet -show_format -show_streams -print_format json input.mp4

Depending on what info you need, you can also omit -show_streams. The output then contains only general data about the file, without detailed codec info for the audio/video streams:

ffprobe -v quiet -show_format -print_format json input.mp4

Example output (without -show_streams):

{
    "format": {
        "filename": "input.mp4",
        "nb_streams": 2,
        "nb_programs": 0,
        "format_name": "mov,mp4,m4a,3gp,3g2,mj2",
        "format_long_name": "QuickTime / MOV",
        "start_time": "0.000000",
        "duration": "11.668500",
        "size": "25045529",
        "bit_rate": "17171378",
        "probe_score": 100,
        "tags": {
            "major_brand": "mp42",
            "minor_version": "0",
            "compatible_brands": "isommp42",
            "creation_time": "2022-10-20T19:00:13.000000Z",
            "location": "+48.1072+011.7441/",
            "location-eng": "+48.1072+011.7441/",
            "com.android.version": "12",
            "com.android.capture.fps": "30.000000"
        }
    }
}

 

Example output (with -show_streams):

{
    "streams": [
        {
            "index": 0,
            "codec_name": "h264",
            "codec_long_name": "H.264 / AVC / MPEG-4 AVC / MPEG-4 part 10",
            "profile": "High",
            "codec_type": "video",
            "codec_tag_string": "avc1",
            "codec_tag": "0x31637661",
            "width": 1920,
            "height": 1080,
            "coded_width": 1920,
            "coded_height": 1080,
            "closed_captions": 0,
            "has_b_frames": 0,
            "pix_fmt": "yuv420p",
            "level": 40,
            "color_range": "tv",
            "color_space": "bt709",
            "color_transfer": "bt709",
            "color_primaries": "bt709",
            "chroma_location": "left",
            "refs": 1,
            "is_avc": "true",
            "nal_length_size": "4",
            "r_frame_rate": "30/1",
            "avg_frame_rate": "10170000/349991",
            "time_base": "1/90000",
            "start_pts": 0,
            "start_time": "0.000000",
            "duration_ts": 1049973,
            "duration": "11.666367",
            "bit_rate": "16914341",
            "bits_per_raw_sample": "8",
            "nb_frames": "339",
            "disposition": {
                "default": 1,
                "dub": 0,
                "original": 0,
                "comment": 0,
                "lyrics": 0,
                "karaoke": 0,
                "forced": 0,
                "hearing_impaired": 0,
                "visual_impaired": 0,
                "clean_effects": 0,
                "attached_pic": 0,
                "timed_thumbnails": 0
            },
            "tags": {
                "rotate": "90",
                "creation_time": "2022-10-20T19:00:13.000000Z",
                "language": "eng",
                "handler_name": "VideoHandle",
                "vendor_id": "[0][0][0][0]"
            },
            "side_data_list": [
                {
                    "side_data_type": "Display Matrix",
                    "displaymatrix": "\n00000000:            0       65536           0\n00000001:       -65536           0           0\n00000002:            0           0  1073741824\n",
                    "rotation": -90
                }
            ]
        },
        {
            "index": 1,
            "codec_name": "aac",
            "codec_long_name": "AAC (Advanced Audio Coding)",
            "profile": "LC",
            "codec_type": "audio",
            "codec_tag_string": "mp4a",
            "codec_tag": "0x6134706d",
            "sample_fmt": "fltp",
            "sample_rate": "48000",
            "channels": 2,
            "channel_layout": "stereo",
            "bits_per_sample": 0,
            "r_frame_rate": "0/0",
            "avg_frame_rate": "0/0",
            "time_base": "1/48000",
            "start_pts": 2016,
            "start_time": "0.042000",
            "duration_ts": 558071,
            "duration": "11.626479",
            "bit_rate": "256234",
            "nb_frames": "545",
            "disposition": {
                "default": 1,
                "dub": 0,
                "original": 0,
                "comment": 0,
                "lyrics": 0,
                "karaoke": 0,
                "forced": 0,
                "hearing_impaired": 0,
                "visual_impaired": 0,
                "clean_effects": 0,
                "attached_pic": 0,
                "timed_thumbnails": 0
            },
            "tags": {
                "creation_time": "2022-10-20T19:00:13.000000Z",
                "language": "eng",
                "handler_name": "SoundHandle",
                "vendor_id": "[0][0][0][0]"
            }
        }
    ],
    "format": {
        "filename": "input.mp4",
        "nb_streams": 2,
        "nb_programs": 0,
        "format_name": "mov,mp4,m4a,3gp,3g2,mj2",
        "format_long_name": "QuickTime / MOV",
        "start_time": "0.000000",
        "duration": "11.668500",
        "size": "25045529",
        "bit_rate": "17171378",
        "probe_score": 100,
        "tags": {
            "major_brand": "mp42",
            "minor_version": "0",
            "compatible_brands": "isommp42",
            "creation_time": "2022-10-20T19:00:13.000000Z",
            "location": "+48.1072+011.7441/",
            "location-eng": "+48.1072+011.7441/",
            "com.android.version": "12",
            "com.android.capture.fps": "30.000000"
        }
    }
}
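As a sketch of how this output can be used in Python, the following function extracts resolution and rotation of the first video stream from the parsed JSON. The function name summarize_video_stream is our own. The embedded sample is a shortened version of the output above. Note that the side_data_list entry may be missing, depending on the file and the ffprobe version.

```python
import json

def summarize_video_stream(ffprobe_data):
    """Return width, height and rotation of the first video stream, or None."""
    for stream in ffprobe_data.get("streams", []):
        if stream.get("codec_type") != "video":
            continue
        rotation = 0  # default if no display matrix is present
        for side_data in stream.get("side_data_list", []):
            if "rotation" in side_data:
                rotation = int(side_data["rotation"])
        return {"width": stream["width"], "height": stream["height"], "rotation": rotation}
    return None

sample = json.loads("""
{"streams": [
  {"index": 0, "codec_type": "video", "width": 1920, "height": 1080,
   "side_data_list": [{"side_data_type": "Display Matrix", "rotation": -90}]},
  {"index": 1, "codec_type": "audio", "sample_rate": "48000"}
]}
""")
print(summarize_video_stream(sample))
```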

 

Posted by Uli Köhler in Audio/Video

How to rotate video by 90°/180°/270° using ffmpeg

In order to rotate a video named input.mp4 using ffmpeg, use the following commands:

Rotate by 90° (clockwise)

ffmpeg -i input.mp4 -vf "transpose=1" rotated.mp4

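Rotate by 180°

The transpose filter only rotates in steps of 90°, so apply the clockwise rotation twice:

ffmpeg -i input.mp4 -vf "transpose=1,transpose=1" rotated.mp4

Rotate by 270° (clockwise)

This is equivalent to rotating 90° counter-clockwise.

ffmpeg -i input.mp4 -vf "transpose=2" rotated.mp4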
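If you want to select the rotation programmatically, the following sketch maps a clockwise angle to a filter string and builds the ffmpeg argument list. It relies on the values from the ffmpeg transpose filter documentation (1 = 90° clockwise, 2 = 90° counter-clockwise; 0 and 3 additionally flip the image vertically). ROTATION_FILTERS and rotate_command are names we chose for this example.

```python
import subprocess

# Clockwise angle in degrees -> ffmpeg video filter
ROTATION_FILTERS = {
    90: "transpose=1",
    180: "transpose=1,transpose=1",  # two clockwise quarter turns
    270: "transpose=2",              # same as 90 degrees counter-clockwise
}

def rotate_command(input_path, output_path, degrees_clockwise):
    """Build the ffmpeg argument list for rotating a video."""
    try:
        video_filter = ROTATION_FILTERS[degrees_clockwise % 360]
    except KeyError:
        raise ValueError("Only 90, 180 and 270 degrees are supported") from None
    return ["ffmpeg", "-i", input_path, "-vf", video_filter, output_path]

print(rotate_command("input.mp4", "rotated.mp4", 90))
# To actually run it: subprocess.run(rotate_command(...), check=True)
```

Negative angles work as well, since -90 % 360 is 270 in Python.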

 

Posted by Uli Köhler in Audio/Video

How to install avidemux3 on Ubuntu 22.04

You can install Avidemux 3 (Qt) from the xtradeb PPA:

sudo apt -y install software-properties-common apt-transport-https
sudo add-apt-repository -y ppa:xtradeb/apps
sudo apt -y install avidemux-qt

After that, run

avidemux3_qt5

to run Avidemux 3.

Posted by Uli Köhler in Audio/Video, Linux

How to re-encode all videos in a directory as H.265 & opus using ffmpeg

In our previous post How to re-encode videos as H.265 & opus using ffmpeg for archival we listed ffmpeg commands to re-encode video files to H.265 for archival purposes.

In many cases, you want to re-encode all files in a directory. The following example command (for Linux shell such as bash) re-encodes every .avi file in the current directory, assuming that all files are non-interlaced (this affects the ffmpeg flags – see our previous post for more details).

for i in *.avi ; do ffmpeg -i "$i" -c:v libx265 -crf 26 -c:a libopus -b:a 56k "${i}.mkv" ; done

From mymovie.avi, this script will produce mymovie.avi.mkv, saving approximately 70% of the filesize (that, of course, depends heavily on the video itself).
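If you prefer Python over a shell loop, the following sketch does the same with the same ffmpeg flags. Unlike the shell command, it replaces the extension, so the output is named mymovie.mkv instead of mymovie.avi.mkv, and it skips files that have already been converted. The function names are our own.

```python
import subprocess
from pathlib import Path

def build_reencode_command(source, crf=26, audio_bitrate="56k"):
    """Return (ffmpeg argument list, output path) for one source file."""
    source = Path(source)
    target = source.with_suffix(".mkv")  # mymovie.avi -> mymovie.mkv
    cmd = ["ffmpeg", "-i", str(source),
           "-c:v", "libx265", "-crf", str(crf),
           "-c:a", "libopus", "-b:a", audio_bitrate,
           str(target)]
    return cmd, target

def reencode_directory(directory=".", pattern="*.avi"):
    """Re-encode every matching file in the directory (non-recursive)."""
    for source in sorted(Path(directory).glob(pattern)):
        cmd, target = build_reencode_command(source)
        if target.exists():
            continue  # don't overwrite earlier results
        subprocess.run(cmd, check=True)

cmd, target = build_reencode_command("mymovie.avi")
print(" ".join(cmd))
```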

You can easily adapt this command to your liking, for example by encoding .mp4 files instead of .avi or using a different CRF or audio bitrate. However, in my opinion, these are sane defaults for most videos where the focus is to find a compromise between good video quality and file size.

Posted by Uli Köhler in Audio/Video

How to query all camera parameters using v4l2-ctl

You can query all parameters using v4l2-ctl’s --all option, for example:

v4l2-ctl --device /dev/video0 --all

 

Posted by Uli Köhler in Audio/Video

How to always get latest frame from OpenCV VideoCapture in Python

When you read frames from an OpenCV VideoCapture only occasionally, you will get older images, because read() returns the oldest frame that is still waiting in the capture buffer.

This code example solves this issue by running a separate capture thread that continually saves images to a temporary buffer.

Therefore, you can always get the latest image from the buffer. The code is based on our basic example How to take a webcam picture using OpenCV in Python

import threading
import cv2

video_capture = cv2.VideoCapture(0)

video_capture.set(cv2.CAP_PROP_FRAME_WIDTH, 1920)
video_capture.set(cv2.CAP_PROP_FRAME_HEIGHT, 1080)

if not video_capture.isOpened():
    raise Exception("Could not open video device")

class TakeCameraLatestPictureThread(threading.Thread):
    def __init__(self, camera):
        self.camera = camera
        self.frame = None
        super().__init__()
        # Start thread
        self.start()

    def run(self):
        while True:
            ret, self.frame = self.camera.read()

latest_picture = TakeCameraLatestPictureThread(video_capture)

Usage example:

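import matplotlib.pyplot as plt

# Note: latest_picture.frame is None until the first frame has been captured
# Convert latest image to the correct colorspace
rgb_img = cv2.cvtColor(latest_picture.frame, cv2.COLOR_BGR2RGB)
# Show
plt.imshow(rgb_img)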
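The thread above runs forever and shares the frame without locking. The following sketch shows one possible variant that can be stopped and that guards the frame with a lock. It is not the code from this post: LatestFrameReader and FakeCamera are hypothetical names, and FakeCamera only stands in for cv2.VideoCapture so that the example runs without a webcam. Any object with a read() method returning (ok, frame) works.

```python
import threading
import time

class LatestFrameReader(threading.Thread):
    """Continuously reads from a camera-like object and keeps only the newest frame."""
    def __init__(self, camera):
        super().__init__(daemon=True)  # don't block interpreter exit
        self.camera = camera
        self._frame_lock = threading.Lock()
        self._latest_frame = None
        self._stop_requested = threading.Event()

    def run(self):
        while not self._stop_requested.is_set():
            ok, frame = self.camera.read()
            if ok:  # keep the previous frame if a read fails
                with self._frame_lock:
                    self._latest_frame = frame

    def latest(self):
        """Return the newest frame, or None if nothing has been captured yet."""
        with self._frame_lock:
            return self._latest_frame

    def stop(self):
        self._stop_requested.set()
        self.join()

class FakeCamera:
    """Stand-in for cv2.VideoCapture: returns a running counter as the frame."""
    def __init__(self):
        self.count = 0

    def read(self):
        time.sleep(0.001)  # simulate the time it takes to grab a frame
        self.count += 1
        return True, self.count

reader = LatestFrameReader(FakeCamera())
reader.start()
time.sleep(0.05)
reader.stop()
print(reader.latest())
```

With a real camera, you would pass the cv2.VideoCapture object instead of FakeCamera() and call reader.latest() whenever you need an image.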

 

 

Posted by Uli Köhler in Audio/Video, OpenCV, Python

How to capture single PNG image using fswebcam

fswebcam -r 1920x1080 --png 9 -d /dev/video0 -D 0 test.png

where:

  • -r 1920x1080 is the resolution of the image. The camera must support this. In order to see supported resolutions, see How to list USB camera video formats using v4l2-ctl
  • --png 9 means output PNG with compression level 9. PNG is lossless, so this setting does not affect image quality, only the compression factor. 9 means accept higher CPU consumption for slightly smaller filesize.
  • -d /dev/video0 means to use the camera in /dev/video0
  • -D 0 means no delay before capturing the image
  • test.png means: Write the image to test.png.
Posted by Uli Köhler in Audio/Video

How to list USB camera video formats using v4l2-ctl

v4l2-ctl --device /dev/video0 --list-formats-ext

 

Posted by Uli Köhler in Audio/Video

How to get Jitsi Meet list of participants using Tampermonkey script

Use this code in JavaScript:

APP.conference.listMembers()

If no members are present, this will simply be an empty list:

> APP.conference.listMembers()
[]

But if there is another participant, it will show some info:

> APP.conference.listMembers()
[ci]
0: ci {_jid: '[email protected]/b009ae10', _id: 'b009ae10', _conference: nd, _displayName: 'Foo', _supportsDTMF: false, …}
length: 1

 

Posted by Uli Köhler in Audio/Video, Javascript