Easily compute & visualize FFTs in Python using UliEngineering
UliEngineering is a mixed data analytics library in Python - one of the utilities it provides is an easy-to-use package to compute FFTs. In contrast to other package, this library is oriented towards practical usecases and allows you to do the FFT in only one line of code! No knowledge of Math required.
How to install UliEngineering
UliEngineering is a Python 3 only library. Install using pip
:
pip install --user UliEngineering
Getting started
First, we’ll generate some test data. See this previous post for more details on how to generate sinusoid test data:
from UliEngineering.SignalProcessing.Simulation import *
# Generate test data: 100 Hz + 400 Hz tone
data = sine_wave(frequency=100.0, samplerate=1000, amplitude=1.) \
+ sine_wave(frequency=400.0, samplerate=1000, amplitude=0.5)
The test data consists of a 100 Hz
sine wave plus a 400 Hz
sine wave (with half the amplitude). That signal is sampled with a sample rate of 1000 Hz
.
Now we can compute & visualize the FFT using matplotlib:
# Compute FFT. Use the same samplerate
# NOTE: Window defaults to "blackman"!
from UliEngineering.SignalProcessing.FFT import compute_fft
fft = compute_fft(data, samplerate=1e3)
# Plot
from matplotlib import pyplot as plt
plt.style.use("ggplot")
plt.gcf().set_size_inches(10, 5) # Use (20, 10) to get a larger plot
plt.plot(fft.frequencies, fft.amplitudes)
plt.xlabel("Frequency")
plt.ylabel("Amplitude")
compute_fft(data, samplerate=1e3)
returns a tuple (fftx, ffty)
. It performs an FFT the size of the input (i.e. since data is a length-1000
array, the FFT will be of size 1000
).
fft.frequencies
is an array of frequencies (in Hz), corresponding to the values in fft.amplitudes
. You can also use fft.angles
to get relative angles in degrees, but that is not being covered in this blogpost.
As you can see in the plot shown above, the maximum frequency that can be detected is always half the samplerate, i.e. for our samplerate of $f_s = 1000,\text{Hz}$ it is $500,\text{Hz}$. See this more math-centric FFT explanation if you want to know more details.
Internally, compute_fft()
performs this computation:
- $2$ is a correction factor that takes into account that we throw away the latter half of the raw FFT result (since we’re doing a real FFT)
- $\frac{1}{\text{len(data)}}$ normalizes the FFT results so they are indepedent of the data length (i.e. if you pass a longer sample of the same sine wave you will still get the same result
- $\text{abs}\left(\cdots\right)$ Converts the complex phase-aware result of the FFT to a spectrum which is easier to read & visualize.
- Window (which defaults to
blackman
) is the window which is applied to the data to alleviate some mathematical effects at the beginning and the end of the dataset. See wikipedia on window functions for more details. UliEngineering currently offers this list of window functions:blackman
bartlett
hamming
hanning
kaiser
(Parameter is fixed to2.0
)none
Selecting frequency ranges
Using the UliEngineering API, selecting a frequency range of the FFT is trivially easy: Just use fft[lowfreq:highfreq]
. You can use fft[lowfreq:]
to select everything starting from lowfreq
or use fft[:highfreq]
to select everything up to highfreq
.
from UliEngineering.SignalProcessing.FFT import compute_fft
fft = compute_fft(data, samplerate=1e3)
# Select frequency range: 50 to 200 Hz
fft = fft[50.0:200.0]
# Plot
from matplotlib import pyplot as plt
plt.style.use("ggplot")
plt.gcf().set_size_inches(10, 5) # Use (20, 10) to get a larger plot
plt.plot(fft.frequencies, fft.amplitudes)
plt.xlabel("Frequency")
plt.ylabel("Amplitude")
plt.savefig("/ram/fft-frequency-range.svg")
Extract amplitude & angle at a certain frequency
By using [frequency]
, i.e. getitem operator with a single value, you get an FFTPoint() object containing the frequency, amplitude and relative angle for a given frequency. The library automatically selects the closest FFT bucket, so even if your FFT does not have a bucket for that specific frequency, you will get sensible results.
from UliEngineering.SignalProcessing.FFT import compute_fft
fft = compute_fft(data, samplerate=1e3)
# Show value at a certain frequency
print(fft[30]) # FFTPoint(frequency=30.0, value=2.91e-08, angle=0.0)
print(fft[100]) # FFTPoint(frequency=100.0, value=0.419, angle=0.0)
Short FFTs for long data
Using compute_fft()
, if we have an extremely long data array, this means we’ll compute an extremely long FFT. In many cases, this is not desirable and you want to compute a fixed-size FFT (most often a power-of-two FFT, e.g. 1024
, 2048
, 4096
etc).
Let’s generate some long test data and assume we want to compute a size-1024
FFT on it
from UliEngineering.SignalProcessing.FFT import *
fftx, ffty = simple_serial_fft_reduce(data, samplerate=1e3, fftsize=1024)
Plotting the data like above yields
which looks almost exactly like our compute_fft()
plot before - just as we expected.
simple_serial_fft_reduce()
takes care of all the magic and normalization for us, including partitioning the data into overlapping chunks, adding the FFTs and properly normalizing the result.
The naming convention is significant here:
simple_...._reduce
means this is a variant with sensible defaults (simple
) for using areduction
function (default:sum
) on multiple FFTs.serial
means the individual FFTs
Parallelizing the FFTs
If you have a huge dataset, you can use simple_parallel_fft_reduce()
identically tosimple_serial_fft_reduce()
:
from UliEngineering.SignalProcessing.FFT import *
fftx, ffty = simple_parallel_fft_reduce(data, samplerate=1e3, fftsize=1024)
However in most cases you want to initialize the executor manually so you can re-use it later:
from UliEngineering.SignalProcessing.FFT import *
from concurrent.futures import ThreadPoolExecutor
executor = ThreadPoolExecutor() # No argument => use num_cpus threads
fftx, ffty = simple_parallel_fft_reduce(data, samplerate=1e3, fftsize=1024, executor=executor)
We can use a ThreadPoolExecutor()
since scipy.fftpack
(which UliEngineering
uses to do the hard math) unlocks the Python GIL.
Note that due to the need to do a lot of housekeeping tasks, simple_parallel_fft_reduce()
is much slower than simple_serial_fft_reduce()
if you have a dataset so small that parallelization is not effective. My initial recommendation is to consider using the parallel variant if the total execution time of the serial variant is larger than $0.5s$