How to create pandas time series DataFrame example dataset

TL;DR: Use our pre-built example dataset like this:

# Load pre-built time series example dataset
df = pd.read_csv("https://datasets.techoverflow.net/timeseries-example.csv", parse_dates=["Timestamp"])
df.set_index("Timestamp", inplace=True)

How to build your own time series example dataset

In our previous post Easily generate sine/cosine waveform data in Python using UliEngineering we showed how to generate sine and cosine waves using UliEngineering.

In this post, we show how to create a pandas DataFrame containing sine and cosine data to be used as a sample time series dataset.

First, we generate the sine and cosine wave data:

import pandas as pd
import numpy as np
from UliEngineering.SignalProcessing.Simulation import sine_wave, cosine_wave

# Configure the properties of the sine wave here
frequency = 10.0 # 10 Hz sine / cosine wave
samplerate = 10000 # 10 kHz
nseconds = 1 # Generate 1 second of data

sine = sine_wave(frequency=frequency, samplerate=samplerate, length=nseconds)
cosine = cosine_wave(frequency=frequency, samplerate=samplerate, length=nseconds)
nsamples = len(sine) # How many values we have in the data arrays

After that, we define the timestamp where the dataset starts:

start_timestamp = pd.Timestamp('now')

Now we can create a list of Timestamp objects representing the points in time where the signal has been sampled:

# Create timestamps by offsetting
timedelta = pd.Timedelta(1/samplerate, 'seconds')
timestamps =  [start_timestamp + i * timedelta for i in range(nsamples)]

Now we’re reading to create the DataFrame object:

df = pd.DataFrame(index=timestamps, data={
    "Sine": sine,
    "Cosine": cosine
})
df.index.name = 'Timestamp'

Now we can use df.plot() to plot the dataset:

# Use nice plotting style
from matplotlib import pyplot as plt
plt.style.use("ggplot")
# Plot dataset
df.plot()
# Make figure larger
plt.gcf().set_size_inches(10, 5)

Additionally we can export the dataset as CSV using

df.to_csv("/ram/timeseries-example.csv")

This example file is also available online at https://techoverflow.net/datasets/timeseries-example.csv

Full example:

#!/usr/bin/env python3
import pandas as pd
import numpy as np
from UliEngineering.SignalProcessing.Simulation import sine_wave, cosine_wave
# Configure the properties of the sine wave here
frequency = 10.0 # 10 Hz sine / cosine wave
samplerate = 10000 # 10 kHz
nseconds = 1 # Generate 1 second of data
sine = sine_wave(frequency=frequency, samplerate=samplerate, length=nseconds)
cosine = cosine_wave(frequency=frequency, samplerate=samplerate, length=nseconds)
nsamples = len(sine) # How many values we have in the data arrays

start_timestamp = pd.Timestamp('now')

# Create timestamps by offsetting
timedelta = pd.Timedelta(1/samplerate, 'seconds')
timestamps =  [start_timestamp + i * timedelta for i in range(nsamples)]

df = pd.DataFrame(index=timestamps, data={
    "Sine": sine,
    "Cosine": cosine
})
df.index.name = 'Timestamp'

df.to_csv("timeseries-example.csv")