How to compute the average of pandas pd.Timestamp objects
Problem
If you have an array of pd.Timestamp objects, you can’t compute their average with sum() and len(), because timestamps can’t be added to each other:
import pandas as pd
# Creating an array of five fixed pd.Timestamp objects
timestamps = [
pd.Timestamp('2023-01-01 12:00:00'),
pd.Timestamp('2023-01-02 12:00:00'),
pd.Timestamp('2023-01-03 12:00:00'),
pd.Timestamp('2023-01-04 12:00:00'),
pd.Timestamp('2023-01-05 12:00:00')
]
# FAIL: This will raise a TypeError
average = sum(timestamps) / len(timestamps)
This will raise a TypeError:
TypeError Traceback (most recent call last)
Cell In[1], line 13
4 timestamps = [
5 pd.Timestamp('2023-01-01 12:00:00'),
6 pd.Timestamp('2023-01-02 12:00:00'),
(...)
9 pd.Timestamp('2023-01-05 12:00:00')
10 ]
12 # FAIL: This will raise a TypeError
---> 13 average = sum(timestamps) / len(timestamps)
File timestamps.pyx:483, in pandas._libs.tslibs.timestamps._Timestamp.__radd__()
File timestamps.pyx:465, in pandas._libs.tslibs.timestamps._Timestamp.__add__()
TypeError: Addition/subtraction of integers and integer-arrays with Timestamp is no longer supported. Instead of adding/subtracting `n`, use `n * obj.freq`
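The traceback points at __radd__ because sum() starts from the integer 0, so the first operation it attempts is 0 + timestamp. The following minimal sketch (the helper raises_type_error is a hypothetical name used only for illustration) shows that both integer + Timestamp and Timestamp + Timestamp are rejected, while subtracting two timestamps is allowed and yields a pd.Timedelta:

```python
import pandas as pd

a = pd.Timestamp('2023-01-01 12:00:00')
b = pd.Timestamp('2023-01-02 12:00:00')

def raises_type_error(operation):
    """Hypothetical helper: return True if calling operation() raises TypeError."""
    try:
        operation()
    except TypeError:
        return True
    return False

# sum() starts with 0, so this is the first addition it attempts
int_plus_ts_fails = raises_type_error(lambda: 0 + a)

# Passing a start value to sum() would not help: two timestamps can't be added either
ts_plus_ts_fails = raises_type_error(lambda: a + b)

# Subtraction is defined and returns a duration (pd.Timedelta)
difference = b - a

print(int_plus_ts_fails, ts_plus_ts_fails, difference)
```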
Solution
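You can average ts.value (the timestamp as an integer number of nanoseconds since the Unix epoch) instead of summing the timestamps themselves, then convert the result back to a timestamp. Integer division (//) keeps the average an exact integer and avoids float rounding on these very large values:
average = pd.Timestamp(sum(ts.value for ts in timestamps) // len(timestamps))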
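To make the role of ts.value more concrete, here is a small sketch of the round trip between a timestamp and its integer representation. It assumes timezone-naive timestamps, as in the example above:

```python
import pandas as pd

# One second after the Unix epoch
ts = pd.Timestamp('1970-01-01 00:00:01')

# .value is the number of nanoseconds since 1970-01-01 00:00:00
nanoseconds = ts.value  # 1_000_000_000

# Passing that integer back to pd.Timestamp recovers the original timestamp
round_trip = pd.Timestamp(nanoseconds)

print(nanoseconds, round_trip)
```

Since these values are plain Python integers, they can be summed and divided like any other numbers.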
Full example:
import pandas as pd
# Creating an array of five fixed pd.Timestamp objects
timestamps = [
pd.Timestamp('2023-01-01 12:00:00'),
pd.Timestamp('2023-01-02 12:00:00'),
pd.Timestamp('2023-01-03 12:00:00'),
pd.Timestamp('2023-01-04 12:00:00'),
pd.Timestamp('2023-01-05 12:00:00')
]
# Result: Timestamp('2023-01-03 12:00:00')
average = pd.Timestamp(sum(ts.value for ts in timestamps) // len(timestamps))
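As a possible alternative to working with ts.value, the sketch below shows two other ways to get the same result. They are not part of the solution above: the first lets pandas compute the mean of a datetime Series, the second averages the pd.Timedelta offsets from a reference timestamp. The variable names are chosen here for illustration only.

```python
import pandas as pd

timestamps = [
    pd.Timestamp('2023-01-01 12:00:00'),
    pd.Timestamp('2023-01-02 12:00:00'),
    pd.Timestamp('2023-01-03 12:00:00'),
    pd.Timestamp('2023-01-04 12:00:00'),
    pd.Timestamp('2023-01-05 12:00:00')
]

# Option 1: wrap the list in a Series and use its built-in mean()
average_series = pd.Series(timestamps).mean()

# Option 2: average the offsets from a reference timestamp.
# Timedelta objects can be summed if sum() gets a Timedelta start value.
reference = timestamps[0]
offsets = [ts - reference for ts in timestamps]
average_offset = sum(offsets, pd.Timedelta(0)) / len(offsets)
average_from_offsets = reference + average_offset

# Both give Timestamp('2023-01-03 12:00:00') for this data
print(average_series, average_from_offsets)
```

The Series variant is usually the shortest if your data already lives in pandas, while the offset variant never leaves the Timestamp/Timedelta types.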