2021-12-21

pandas computation on rolling 1 calendar month

I have a pandas DataFrame with date as the index and a column, 'spendings'. I intend to get the rolling max() of the 'spendings' column for the trailing 1 calendar month (not 30 days or 4 weeks).

I tried to capture a snippet with custom data for addressing the problem, below (borrowed from Pandas monthly rolling operation):

import pandas as pd
from io import StringIO

data = StringIO(
"""\
date          spendings
20210325    15
20210405    20
20210415    10
20210425    40
20210505    3
20210515    2
20210525    2
20210527    1
"""
)

df = pd.read_csv(data,sep="\s+", parse_dates=True)

df.index = pd.to_datetime(df.date, format='%Y%m%d')
del(df['date'])

Now, to create a column 'max' to hold rolling last 1 calendar month's max() val, I use:

df['max'] = df.loc[(df.index - pd.tseries.offsets.DateOffset(months=1)):df.index, 'spendings'].max()

This raises an exception like:

TypeError: cannot do slice indexing on DatetimeIndex with these indexers [DatetimeIndex(['2021-02-25', '2021-03-05', '2021-03-15', '2021-03-25',
           '2021-04-05', '2021-04-15', '2021-04-25'],
          dtype='datetime64[ns]', name='date', freq=None)] of type DatetimeIndex

(I could have followed the method using list comprehension here: https://stackoverflow.com/a/47199274/235415, but I would like to use panda's vectorized method. I have many DataFrames and each is very large - using list comprehension is very slow here).

Q: How to get the vectorized method of performing rolling 1 calendar month's max()?

The expected o/p, ie primarily the 'max' column (holding the max value of 'spendings' for last 1 calendar month) will be something like this:

>>> df
            spendings  max
date                      
2021-03-25         15   15
2021-04-05         20   20
2021-04-15         10   20
2021-04-25         40   40
2021-05-05         3    40
2021-05-15         2    40
2021-05-25         2    40
2021-05-27         1     3


from Recent Questions - Stack Overflow https://ift.tt/3sh3Olk
https://ift.tt/eA8V8J

No comments:

Post a Comment