2021-12-22

Rolling sum based on all previous dates NOT previous rows sorted by date

Given the following dataframe:

+------------+--------+
|    Date    | Amount |
+------------+--------+
| 01/05/2019 |     15 |
| 27/05/2019 |     20 |
| 27/05/2019 |     15 |
| 15/06/2019 |     10 |
| 29/06/2019 |     25 |
| 01/07/2019 |     50 |
+------------+--------+

I need the rolling 28-day sum over all previous dates (rows sharing the current row's date must not be counted), as follows:

+------------+--------+
|    Date    | Amount |
+------------+--------+
| 01/05/2019 | NaN    |
| 27/05/2019 | 15     |
| 27/05/2019 | 15     |
| 15/06/2019 | 35     |
| 29/06/2019 | 10     |
| 01/07/2019 | 35     |
+------------+--------+

Using:

import datetime

import pandas as pd

df = pd.DataFrame(
    {
        'Date': {
            0: datetime.datetime(2019, 5, 1),
            1: datetime.datetime(2019, 5, 27),
            2: datetime.datetime(2019, 5, 27),
            3: datetime.datetime(2019, 6, 15),
            4: datetime.datetime(2019, 6, 29),
            5: datetime.datetime(2019, 7, 1),
        },
        'Amount': {0: 15, 1: 20, 2: 15, 3: 10, 4: 25, 5: 50}
    }
)
# A time-based window requires the "on" column to be sorted.
df.sort_values("Date", inplace=True)
# 28-day window; closed="left" leaves the current row out of its own sum.
df_roll = df.rolling("28d", on="Date", closed="left").sum()

Gets me:

+------------+--------+
|    Date    | Amount |
+------------+--------+
| 01/05/2019 |    NaN |
| 27/05/2019 |     15 | 
| 27/05/2019 |     35 | <-- Should be 15
| 15/06/2019 |     35 |
| 29/06/2019 |     10 |
| 01/07/2019 |     35 |
+------------+--------+

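This isn't quite correct: the second 27/05/2019 row also picks up the 20 from the first 27/05/2019 row (15 + 20 = 35), because the window is filled from previous rows, not previous dates.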

How would I get the sum of all previous dates rather than all previous rows?
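
One possible approach, sketched here under assumptions rather than offered as the definitive answer: collapse the frame to one total per date first, roll over those unique dates, then map the result back onto the original rows. With only one row per date in the rolled series, rows that share a date cannot leak into each other's window. The names `daily`, `rolled` and the output column `Rolling` are made up for this illustration.

```python
import datetime

import pandas as pd

df = pd.DataFrame(
    {
        "Date": [
            datetime.datetime(2019, 5, 1),
            datetime.datetime(2019, 5, 27),
            datetime.datetime(2019, 5, 27),
            datetime.datetime(2019, 6, 15),
            datetime.datetime(2019, 6, 29),
            datetime.datetime(2019, 7, 1),
        ],
        "Amount": [15, 20, 15, 10, 25, 50],
    }
)

# One total per unique date; groupby sorts the dates, which the
# time-based rolling window needs.
daily = df.groupby("Date")["Amount"].sum()

# 28-day window over unique dates; closed="left" excludes the current
# date, so only strictly earlier dates contribute.
rolled = daily.rolling("28D", closed="left").sum()

# Look up each row's date in the per-date result ("Rolling" is a
# hypothetical column name).
df["Rolling"] = df["Date"].map(rolled)

print(df)
```

Both 27/05/2019 rows then receive the same value, since they look up the same date in `rolled`.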



from Recent Questions - Stack Overflow https://ift.tt/3H7qVD7
