2021-11-28

How to resample a df with multiple columns

I have minute-level data for multiple requests. I would like to resample it to hourly and group by request, so that I can get the total count for each request per hour.

This is what the data looks like:

    | RequestDate | Request | Count |
    | 2021-11-24 22:09:00 | Request 1 | 10 |
    | 2021-11-24 22:09:00 | Request 3 | 1 |
    | 2021-11-24 22:09:00 | Request 2 | 4 |
    | 2021-11-24 22:09:00 | Request 4 | 5 |
    | 2021-11-24 22:10:00 | Request 1 | 4 |
    | 2021-11-24 22:10:00 | Request 2 | 0 |
    | 2021-11-24 22:10:00 | Request 3 | 6 |
    | 2021-11-24 22:10:00 | Request 4 | 5 |
    | 2021-11-24 22:10:00 | Request 5 | 1 |

Expected output:

    | RequestDate | Request | Count |
    | 2021-11-24 22:00:00 | Request 1 | 14 |
    | 2021-11-24 22:00:00 | Request 2 | 4 |
    | 2021-11-24 22:00:00 | Request 3 | 7 |
    | 2021-11-24 22:00:00 | Request 4 | 10 |
    | 2021-11-24 22:00:00 | Request 5 | 1 |

I tried this, but it ended in an error:

    df_groupby = df.groupby(by=[df["RequestDate"].resample('h'), "Request"])
    
    df_groupby["Request"]
    
    KeyError: 'RequestDate'
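For reference, here is a minimal sketch of one way the expected output could be produced. It assumes `RequestDate` is an ordinary datetime column (not the index) and rebuilds the sample rows above by hand. `Series.resample` expects a datetime-like index, so instead of calling it inside `groupby`, the sketch uses `pd.Grouper(key=..., freq=...)` to bucket the timestamps by hour. The variable name `hourly` is only illustrative.

```python
import pandas as pd

# Sample rows copied from the question; column names follow the table above.
df = pd.DataFrame(
    {
        "RequestDate": pd.to_datetime(
            ["2021-11-24 22:09:00"] * 4 + ["2021-11-24 22:10:00"] * 5
        ),
        "Request": [
            "Request 1", "Request 3", "Request 2", "Request 4",
            "Request 1", "Request 2", "Request 3", "Request 4", "Request 5",
        ],
        "Count": [10, 1, 4, 5, 4, 0, 6, 5, 1],
    }
)

# Group by the hourly bucket of RequestDate and by Request, then sum the counts.
hourly = (
    df.groupby([pd.Grouper(key="RequestDate", freq="h"), "Request"])["Count"]
    .sum()
    .reset_index()
)
print(hourly)
```

On these rows the result has one row per request, all in the 22:00 bucket, with totals 14, 4, 7, 10 and 1, which matches the expected output above.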

A df with test data can be created as follows:

    import pandas as pd

    df = pd.read_csv("test_data.csv")

test_data.csv

    RequestDate,Request,RequestCount
    2021-11-18 00:00:00,Request1,4
    2022-11-18 00:00:00,Request2,4
    2022-11-18 00:00:00,Request3,4
    2022-11-18 00:00:00,/Request4,4
    2022-11-18 00:00:00,Request5,4
    2021-11-18 00:01:00,Request1,4
    2021-11-18 00:02:00,Request1,2
    2021-11-18 00:03:00,Request2,3
    2022-11-18 00:04:00,Request3,4
    2021-11-18 00:05:00,Request1,4
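A small sketch of loading this test data so that it can be grouped by hour. The CSV text is inlined through `io.StringIO` only to keep the example self-contained. `parse_dates` is used so that `RequestDate` is read as datetimes rather than strings, which hourly grouping depends on. Note that the count column in this file is called `RequestCount`, not `Count`. As a variation, the hour bucket is computed with `dt.floor("h")`. The names `csv_text`, `RequestHour` and `totals` are illustrative only.

```python
import io

import pandas as pd

# The CSV text from the question, inlined so the sketch runs without a file.
csv_text = """RequestDate,Request,RequestCount
2021-11-18 00:00:00,Request1,4
2022-11-18 00:00:00,Request2,4
2022-11-18 00:00:00,Request3,4
2022-11-18 00:00:00,/Request4,4
2022-11-18 00:00:00,Request5,4
2021-11-18 00:01:00,Request1,4
2021-11-18 00:02:00,Request1,2
2021-11-18 00:03:00,Request2,3
2022-11-18 00:04:00,Request3,4
2021-11-18 00:05:00,Request1,4
"""

# parse_dates makes RequestDate a datetime64 column instead of plain strings.
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["RequestDate"])

# Floor each timestamp to the start of its hour, then sum per hour and request.
hour = df["RequestDate"].dt.floor("h").rename("RequestHour")
totals = df.groupby([hour, "Request"])["RequestCount"].sum().reset_index()
print(totals)
```

Because the test data mixes 2021 and 2022 dates, the result contains two hourly buckets. Grouping on the floored timestamp only yields hours that actually occur in the data, so the gap between the two dates is not filled with empty hours.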


from Recent Questions - Stack Overflow https://ift.tt/30ZiP01