2022-07-18

Group by consecutive values in one column and select the earliest and latest date for each group

I want to group by consecutive values in col1 and for each group, select the earliest date and hour from col2 and col3 and the latest date and hour from col4 and col5.

Here is my data:

col1 col2 col3 col4 col5
0 a 2021-07-03 17:08 2021-07-04 10:41
1 b 2021-07-10 04:14 2021-07-11 04:32
2 c 2021-07-13 02:03 2021-07-14 00:45
3 d 2021-07-14 21:23 2021-07-15 02:59
4 d 2021-07-15 04:05 2021-07-15 09:41
5 e 2021-07-17 13:50 2021-07-18 08:49
6 a 2021-07-18 10:51 2021-07-18 12:27
7 a 2021-07-18 13:55 2021-07-19 06:26
8 f 2021-09-20 22:36 2021-09-20 23:19
9 f 2021-09-21 23:45 2021-09-23 10:12

Expected output:

col1 col2 col3 col4 col5
0 a 2021-07-03 17:08 2021-07-04 10:41
1 b 2021-07-10 04:14 2021-07-11 04:32
2 c 2021-07-13 02:03 2021-07-14 00:45
3 d 2021-07-14 21:23 2021-07-15 09:41
4 e 2021-07-17 13:50 2021-07-18 08:49
5 a 2021-07-18 10:51 2021-07-19 06:26
6 f 2021-09-20 22:36 2021-09-23 10:12


No comments:

Post a Comment