Group by consecutive values in one column and select the earliest and latest date for each group
I want to group by consecutive values in col1 and for each group, select the earliest date and hour from col2 and col3 and the latest date and hour from col4 and col5.
Here is my data:
| col1 | col2 | col3 | col4 | col5 | |
|---|---|---|---|---|---|
| 0 | a | 2021-07-03 | 17:08 | 2021-07-04 | 10:41 |
| 1 | b | 2021-07-10 | 04:14 | 2021-07-11 | 04:32 |
| 2 | c | 2021-07-13 | 02:03 | 2021-07-14 | 00:45 |
| 3 | d | 2021-07-14 | 21:23 | 2021-07-15 | 02:59 |
| 4 | d | 2021-07-15 | 04:05 | 2021-07-15 | 09:41 |
| 5 | e | 2021-07-17 | 13:50 | 2021-07-18 | 08:49 |
| 6 | a | 2021-07-18 | 10:51 | 2021-07-18 | 12:27 |
| 7 | a | 2021-07-18 | 13:55 | 2021-07-19 | 06:26 |
| 8 | f | 2021-09-20 | 22:36 | 2021-09-20 | 23:19 |
| 9 | f | 2021-09-21 | 23:45 | 2021-09-23 | 10:12 |
Expected output:
| col1 | col2 | col3 | col4 | col5 | |
|---|---|---|---|---|---|
| 0 | a | 2021-07-03 | 17:08 | 2021-07-04 | 10:41 |
| 1 | b | 2021-07-10 | 04:14 | 2021-07-11 | 04:32 |
| 2 | c | 2021-07-13 | 02:03 | 2021-07-14 | 00:45 |
| 3 | d | 2021-07-14 | 21:23 | 2021-07-15 | 09:41 |
| 4 | e | 2021-07-17 | 13:50 | 2021-07-18 | 08:49 |
| 5 | a | 2021-07-18 | 10:51 | 2021-07-19 | 06:26 |
| 6 | f | 2021-09-20 | 22:36 | 2021-09-23 | 10:12 |
Comments
Post a Comment