How to calculate date difference from two columns but with different rows and a condition?
Based on the example of dataframe below, I would like to calculate difference between two datetime for certain index and its cumulative. The expected results are as in the column diff_days
and cum_diff days
index | date_a | date_b | diff_days | cum_diff_days |
---|---|---|---|---|
1 | 1/1/2023 | NaT | NaT | - |
1 | NaT | NaT | NaT | - |
1 | NaT | 3/1/2023 | 2 | 2 |
2 | 4/1/2023 | NaT | NaT | - |
2 | NaT | NaT | NaT | - |
2 | NaT | 6/1/2023 | 2 | 4 |
3 | 7/1/2023 | NaT | NaT | - |
3 | NaT | 8/1/2023 | 1 | 5 |
3 | 9/1/2023 | NaT | NaT | - |
3 | NaT | NaT | NaT | - |
3 | NaT | 11/1/2023 | 2 | 7 |
I have checked the other post where it calculates the difference between two dates, unfortunately that one is when the date is in the same row. For my case, I wanted to understand how to calculate the dates if it's on different rows at different column since just subtracting it with df['diff_days']
= df['date_a']
- df['date_b']
will produce aNaT
results. I really appreciate if there is someone enlighten me on this problem.
Comments
Post a Comment