Remove mutually exclusive records
What would be the fastest way to remove mutually exclusive records from a record set? I'm using Python and can leverage Pandas if needed.
I have the following data:
Record ID | Shared On (UNIX timestamp) | Share type | Share To User |
---|---|---|---|
1 | 1611872850 | shared | user A |
2 | 1611872851 | shared | user B |
3 | 1611872852 | shared | user B |
1 | 1611872853 | share_removed | user A |
3 | 1611872854 | share_removed | user B |
4 | 1611872855 | shared | user C |
1 | 1611872856 | shared | user A |
2 | 1611872857 | share_removed | user B |
1 | 1611872858 | share_removed | user A |
For example, record 1 was shared to user A, removed from user A, shared again, and then removed again, so no rows for it should remain. The same applies to records 2 and 3, whose shares to user B were later removed.
Output should be only one row:
Record ID | Shared On | Share type | Share To User |
---|---|---|---|
4 | 1611872855 | shared | user C |
One option is to use a dictionary and remove a record when a mutually exclusive record with the same key and the opposite share type already exists. The key might be something like [recordId]-[shareType]-[sharedToUser].
I was thinking that similar functionality might already exist somewhere (in Pandas?).
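A minimal sketch of the dictionary idea, under a few assumptions: the rows are plain tuples, the events are processed in timestamp order, and the key is `(record_id, user)` rather than the `[recordId]-[shareType]-[sharedToUser]` string above, so that a `share_removed` event can find the `shared` entry it cancels. The name `active_shares` is made up for illustration.

```python
# Sample rows from the question: (record_id, shared_on, share_type, share_to_user)
events = [
    (1, 1611872850, "shared", "user A"),
    (2, 1611872851, "shared", "user B"),
    (3, 1611872852, "shared", "user B"),
    (1, 1611872853, "share_removed", "user A"),
    (3, 1611872854, "share_removed", "user B"),
    (4, 1611872855, "shared", "user C"),
    (1, 1611872856, "shared", "user A"),
    (2, 1611872857, "share_removed", "user B"),
    (1, 1611872858, "share_removed", "user A"),
]


def active_shares(rows):
    """Return the 'shared' rows that were never cancelled by a later removal."""
    active = {}
    # Process events oldest first so a removal always follows its share.
    for record_id, shared_on, share_type, user in sorted(rows, key=lambda r: r[1]):
        key = (record_id, user)
        if share_type == "shared":
            active[key] = (record_id, shared_on, share_type, user)
        elif share_type == "share_removed":
            # Drop the matching share if present; ignore stray removals.
            active.pop(key, None)
    return list(active.values())


result = active_shares(events)
print(result)
```

This is a single pass after the sort, so the cost is dominated by sorting the rows by timestamp.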
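As far as I know there is no dedicated Pandas function for this, but one possible sketch combines `sort_values`, `drop_duplicates(keep="last")`, and a boolean filter. It assumes that shares and removals alternate for each (record, user) pair, so the most recent event alone decides whether the share is still active. The column names below are made up for illustration.

```python
import pandas as pd

# Sample data from the question; column names are assumptions.
df = pd.DataFrame(
    {
        "record_id": [1, 2, 3, 1, 3, 4, 1, 2, 1],
        "shared_on": [
            1611872850, 1611872851, 1611872852,
            1611872853, 1611872854, 1611872855,
            1611872856, 1611872857, 1611872858,
        ],
        "share_type": [
            "shared", "shared", "shared",
            "share_removed", "share_removed", "shared",
            "shared", "share_removed", "share_removed",
        ],
        "share_to_user": [
            "user A", "user B", "user B",
            "user A", "user B", "user C",
            "user A", "user B", "user A",
        ],
    }
)

# Keep only the most recent event for each (record, user) pair.
latest = df.sort_values("shared_on").drop_duplicates(
    subset=["record_id", "share_to_user"], keep="last"
)

# A share is still active only if its latest event is "shared".
active = latest[latest["share_type"] == "shared"].reset_index(drop=True)
print(active)
```

If the same record can be shared to the same user twice in a row without a removal in between, this still reports it once, which may or may not be what is wanted.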
from Recent Questions - Stack Overflow https://ift.tt/3t8Yxdm