2022-09-29

How do you carry the previous row's value forward in a column in Python, as in Excel?

I have looked around and haven't found an 'elegant' solution, but surely this is doable. I need a column ('col A') in a DataFrame that is 0 by default. When the adjacent column ('col B') hits 1, 'col A' should change to 1 and stay 1 on all further rows, no matter what else happens in 'col B', until another column ('col C') hits 1. Then 'col A' returns to 0, and the cycle repeats. The data has thousands of rows and gets updated regularly. Any ideas? I have tried shift, iloc and loops, but can't make it work. The result should look something like this:

[sample data][1]

date   col A   col B   col C
...    0       0       0
...    0       0       0
...    1       1       0
...    1       1       0
...    1       0       1
...    0       0       0
...    0       0       0
...    1       1       0
...    1       1       0
...    1       0       0
...    1       0       0
...    1       1       0
...    1       0       0
...    1       1       0
...    1       0       1
...    0       0       0
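
For reference, here is a minimal sketch of one vectorized way to reproduce the 'col A' values in the table above. It assumes pandas and NumPy and uses the short column names 'A', 'B' and 'C'. It also assumes one reading of the table: 'col A' is still 1 on the row where 'col C' hits 1 and only drops to 0 on the following row. That timing is inferred from the sample, not stated in the description.

```python
import numpy as np
import pandas as pd

# 'col B' and 'col C' copied from the sample table above.
df = pd.DataFrame({
    "B": [0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 0, 1, 0, 1, 0, 0],
    "C": [0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0],
})

set_rows = df["B"].eq(1)
# Assumption: the reset shows up one row after 'C' hits 1, as in the table.
reset_rows = df["C"].shift(1, fill_value=0).eq(1)

# 1.0 on set rows, 0.0 on reset rows, NaN elsewhere; a set wins if both apply.
events = pd.Series(
    np.select(
        [set_rows.to_numpy(), reset_rows.to_numpy()],
        [1.0, 0.0],
        default=np.nan,
    ),
    index=df.index,
)

# Forward-fill carries the last event down; rows before any event become 0.
df["A"] = events.ffill().fillna(0).astype(int)
```

If the reset should instead apply on the same row where 'C' hits 1, using `df["C"].eq(1)` without the shift and listing the reset condition first in `np.select` would give that behaviour.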

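This is the base code I have been thinking about, a row-by-row loop with 'A' as the result column, 'B' as the trigger and 'C' as the reset:

# Start with 'A' equal to 'B': 1 where 'B' fires, 0 elsewhere.
df['A'] = (df['B'] == 1).astype(int)

# Assumes the default 0..n-1 integer index, because .loc is label-based.
for i in range(1, len(df)):
    if df.loc[i, 'C'] == 1:
        # 'C' resets the flag on this row.
        df.loc[i, 'A'] = 0
    elif df.loc[i, 'B'] != 1:
        # Neither column fired, so carry the previous row's value forward.
        df.loc[i, 'A'] = df.loc[i-1, 'A']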
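
A loop-free variant can be sketched with a grouped cumulative maximum. This is only one possible approach, under these assumptions: 'A' is the result, 'B' sets it and 'C' resets it (the roles in the description), the reset applies on the same row where 'C' hits 1 (as in the loop above), and a reset wins when both fire on one row. The helper name `latch` and the small data set are made up for illustration.

```python
import pandas as pd

def latch(set_col: pd.Series, reset_col: pd.Series) -> pd.Series:
    """Hypothetical helper: 1 from a set row onward, 0 again from each reset row."""
    is_reset = reset_col.eq(1)
    # A set on a reset row is ignored, so the reset takes priority there.
    is_set = set_col.eq(1) & ~is_reset
    # Each reset row starts a new block; rows before the first reset are block 0.
    block = is_reset.cumsum()
    # Inside a block the flag turns on at the first set row and stays on.
    return is_set.astype(int).groupby(block).cummax()

# Made-up data, including one row (index 4) where both columns hit 1.
df = pd.DataFrame({"B": [1, 0, 0, 1, 1, 0], "C": [0, 0, 1, 0, 1, 0]})
df["A"] = latch(df["B"], df["C"])
```

Because it works on whole columns, the helper can simply be re-run on the full frame each time the data is updated.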

