Using numpy.where to calculate new pandas column, with multiple conditions
I'm having trouble coding this condition correctly. I'm creating a new pandas column in my dataframe, new_column, which subtracts a value from the values in column test, depending on which row index we are at. I'm currently using this code to subtract a different value on every 4th row:
import numpy as np
import pandas as pd
subtraction_value = 3
subtraction_value_2 = 6
data = pd.DataFrame({"test": [12, 4, 5, 4, 1, 3, 2, 5, 10, 9]})
data['new_column'] = np.where(data.index%4,
data['test']-subtraction_value,
data['test']-subtraction_value_2)
print(data['new_column'])
[6,1,2,1,-5,0,-1,2,4,6]
However, I now want to apply the higher subtraction to the first two positions in the column, then the original value to the next three, then the higher subtraction to another two, the smaller one to the next three, and so forth. I thought I could do it this way, with an | condition in my np.where statement:
data['new_column'] = np.where((data.index%4) | (data.index%5),
data['test']-subtraction_value,
data['test']-subtraction_value_2)
However, this didn't work, and I feel my maths may be slightly off. My desired output would look like this:
print(data['new_column'])
[6,-2,2,1,-2,-3,-4,2,7,6]
As you can see, this slightly shifts the pattern. Can I still use numpy.where() here, or do I have to take a new approach? Any help would be greatly appreciated!
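For reference, here is a minimal sketch of one way this could work while still using numpy.where(). It assumes the dataframe has the default 0, 1, 2, ... index and that the pattern is "two large subtractions, then three small ones", which repeats every 5 rows rather than every 4. The name use_large is made up for illustration. Assuming | is applied elementwise to the two remainders, the attempted (data.index%4) | (data.index%5) condition is nonzero unless both remainders are zero, so only rows divisible by both 4 and 5 would get the larger subtraction.

```python
import numpy as np
import pandas as pd

subtraction_value = 3
subtraction_value_2 = 6

data = pd.DataFrame({"test": [12, 4, 5, 4, 1, 3, 2, 5, 10, 9]})

# The pattern has period 5 (2 large + 3 small), so take the row position
# modulo 5 and use the larger subtraction where the remainder is 0 or 1.
# use_large is a hypothetical helper name, not part of the original code.
use_large = (data.index % 5) < 2

data["new_column"] = np.where(
    use_large,
    data["test"] - subtraction_value_2,  # rows 0, 1, 5, 6, ...
    data["test"] - subtraction_value,    # rows 2, 3, 4, 7, 8, 9, ...
)

print(data["new_column"].tolist())
```

With the sample data above, row 7 evaluates to 5 - 3 = 2. If the index is not the default one, np.arange(len(data)) % 5 could be used in place of data.index % 5.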