0

My problem is to apply a multiplication factor K=0.5 starting at the first number of a set of numbers that repeats only in certain columns [Bird1 and Bird2]. This factor will be applied to the previous value calculated UNTIL the last value that repeats.

TABLE A:

    Bird1  Bird2  Bird3 
    
     100  50      200  
     50   40      100  
     40   40      80  
     40   80      200  
     40   50      200 
     40   90      200 
     100  12      40 

The result should be as per the table below. How to implement this code in python?

TABLE B:

 Bird1     Bird2     Bird3 

   100       50        200  
   50        40        100  
   40        40        80  
 **20**      80        200  
 **10**      50      **100** 
 **5**       90      **50** 
  100        12        40

Using the df.interpolate() command is not suitable because it uses all the values ​​of the repeating lines. I would just like a constant K starting and being applied to the first value, and that this multiplication is repeated in the next values, until the last value of the repeating line.

One strategy I used is to use the df.ne(0) command to compare the lines and check if they have the same value, but I'm having trouble implementing it.

gveronese
  • 35
  • 5

1 Answers1

1
import pandas as pd
from io import StringIO

df = pd.read_csv(StringIO(',Bird1,Bird2,Bird3\r\n0,100,50,200\r\n1,50,40,100\r\n2,40,40,80\r\n3,40,80,200\r\n4,40,50,200\r\n5,40,90,200\r\n6,100,12,40\r\n'
))
k = 0.5

First, find where the current value equals the next value:

a = (df.shift(1) == df) != 0

   Bird1  Bird2  Bird3
0  False  False  False
1  False  False  False
2  False   True  False
3   True  False  False
4   True  False   True
5   True  False   True
6  False  False  False

Then with some magic from this answer, Count upwards on True and reset counter on False:

k_power = (a.cumsum()-a.cumsum().where(~a).ffill().fillna(0).astype(int))

   Bird1  Bird2  Bird3
0      0      0      0
1      0      0      0
2      0      1      0
3      1      0      0
4      2      0      1
5      3      0      2
6      0      0      0

Raise k to the power of that dataframe:

multiplier = k ** k_power

   Bird1  Bird2  Bird3
0  1.000    1.0   1.00
1  1.000    1.0   1.00
2  1.000    0.5   1.00
3  0.500    1.0   1.00
4  0.250    1.0   0.50
5  0.125    1.0   0.25
6  1.000    1.0   1.00

Multiply your dataframe by these values:

df * multiplier

   Bird1  Bird2  Bird3
0  100.0   50.0  200.0
1   50.0   40.0  100.0
2   40.0  *20.0*  80.0
3  *20.0*  80.0  200.0
4  *10.0*  50.0 *100.0*
5  * 5.0*  90.0 * 50.0*
6  100.0   12.0   40.0

I wasnt sure what you meant by only certian columns, so I applied the logic to all three columns.

Tom McLean
  • 5,583
  • 1
  • 11
  • 36